Is there a Julia structure that allows searching for a range of keys?
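I know I can use integer keys with a hash map, as in the following dictionary example. But dictionaries are unordered and do not benefit from having integer keys.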
julia> hashmap = Dict( 5 => "five", 9 => "nine", 16 => "sixteen", 70 => "seventy")
Dict{Int64,String} with 4 entries:
9 => "nine"
16 => "sixteen"
70 => "seventy"
5 => "five"
julia> hashmap[9]
"nine"
julia> hashmap[8:50] # I would like to be able to do this to get keys between 8 and 50 (9 and 16 here)
ERROR: KeyError: key 8:50 not found
Stacktrace:
[1] getindex(::Dict{Int64,String}, ::UnitRange{Int64}) at ./dict.jl:477
[2] top-level scope at REPL[3]:1
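I am looking for an ordered structure that allows access to all keys within a given range, while benefiting from the performance gains that sorted keys make possible.
I don't think there is such a structure in the standard library, but this can be implemented as a function on an ordinary dictionary, as long as the key type is suitable for selection by range: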
julia> d = Dict(1 => "a", 2 => "b", 5 => "c", 7 => "r", 9 => "t")
Dict{Int64,String} with 5 entries:
7 => "r"
9 => "t"
2 => "b"
5 => "c"
1 => "a"
julia> dictrange(d::Dict, r::UnitRange) = [d[k] for k in sort!(collect(keys(d))) if k in r]
dictrange (generic function with 1 method)
julia> dictrange(d, 2:6)
2-element Array{String,1}:
"b"
"c"
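The function above sorts all keys on every call. If the data changes rarely, a minimal sketch using only Base is to keep the keys in an already sorted vector next to the values and use binary search to find the bounds. The name sortedrange and the two parallel vectors are assumptions made for illustration, not part of the original answer:

```julia
# Keys must already be sorted in ascending order; values are stored in the same order.
ks = [5, 9, 16, 70]
vs = ["five", "nine", "sixteen", "seventy"]

# Hypothetical helper: return the values whose keys lie in the closed interval [lo, hi].
function sortedrange(ks::AbstractVector, vs::AbstractVector, lo, hi)
    i = searchsortedfirst(ks, lo)  # first index with ks[i] >= lo
    j = searchsortedlast(ks, hi)   # last index with ks[j] <= hi
    return vs[i:j]                 # i:j is an empty range when no key matches
end

sortedrange(ks, vs, 8, 50)  # the values stored under keys 9 and 16
```

Both searches are binary searches, so a query costs time logarithmic in the number of keys plus the size of the result. The trade-off is that inserting a new key means keeping both vectors sorted.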
There is a dedicated library named DataStructures, which has a SortedDict structure and corresponding search functions:
using DataStructures
d = SortedDict(5 => "five", 9 => "nine", 16 => "sixteen", 70 => "seventy")
st1 = searchsortedfirst(d, 8) # index of the first key greater than or equal to 8
st2 = searchsortedlast(d, 50) # index of the last key less than or equal to 50
And now:
julia> [(k for (k,v) in inclusive(d,st1,st2))...]
2-element Array{Int64,1}:
9
16
get lets you supply a default value when a key is not defined, so you can default to missing and then skip those entries:
julia> hashmap = Dict( 5 => "five", 9 => "nine", 16 => "sixteen", 70 => "seventy")
Dict{Int64,String} with 4 entries:
9 => "nine"
16 => "sixteen"
70 => "seventy"
5 => "five"
julia> get.(Ref(hashmap), 5:10, missing)
6-element Array{Union{Missing, String},1}:
"five"
missing
missing
missing
"nine"
missing
julia> get.(Ref(hashmap), 5:10, missing) |> skipmissing |> collect
2-element Array{String,1}:
"five"
"nine"
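As a sketch of two related variants that use only Base (the variable names present, sub and sorted_vals are illustrative assumptions): the first checks haskey instead of going through missing, and the second scans the dictionary entries once, which may be preferable when the range is much wider than the dictionary.

```julia
hashmap = Dict(5 => "five", 9 => "nine", 16 => "sixteen", 70 => "seventy")

# Probe every integer in the range; the cost grows with the length of the range.
present = [hashmap[k] for k in 5:10 if haskey(hashmap, k)]

# Scan the entries once instead; the cost grows with the size of the Dict.
# filter on a Dict returns another Dict, so the result is still unordered.
sub = filter(p -> first(p) in 8:50, hashmap)

# Sort the surviving keys to read the values in key order.
sorted_vals = [sub[k] for k in sort!(collect(keys(sub)))]
```

Neither variant benefits from key ordering; they only avoid building the intermediate vector of missing values.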
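If you are working with dates, you might consider looking at the TimeSeries package, which does what you want as long as your integer keys represent dates: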
using Dates, TimeSeries
dates = [Date(2020,11,5), Date(2020,11,9), Date(2020,11,16), Date(2020,11,30)]
times = TimeArray(dates, ["five", "nine", "sixteen", "thirty"])
Then:
times[Date(2020,11,8):Day(1):Date(2020,11,20)]
2×1 TimeArray{String,1,Date,Array{String,1}} 2020-11-09 to 2020-11-16
│ │ A │
├────────────┼───────────┤
│ 2020-11-09 │ "nine" │
│ 2020-11-16 │ "sixteen" │