Is there a Julia structure that allows searching for a range of keys?

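I know I can use integer keys in a hash map, as in the dictionary example below. But a Dict is unordered, so it gains nothing from the keys being integers.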

julia> hashmap = Dict( 5 => "five", 9 => "nine", 16 => "sixteen", 70 => "seventy")
Dict{Int64,String} with 4 entries:
  9  => "nine"
  16 => "sixteen"
  70 => "seventy"
  5  => "five"

julia> hashmap[9]
"nine"

julia> hashmap[8:50] # I would like to be able to do this to get keys between 8 and 50 (9 and 16 here)
ERROR: KeyError: key 8:50 not found
Stacktrace:
 [1] getindex(::Dict{Int64,String}, ::UnitRange{Int64}) at ./dict.jl:477
 [2] top-level scope at REPL[3]:1

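I am looking for an ordered structure that gives access to all keys within a given range while benefiting from the performance optimizations that sorted keys make possible.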

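I don't think the standard library has such a structure, but this can be implemented as a function on an ordinary dictionary, as long as the key type is suitable for selecting by range: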

julia> d = Dict(1 => "a", 2 => "b", 5 => "c", 7 => "r", 9 => "t")
Dict{Int64,String} with 5 entries:
  7 => "r"
  9 => "t"
  2 => "b"
  5 => "c"
  1 => "a"

julia> dictrange(d::Dict, r::UnitRange) = [d[k] for k in sort!(collect(keys(d))) if k in r]
dictrange (generic function with 1 method)

julia> dictrange(d, 2:6)
2-element Array{String,1}:
 "b"
 "c"
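
The function above re-sorts every key on each call. If the dictionary changes rarely, one possible alternative that uses only Base is to sort the keys once and binary-search them for each query. This is a minimal sketch: the name `dictrange_sorted` is hypothetical, and it assumes `sortedkeys` is rebuilt whenever the dictionary is modified.

```julia
d = Dict(1 => "a", 2 => "b", 5 => "c", 7 => "r", 9 => "t")

# Sort the keys once up front (O(n log n)); reuse this vector for every query.
sortedkeys = sort!(collect(keys(d)))

# Hypothetical helper: return the values whose keys lie in [lo, hi], in key order.
# Assumes `sortedkeys` holds exactly the keys of `d`, sorted ascending.
function dictrange_sorted(d::Dict, sortedkeys::Vector, lo, hi)
    i = searchsortedfirst(sortedkeys, lo)  # index of the first key >= lo
    j = searchsortedlast(sortedkeys, hi)   # index of the last key <= hi
    # i:j is empty when no key falls inside the interval
    return [d[k] for k in view(sortedkeys, i:j)]
end

dictrange_sorted(d, sortedkeys, 2, 6)  # values for keys 2 and 5
```

Each query then costs two binary searches plus one lookup per matching key, instead of a full sort.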

There is a dedicated library called DataStructures, which provides a SortedDict structure and the corresponding search functions:

using DataStructures
d = SortedDict(5 => "five", 9 => "nine", 16 => "sixteen", 70 => "seventy")

st1 = searchsortedfirst(d, 8)   # semitoken of the first key greater than or equal to 8
st2 = searchsortedlast(d, 50)   # semitoken of the last key less than or equal to 50

Now:

julia> [(k for (k,v) in inclusive(d,st1,st2))...]
2-element Array{Int64,1}:
  9
 16

get lets you supply a default value for keys that are not defined, so you can default to missing and then skip those entries:

julia> hashmap = Dict( 5 => "five", 9 => "nine", 16 => "sixteen", 70 => "seventy")
Dict{Int64,String} with 4 entries:
  9  => "nine"
  16 => "sixteen"
  70 => "seventy"
  5  => "five"

julia> get.(Ref(hashmap), 5:10, missing)
6-element Array{Union{Missing, String},1}:
 "five"
 missing
 missing
 missing
 "nine"
 missing

julia> get.(Ref(hashmap), 5:10, missing) |> skipmissing |> collect
2-element Array{String,1}:
 "five"
 "nine"
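
Note that broadcasting `get` over `5:10` performs one lookup per integer in the range, so the cost grows with the width of the range rather than with the number of stored keys. For a wide range over a small dictionary, one alternative is to scan the dictionary's entries instead. A minimal sketch follows; the name `valuesbetween` is hypothetical, and the result is sorted by key explicitly because a Dict has no defined iteration order.

```julia
hashmap = Dict(5 => "five", 9 => "nine", 16 => "sixteen", 70 => "seventy")

# Hypothetical helper: one pass over the entries, keeping those whose key lies
# in `r`, then sorting the survivors by key.
function valuesbetween(d::Dict, r::AbstractRange)
    selected = [(k, v) for (k, v) in d if k in r]
    sort!(selected; by = first)
    return last.(selected)
end

valuesbetween(hashmap, 8:1_000_000)  # scans 4 entries rather than a million integers
```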

If you are working with dates, you might consider the TimeSeries package, which does what you want as long as your integer keys represent dates:

using Dates        # Date and Day live in the Dates standard library
using TimeSeries

dates = [Date(2020,11,5), Date(2020,11,9), Date(2020,11,16), Date(2020,11,30)]
times = TimeArray(dates, ["five", "nine", "sixteen", "thirty"])

Then:

julia> times[Date(2020,11,8):Day(1):Date(2020,11,20)]
2×1 TimeArray{String,1,Date,Array{String,1}} 2020-11-09 to 2020-11-16
│            │ A         │
├────────────┼───────────┤
│ 2020-11-09 │ "nine"    │
│ 2020-11-16 │ "sixteen" │