Why is allocating an array of Union{T, Missing} an order of magnitude slower than an array of T?

Question

Allocating an array of Union{T, Missing} is very expensive in Julia. Is there a workaround?

julia> @time Vector{Union{Missing, Int}}(undef, 10^7);
  0.031052 seconds (2 allocations: 85.831 MiB)

julia> @time Vector{Union{Int}}(undef, 10^7);
  0.000027 seconds (3 allocations: 76.294 MiB)

Answer 1

Because when you create a Union of Missing with a bits type such as Int, Julia sets a type flag for every element, so the vector initially stores missing in each of its entries:

julia> Vector{Union{Missing, Int}}(undef, 10^7)
10000000-element Vector{Union{Missing, Int64}}:
 missing
 missing
 ⋮
 missing
 missing
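
The flag storage also accounts for the different allocation sizes reported in the question. As a rough check, assuming 8 bytes per Int64 payload plus one type-tag byte per element (the layout Julia's developer documentation describes for isbits Union arrays), the arithmetic works out as follows. The variable names are illustrative only.

```julia
# Back-of-the-envelope check of the sizes printed by @time in the question.
# Assumption: one extra type-tag byte per element for an isbits Union array.
n = 10^7

data_mib  = n * sizeof(Int64) / 2^20   # element payload: 8 bytes each
tag_mib   = n * 1 / 2^20               # assumed tag bytes: 1 byte each
total_mib = data_mib + tag_mib

println(round(data_mib; digits=3))     # 76.294, the plain Int vector
println(round(total_mib; digits=3))    # 85.831, the Union{Missing, Int} vector
```

So the union vector has roughly 12% more memory to allocate, and all of it has to be written before the array is handed back.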

If you use a non-bits type, no such flag has to be set for each entry, as you can see here:

julia> Vector{Union{Missing, String}}(undef, 10^7)
10000000-element Vector{Union{Missing, String}}:
 #undef
 #undef
   ⋮
 #undef
 #undef

As a result, the performance is the same:

julia> @btime Vector{Union{String}}(undef, 10^7);
  11.672 ms (3 allocations: 76.29 MiB)

julia> @btime Vector{Union{Missing, String}}(undef, 10^7);
  11.480 ms (2 allocations: 76.29 MiB)
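
To see which case a given element type falls into, a small check along these lines may help. It is only a sketch: it uses `isbitstype` and `isassigned` from Base, and the initial contents it inspects are the behavior observed in this thread, not something to rely on in general.

```julia
# Which element types are bits types?
int_is_bits    = isbitstype(Int)      # true
string_is_bits = isbitstype(String)   # false

# Bits-type union: every entry of a fresh array reads back as missing.
v = Vector{Union{Missing, Int}}(undef, 1_000)
all_missing = all(ismissing, v)

# Non-bits union: entries are unassigned references (#undef).
s = Vector{Union{Missing, String}}(undef, 1_000)
any_assigned = any(i -> isassigned(s, i), eachindex(s))

println((int_is_bits, string_is_bits, all_missing, any_assigned))
```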

Answer 2

The difference is that union arrays are zero-initialized. You can see the code that decides this here:

https://github.com/JuliaLang/julia/blob/3f024fd0ab9e68b37d29fee6f2a9ab19819102c5/src/array.c#L191

This ends up calling memset:

https://github.com/JuliaLang/julia/blob/3f024fd0ab9e68b37d29fee6f2a9ab19819102c5/src/array.c#L144-L145

So, as a check, we can compare zeros against allocating a union array:

julia> @time Vector{Union{Missing, Int}}(undef, 10^7);
  0.020609 seconds (2 allocations: 85.831 MiB)

julia> @time zeros(Int, 10^7);
  0.018375 seconds (2 allocations: 76.294 MiB)

The timings are quite comparable.

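However, I don't think this performance difference will end up mattering in your application unless it is structured in a very unusual way. It takes very little work on the array before the allocation time becomes negligible. For example, just setting the values of the uninitialized array brings its timing much closer to that of the union array: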

julia> function f()
           a = Vector{Int}(undef, 10^7)
           for i in eachindex(a)
               a[i] = 1
           end
           a
       end;

julia> function f_union()
           a = Vector{Union{Missing, Int}}(undef, 10^7)
           for i in eachindex(a)
               a[i] = 1
           end
           a
       end;

julia> @time f();
  0.015566 seconds (2 allocations: 76.294 MiB)

julia> @time f_union();
  0.026414 seconds (2 allocations: 85.831 MiB)

Answer 3

We ran into the same problem, and as a workaround we used

x = Vector{Union{T,Missing}}(undef,1)
resize!(x, newlen)
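
For reuse, the two lines above could be wrapped in a small helper. This is only a sketch: `alloc_union_vector` is a hypothetical name, not a Base function, and since the documentation for `resize!` says new elements are not guaranteed to be initialized, the example assigns every entry before reading any of them.

```julia
# Hypothetical helper wrapping the workaround above; the name and
# signature are illustrative, not part of Julia's standard library.
function alloc_union_vector(::Type{T}, n::Integer) where {T}
    x = Vector{Union{T, Missing}}(undef, 1)  # tiny initial allocation
    resize!(x, n)                            # grow to the requested length
    return x
end

x = alloc_union_vector(Int, 10^6)
fill!(x, 0)        # assign every entry before reading
x[1] = missing     # the element type still accepts missing
```

Whether this is actually faster depends on the Julia version, so it is worth timing it against the direct constructor on your own setup.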

arrays

performance

julia