Reducing memory allocation of a generator in Julia

I am trying to reduce the memory allocation of an inner loop in my code. Below is the part that is not working as expected.

using Random 
using StatsBase
using BenchmarkTools
using Distributions

a_dist = Distributions.DiscreteUniform(1, 99)
v_dist = Distributions.DiscreteUniform(1, 2)
population_size = 10000
population = [rand(a_dist, population_size) rand(v_dist, population_size)]


find_all_it3(f::Function, A) = (p[2] for p in eachrow(A) if f(p[1]))

@btime begin 
    c_pool = find_all_it3(x -> (x < 5), population)
    c_pool_dict = countmap(c_pool, alg=:dict)
end


@btime begin
    c_pool_indexes = findall(x -> (x < 5), view(population, :, 1))
    c_pool_dict = countmap(population[c_pool_indexes, 2], alg=:dict)
end

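I expected the generator (find_all_it3) not to need much memory allocation. However, according to the btime output, there appears to be one allocation per loop iteration.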

  98.040 μs (10006 allocations: 625.64 KiB)
  18.894 μs (18 allocations: 11.95 KiB)

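In my scenario, the speed and allocations of findall eventually become a problem, so I am trying to find a better alternative using generators/iterators that reduces allocations. Is there a way to do this? Are there options worth considering?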
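One option in this direction, as a minimal sketch only: build the generator over row indices rather than over eachrow, so that no row view is constructed per element. The names find_all_idx and count_values are hypothetical, the `where {F}` annotation is an assumption meant to ask Julia to specialize on the predicate type, and whether this actually removes the per-row allocations depends on the Julia version, so it should be measured rather than taken for granted.

```julia
# Hypothetical helper: lazily yield the column-2 value of every row whose
# column-1 value satisfies f. A[i, 1] and A[i, 2] are indexed directly,
# so no row view is created for each row.
find_all_idx(f::F, A) where {F} = (A[i, 2] for i in axes(A, 1) if f(A[i, 1]))

# Hypothetical helper: count occurrences from any iterator using only Base.
# The key type T is passed explicitly because a generator does not report
# a concrete element type.
function count_values(::Type{T}, itr) where {T}
    counts = Dict{T,Int}()
    for v in itr
        counts[v] = get(counts, v, 0) + 1
    end
    return counts
end

population = [rand(1:99, 10_000) rand(1:2, 10_000)]
c_pool = find_all_idx(x -> x < 5, population)
c_pool_dict = count_values(Int, c_pool)
```

The generator can presumably also be passed to countmap(c_pool, alg=:dict) as in the code above; the Base-only counter is used here just to keep the sketch free of package dependencies.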

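I don't have an explanation, but here are the results of some tests I ran:

  • The best time comes from view(population, :, 1) .< 5 (test4)
  • Using broadcast! reduces allocations a little (test5)
  • The best way to reduce allocations is to write your own loop (test6)
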
using BenchmarkTools
using StatsBase

population_size = 10000
population = [rand(1:99, population_size) rand(1:2, population_size)]

find_all_it(f::Function, A) = (p[2] for p in eachrow(A) if f(p[1]))

function test1(population)
    c_pool = find_all_it(x -> x < 5, population)
    c_pool_dict = countmap(c_pool, alg=:dict)
end

function test3(population)
    c_pool_indexes = findall(x -> x < 5, view(population, :, 1))
    c_pool_dict = countmap(view(population, c_pool_indexes, 2), alg=:dict)
end

function test4(population)
    c_pool_indexes = view(population, :, 1) .< 5
    c_pool_dict = countmap(view(population,c_pool_indexes, 2), alg=:dict)
end

function test5(c_pool_indexes, population)
    broadcast!(<, c_pool_indexes, view(population, :, 1), 5)
    c_pool_dict = countmap(view(population,c_pool_indexes, 2), alg=:dict)
end

function test6(population)
    d = Dict{Int,Int}()
    for i in eachindex(view(population, :, 1))
        if population[i, 1] < 5
            d[population[i,2]] = 1 + get(d,population[i,2],0)
        end
    end
    return d
end

julia> @btime test1(population);
  68.200 μs (10004 allocations: 625.59 KiB)

julia> @btime test3(population);
  14.800 μs (14 allocations: 9.00 KiB)

julia> @btime test4(population);
  7.250 μs (8 allocations: 9.33 KiB)

julia> temp = zeros(Bool, population_size);

julia> @btime test5(temp, population);
  16.599 μs (5 allocations: 3.78 KiB)

julia> @btime test6(population);
  11.299 μs (4 allocations: 608 bytes)
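
The hand-written loop in test6 hard-codes the predicate x < 5. A minimal sketch of a reusable variant follows, assuming a matrix shaped like population. The names count_where and is_small are hypothetical, and the `where {F,T}` annotation is there to ask Julia to specialize on the predicate type and to give the dictionary a concrete key type. Base's `@allocated` is used as a rough check that does not need BenchmarkTools; no figure is quoted, since it depends on the Julia version and machine.

```julia
# Hypothetical generalization of test6: count the column-2 values of the rows
# whose column-1 value satisfies the predicate f.
function count_where(f::F, A::AbstractMatrix{T}) where {F,T}
    counts = Dict{T,Int}()
    for i in axes(A, 1)
        if f(A[i, 1])
            key = A[i, 2]
            counts[key] = get(counts, key, 0) + 1
        end
    end
    return counts
end

population = [rand(1:99, 10_000) rand(1:2, 10_000)]
is_small(x) = x < 5

count_where(is_small, population)   # warm-up call so compilation is not measured
bytes = @allocated count_where(is_small, population)
println("bytes allocated: ", bytes)
```

Passing the predicate as an argument keeps the single pass and the single dictionary of test6 while letting the filter change without rewriting the loop.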