How can I get a consistent return type from this function?

Question

Is there some way to make the function below return a consistent type? I'm doing some work with Julia GLM (love it). I wrote a function that creates all of the possible regression combinations for a dataset. However, my current method of creating a @formula returns a different type for every different length of rhs.

using GLM

function compose(lhs::Symbol, rhs::AbstractVector{Symbol})
    ts = term.((1, rhs...))
    term(lhs) ~ sum(ts)
end

Running @code_warntype on a simple example returns the following:

julia> @code_warntype compose(:y, [:x])
Variables
  #self#::Core.Compiler.Const(compose, false)
  lhs::Symbol
  rhs::Array{Symbol,1}
  ts::Any

Body::FormulaTerm{Term,_A} where _A
1 ─ %1 = Core.tuple(1)::Core.Compiler.Const((1,), false)
│   %2 = Core._apply(Core.tuple, %1, rhs)::Core.Compiler.PartialStruct(Tuple{Int64,Vararg{Symbol,N} where N}, Any[Core.Compiler.Const(1, false), Vararg{Symbol,N} where N])
│   %3 = Base.broadcasted(Main.term, %2)::Base.Broadcast.Broadcasted{Base.Broadcast.Style{Tuple},Nothing,typeof(term),_A} where _A<:Tuple
│        (ts = Base.materialize(%3))
│   %5 = Main.term(lhs)::Term
│   %6 = Main.sum(ts)::Any
│   %7 = (%5 ~ %6)::FormulaTerm{Term,_A} where _A
└──      return %7

And checking the return type for a couple of different inputs:

julia> compose(:y, [:x]) |> typeof
FormulaTerm{Term,Tuple{ConstantTerm{Int64},Term}}

julia> compose(:y, [:x1, :x2]) |> typeof
FormulaTerm{Term,Tuple{ConstantTerm{Int64},Term,Term}}

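We see that as the length of rhs changes, the return type changes too.

Can I change my compose function so that it always returns the same type? This isn't really a big issue: compiling for each new number of regressors only takes about 70 ms. It is really more of a "how can I improve my Julia skills?" question.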

Answer 1

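I do not think you can avoid the type instability here, because ~ expects the RHS to be a Term or a Tuple of Terms, and the type of a tuple depends on its length.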
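
To see why the length leaks into the type, here is a minimal sketch that uses only Base Julia (no GLM). The helper name to_tuple is hypothetical; it only mirrors the (1, rhs...) splat from compose.

```julia
# Hypothetical helper: splat a vector into a tuple, as (1, rhs...) does in compose.
to_tuple(v::AbstractVector{Symbol}) = (1, v...)

t1 = to_tuple([:x])
t2 = to_tuple([:x1, :x2])

# The number of elements is part of a tuple's type,
# so inputs of different lengths give different return types.
@assert typeof(t1) == Tuple{Int, Symbol}
@assert typeof(t2) == Tuple{Int, Symbol, Symbol}
@assert typeof(t1) != typeof(t2)

# A Vector, by contrast, has one concrete type whatever its length.
v1 = Union{Int, Symbol}[1, :x]
v2 = Union{Int, Symbol}[1, :x1, :x2]
@assert typeof(v1) == typeof(v2)
```

Since the vector length is only known at run time, the compiler cannot infer which tuple type comes back, and that is what @code_warntype reports as Any.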

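However, most of the compilation cost you are paying is in term.((1, rhs...)), because it invokes broadcasting, which is expensive to compile. Here is how you can do it in a cheaper way: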

function compose(lhs::Symbol, rhs::AbstractVector{Symbol})
    # Build the RHS tuple directly: one term per regressor, then the intercept term last.
    term(lhs) ~ ntuple(i -> i <= length(rhs) ? term(rhs[i]) : term(1), length(rhs) + 1)
end

or (this is a bit slower, but closer to your original code):

function compose(lhs::Symbol, rhs::AbstractVector{Symbol})
    term(lhs) ~ map(term, (1, rhs...))
end

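Finally, if you are doing this kind of computation, maybe you can drop the formula interface and pass matrices directly to lm or glm as the RHS. In that case you should be able to avoid the extra compilation cost, e.g.: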

julia> y = rand(10);

julia> x = rand(10, 2);

julia> @time lm(x,y);
  0.000048 seconds (18 allocations: 1.688 KiB)

julia> x = rand(10, 3);

julia> @time lm(x,y);
  0.000038 seconds (18 allocations: 2.016 KiB)

julia> y = rand(100);

julia> x = rand(100, 50);

julia> @time lm(x,y);
  0.000263 seconds (22 allocations: 121.172 KiB)
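
For the all-combinations use case from the question, the matrix route might look like the sketch below. The helpers fit_subset and all_subsets are hypothetical names, not part of GLM. As far as I know the matrix interface does not add an intercept for you, so the sketch prepends a column of ones explicitly.

```julia
using GLM

# Hypothetical helper: fit OLS on the columns `cols` of X plus an explicit intercept column.
function fit_subset(X::AbstractMatrix{Float64}, y::AbstractVector{Float64}, cols::Vector{Int})
    design = hcat(ones(size(X, 1)), X[:, cols])
    return lm(design, y)
end

# Hypothetical helper: every non-empty subset of 1:p, enumerated with bitmasks.
all_subsets(p::Int) =
    [[j for j in 1:p if (mask >> (j - 1)) & 1 == 1] for mask in 1:(2^p - 1)]

X = rand(100, 3)
y = rand(100)

# One fitted model per subset of regressors (7 models for 3 columns).
models = [fit_subset(X, y, cols) for cols in all_subsets(size(X, 2))]
```

Because the design matrix is always a Matrix{Float64}, the fitted models should all share one type regardless of how many regressors are included, so nothing has to be recompiled per subset size.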
