Error in contrasts: after grouping by a factor, the minus operator in `formula()` stops working

The error occurs when I use group_by() on a factor, even if that factor is later removed from the model using the minus operator (-). My motivating example:

library(tidyverse)
df = mtcars %>% mutate(am = factor(am))
fits = df %>%
  group_by(am) %>%
  do(fit = lm(formula(mpg ~ . - am), .)) # Returns the error

This gives the following error message:

Error in `contrasts<-`(`tmp`, value = contr.funs[1 + isOF[nn]]) : contrasts can be applied only to factors with 2 or more levels

I get the same error if I filter() instead of grouping:

fit_am0 = df %>% 
  filter(am == 0) %>%
  lm(formula(mpg ~ . - am), .) # Returns the error

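It is as though formula() does not properly handle the minus operator (- am) when the variable I am trying to remove is a factor and I also group (or filter) on it, i.e. the combination of the two. This is my guess, because the following examples run fine: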

fits = mtcars %>% # `am` is numeric
  group_by(am) %>%
  do(fit = lm(formula(mpg ~ . - am), .)) # No error

fit_am0 = df %>%
  filter(am == 0) %>%
  select(-am) %>% # `am` removed prior to running model
  lm(formula(mpg ~ .), .) # No error

fits2 = mtcars %>% 
  mutate(vs = factor(vs)) %>% # A non-grouped factor, later removed
  group_by(am) %>%
  do(fit = lm(formula(mpg ~ . - vs), .)) # No error

Is this a bug, or have I made a mistake in my motivating example?

I found a workaround: remove the factor in the data argument rather than in the formula argument, i.e. lm(formula = formula(mpg ~ .), data = select(., -am)).

library(tidyverse)
df = mtcars %>% mutate(am = factor(am))
fits = df %>%
  group_by(am) %>%
  do(fit = lm(
    formula(mpg ~ .), 
    select(., -am)
  )) # No error
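
A possible explanation for why removing the column from the data helps, sketched in base R. As far as I can tell, `- am` only removes `am` from the model terms; the variable still ends up in the model frame that lm() builds, where unused factor levels are dropped, so within a single group `am` is left with one level and setting contrasts fails. The names `sub`, `tt`, `mf` and the three result variables are illustrative only, and this is a diagnostic under those assumptions rather than a definitive account of the lm() internals.

```r
# Base R only; `sub`, `tt`, and `mf` are illustrative names.
df <- transform(mtcars, am = factor(am))
sub <- df[df$am == "0", ]   # same rows as filter(am == 0)

# Expand `.` against the data, as happens when the model frame is built.
tt <- terms(mpg ~ . - am, data = sub)
am_is_term     <- "am" %in% attr(tt, "term.labels")           # is `am` a model term?
am_is_variable <- "am" %in% all.vars(attr(tt, "variables"))   # is `am` still a variable?

# lm() requests its model frame with drop.unused.levels = TRUE.
mf <- model.frame(mpg ~ . - am, data = sub, drop.unused.levels = TRUE)
am_levels <- nlevels(mf$am)   # number of levels left for `am` in the model frame

print(c(am_is_term = am_is_term, am_is_variable = am_is_variable))
print(am_levels)
```

If this reading is right, it would also account for the examples that run: a numeric `am` needs no contrasts, and `vs` keeps both of its levels inside each `am` group.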