dplyr: what's the difference between the group_by and group_by_ functions?

Question

I can't figure out what the underscore variant of the function, group_by_(), is for.

From the group_by help:

by_cyl <- group_by(mtcars, cyl)  
summarise(by_cyl, mean(disp), mean(hp))

This produces the expected result:

Source: local data frame [3 x 3]  
    cyl mean(disp)  mean(hp)
1   4   105.1364  82.63636
2   6   183.3143 122.28571
3   8   353.1000 209.21429

But this:

by_cyl <- group_by_(mtcars, cyl)

produces an error:

"Error in as.lazy_dots(list(...)) : object 'cyl' not found"

So my question is: what does the underscore version do? And in what situations would I want to use it instead of the "regular" one?

Thanks

Answer 1

The dplyr vignette on non-standard evaluation is helpful here: http://cran.r-project.org/web/packages/dplyr/vignettes/nse.html

Note: the link above is now out of date, but the same information can be found on the package's GitHub page: https://github.com/tidyverse/dplyr/blob/34423af89703b0772d59edcd0f3485295b629ab0/vignettes/nse.Rmd

Dplyr uses non-standard evaluation (NSE) in all of the most important single table verbs: filter(), mutate(), summarise(), arrange(), select() and group_by(). NSE is important not only to save you typing, but for database backends, is what makes it possible to translate your R code to SQL. However, while NSE is great for interactive use it’s hard to program with. This vignette describes how you can opt out of NSE in dplyr, and instead rely only on SE (along with a little quoting).

...

Every function in dplyr that uses NSE also has a version that uses SE. There’s a consistent naming scheme: the SE is the NSE name with _ on the end. For example, the SE version of summarise() is summarise_(), the SE version of arrange() is arrange_(). These functions work very similarly to their NSE cousins, but the inputs must be “quoted”
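
Applied to the question, this is why group_by_(mtcars, cyl) fails: the SE version evaluates its arguments normally, so R looks for an object named cyl and does not find one. A minimal sketch of the quoted forms follows, assuming a dplyr version that still ships the underscore functions (they were deprecated in later releases). The helper mean_disp_by is a hypothetical name made up for illustration, not part of dplyr.

```r
library(dplyr)

# NSE version: the bare column name is captured, not evaluated.
by_cyl <- group_by(mtcars, cyl)

# SE version: the column must be quoted. Three common ways to quote it:
by_cyl_string  <- group_by_(mtcars, "cyl")       # a string
by_cyl_formula <- group_by_(mtcars, ~cyl)        # a one-sided formula
by_cyl_quote   <- group_by_(mtcars, quote(cyl))  # a quoted name

# Where the SE version is useful: the grouping column arrives in a variable.
# mean_disp_by() is a hypothetical helper, not a dplyr function.
mean_disp_by <- function(data, group_col) {
  summarise(group_by_(data, group_col), mean_disp = mean(disp))
}

result <- mean_disp_by(mtcars, "cyl")
print(result)
```

So the regular group_by() suits interactive use, while group_by_() is meant for code where the column name is only known at run time, for example as a function argument.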
