How to use dlply to run a function for two or more grouping variables in R?

Question

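I would like to use dlply from the plyr package to split my data by more than one attribute/variable.

The aggregate function handles multiple variables with the syntax by=list(data$var1, data$var2). What is the equivalent in dlply?

For example, with aggregate I would use the following syntax: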

library(sp)  # the meuse data set ships with the sp package
data(meuse)
aggregate(meuse[,3:7], by=list(meuse$landuse, meuse$dist.m), FUN=mean)
#    Group.1 Group.2   cadmium   copper     lead     zinc      elev
#1        Ah      10  7.500000 56.50000 167.5000  727.500  7.640000
#2        Fw      10  8.300000 77.00000 158.0000  761.000  7.360000
#3         W      10  9.800000 89.11111 299.7778 1090.222  7.449111
#4         W      20  9.075000 81.75000 263.0000 1009.750  6.909000
#5        Ah      30  8.600000 81.00000 277.0000 1141.000  6.983000
#6         W      40  2.700000 27.00000 124.0000  375.000  8.261000
#7        Ah      50 11.700000 85.00000 299.0000 1022.000  7.909000
#8         W      50 18.100000 76.00000 464.0000 1672.000  7.307000
#9        Ah      60  2.400000 47.00000 297.0000  832.000  8.809000
#10       Am      60  7.900000 67.00000 217.0000  833.000  7.784000

Here is the dlply version with only one grouping variable.

library(plyr)
#one way of calling attribute name
dlply(meuse, "landuse", function (x) lm(x$copper~x$lead))
#another way of calling attribute name
dlply(meuse, as.quoted(.(landuse)), function (x) lm(x$copper~x$lead))

Answer 1

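While writing this question I explored the related questions that popped up alongside it, which helped me understand the dlply syntax better. I found the solution by trial and error, but none of the earlier questions addressed my problem specifically, so here is a simple answer to what I was struggling with.

My original syntax, with the attribute name in quotes (""), did not seem to work for multiple attributes. Instead, the attribute/variable names can be placed between .( and the closing ), separated by commas, which allows two or more attributes/variables to be used.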

library(plyr)
library(sp)  # the meuse data set ships with the sp package
data(meuse)
dlply(meuse, as.quoted(.(landuse, dist.m)), function (x) lm(x$copper~x$lead))
#as.quoted seems to be optional, this works too
dlply(meuse, .(landuse, dist.m), function (x) lm(x$copper~x$lead))
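
For comparison, the plyr documentation describes the grouping argument as accepting quoted variables, a formula, or a character vector, so a vector of quoted names should work for several variables as well. The sketch below assumes the sp and plyr packages are installed. The helper fit and the object names by_quoted, by_character, by_formula and coefs are made up for illustration and are not part of either package.

```r
library(sp)    # provides the meuse data set
library(plyr)

data(meuse)

# Hypothetical helper: one linear model per piece of the data frame
fit <- function(d) lm(copper ~ lead, data = d)

# Three ways of naming the same two grouping variables
by_quoted    <- dlply(meuse, .(landuse, dist.m), fit)
by_character <- dlply(meuse, c("landuse", "dist.m"), fit)
by_formula   <- dlply(meuse, ~ landuse + dist.m, fit)

# Each list element is named after its landuse/dist.m combination
head(names(by_quoted))

# ldply gathers the coefficients into a data frame, one row per group
coefs <- ldply(by_quoted, coef)
head(coefs)
```

Passing data = d to lm rather than writing x$copper ~ x$lead keeps the coefficient names short (lead instead of x$lead), which makes the collected table easier to read. Groups that contain a single observation still produce a model, but the slope for lead comes back as NA.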
