r data.table 计算值列与另一列标识的值的比率

r data.table calculate ratio of value column to value identified by another column

我有一个数据 table,其中包含两列(crop 和 plotnum),它们标识组 (crop) 和组的元素 (plotnum)。我想计算组内子组的值与组中其中一个子组的值的比率(例如 plot_4)。思路是比较组内子组的表现。

这是示例数据 table。

DT <- data.table(value = c(71.03528, 212.1139, 91.72795, 37.65809, 194.7215, 334.2902, 371.2177, 122.9278, 218.7671, 552.3567, 193.3274, 147.1677, 197.4963, 194.9721, 156.7755, 184.1638), 
                 crop = c("BellPepper", "BellPepper", "BellPepper", "BellPepper", "ChiliPepper", "ChiliPepper", "ChiliPepper", "ChiliPepper", "Eggplant", "Eggplant", "Eggplant", "Eggplant", "Kale", "Kale", "Kale", "Kale"),
                 plotnum = c("plot_1", "plot_2", "plot_3", "plot_4", "plot_1", "plot_2", "plot_3", "plot_4", "plot_1", "plot_2", "plot_3", "plot_4", "plot_1", "plot_2", "plot_3", "plot_4"
                 ))

一种方法是在新列中复制组的 plot_4 值,然后计算值列与 plot_4 列的比率,但我不能找出如何获取子组值并仅填充 plot_4 列中组的单元格。也欢迎其他方法!

我们可以根据子组逐个分组

library(data.table)
DT[, ratio := value/value[plotnum == 'plot_4'], by = crop]

如果您想将每一行与 所有 进行比较,则扩展是唯一的 plotnum:

DT[, (unique(DT$plotnum)) :=
    lapply(setNames(nm=unique(plotnum)), function(pn) value/value[plotnum == pn]), 
    by = crop]
#         value        crop plotnum    plot_1    plot_2    plot_3    plot_4
#         <num>      <char>  <char>     <num>     <num>     <num>     <num>
#  1:  71.03528  BellPepper  plot_1 1.0000000 0.3348921 0.7744126 1.8863219
#  2: 212.11390  BellPepper  plot_2 2.9860360 1.0000000 2.3124239 5.6326250
#  3:  91.72795  BellPepper  plot_3 1.2913013 0.4324467 1.0000000 2.4358099
#  4:  37.65809  BellPepper  plot_4 0.5301322 0.1775371 0.4105411 1.0000000
#  5: 194.72150 ChiliPepper  plot_1 1.0000000 0.5824924 0.5245480 1.5840314
#  6: 334.29020 ChiliPepper  plot_2 1.7167606 1.0000000 0.9005233 2.7194028
#  7: 371.21770 ChiliPepper  plot_3 1.9064032 1.1104654 1.0000000 3.0198027
#  8: 122.92780 ChiliPepper  plot_4 0.6313006 0.3677278 0.3311475 1.0000000
#  9: 218.76710    Eggplant  plot_1 1.0000000 0.3960613 1.1315887 1.4865157
# 10: 552.35670    Eggplant  plot_2 2.5248618 1.0000000 2.8571051 3.7532468
# 11: 193.32740    Eggplant  plot_3 0.8837133 0.3500046 1.0000000 1.3136537
# 12: 147.16770    Eggplant  plot_4 0.6727140 0.2664360 0.7612356 1.0000000
# 13: 197.49630        Kale  plot_1 1.0000000 1.0129465 1.2597396 1.0723948
# 14: 194.97210        Kale  plot_2 0.9872190 1.0000000 1.2436388 1.0586885
# 15: 156.77550        Kale  plot_3 0.7938149 0.8040920 1.0000000 0.8512829
# 16: 184.16380        Kale  plot_4 0.9324924 0.9445649 1.1746976 1.0000000

虽然这仅在您打算将一个与所有其他 plotnums 进行比较时才有用。如果你不需要扩展所有的plotnums,那么inner方法是基于另一个答案的,所以应该接受一个。