Even cuts based on another variable

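How can we cut one variable so that the sums of another variable within those cuts are even?

For example,

I would like the sums of var2 to be more even across the cuts.

Code: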

library(data.table)
dt = data.table(var1=c(0.6,0.2,0.5,0.8,0.10,0.1,0.2,0.5,0.3,0.5),
                var2=c(20,400,350,50,100,490,1200,900,1850,70))
# Split var1 into 3 equal-width intervals
dt[, cuts := cut(var1, breaks = 3)]
# Sum var2 within each interval
dt[, .(var2 = sum(var2)), by = cuts]

Thanks!

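One way is to build a vector in which each var1 value is repeated in proportion to its var2 value, and then use that vector to create equal-sized bins, e.g.,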

library(data.table)
library(Hmisc)

dt = data.table(var1=c(0.6,0.2,0.5,0.8,0.10,0.1,0.2,0.5,0.3,0.5),
                var2=c(20,400,350,50,100,490,1200,900,1850,70))

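# Integer weight per row: var2 relative to the smallest var2 (rounded, so approximate)
dt[, var3 := round(var2 / min(var2))]

# Repeat each var1 value var3 times so that var2 acts as a frequency weight
cc = rep(dt[, var1], dt[, var3])

# Cut points that split the expanded vector into 3 groups of roughly equal size
labs = cut2(cc, g = 3, onlycuts = TRUE)

# Apply those cut points to the original var1, then sum var2 per bin
dt[, cuts := cut2(var1, cuts = labs)]

dt[, .(var2 = sum(var2)), by = cuts]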

#         cuts var2
# 1: [0.5,0.8] 1390
# 2: [0.1,0.3) 2190
# 3:       0.3 1850
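
If expanding the vector with rep() is too memory-hungry, or if rounding the weights is a concern, the same idea can be sketched directly on cumulative shares of var2. This is a different technique from the cut2() approach above, not a drop-in replacement: weighted_bins is a hypothetical helper written for this illustration (it is not part of data.table or Hmisc), and it assumes var2 is non-negative and that rows with equal var1 must share a bin.

```r
# Hypothetical helper: assign each value of x to one of g bins so that the
# total weight w per bin is roughly equal. Equal values of x share a bin.
weighted_bins <- function(x, w, g) {
  ux    <- sort(unique(x))                                     # distinct values, ascending
  w_ux  <- vapply(ux, function(v) sum(w[x == v]), numeric(1))  # total weight per distinct value
  share <- w_ux / sum(w_ux)                                    # share of the overall weight
  mid   <- cumsum(share) - share / 2                           # midpoint of each value's cumulative-share interval
  bin_ux <- pmin(g, floor(mid * g) + 1)                        # bin index 1..g for each distinct value
  bin_ux[match(x, ux)]                                         # map back to the original rows
}

var1 <- c(0.6, 0.2, 0.5, 0.8, 0.10, 0.1, 0.2, 0.5, 0.3, 0.5)
var2 <- c(20, 400, 350, 50, 100, 490, 1200, 900, 1850, 70)

bin  <- weighted_bins(var1, var2, g = 3)
sums <- tapply(var2, bin, sum)   # total var2 per bin
print(sums)
```

On the example data this gives the same grouping as the output above (2190, 1850 and 1390). With data.table it could be applied as dt[, cuts := weighted_bins(var1, var2, 3)]. Because a single var1 value can carry a large share of var2, the bins are only as even as the data allow.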