Get mean and interval from simulated variable
I have a dataset like this:
library(data.table)
library(EnvStats)
library(bayestestR)
DT <- data.table(MEAN = c(0.5,0.7,0.9),MIN = c(0.4,0.6,0.8),MAX = c(0.6,0.8,1),REF = rnorm(3,1000,200))
I computed a variable from simulated values based on the variables MEAN, MIN and MAX:
DT[,Sim_rtri := list(REF*(1+rtri(n = 1000,min = MIN,max = MAX,mode = MEAN)))]
But I get the same values for every row, even though I need a separate simulation for each row. How can I do that?
Also, I want to create two variables, one holding the mean of Sim_rtri and the other holding its interval. I tried this:
DT[,Mean_Sim_rtri := mean(Sim_rtri)]
DT[,Int_Sim_rtri := ci(Sim_rtri, method = "ETI",ci = .95)]
But I get errors from this. What else can I do?
It becomes clearer when you run your first line of code without assigning the result:
set.seed(42)
DT <- data.table(MEAN = c(0.5,0.7,0.9),MIN = c(0.4,0.6,0.8),MAX = c(0.6,0.8,1),REF = rnorm(3,1000,200))
DT[,list(REF*(1+rtri(n = 1000,min = MIN,max = MAX,mode = MEAN)))]
V1
1: 1946.223
2: 1465.333
3: 2056.410
4: 1940.845
5: 1504.171
---
996: 1968.724
997: 1962.222
998: 1511.566
999: 2037.884
1000: 1810.734
Warning message:
In REF * (1 + rtri(n = 1000, min = MIN, max = MAX, mode = MEAN)) :
longer object length is not a multiple of shorter object length
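The warning above comes from R's vector recycling rule. As a minimal sketch in base R (the numbers here are made-up stand-ins, not the values from the question), multiplying a length-3 vector by a length-1000 vector gives one vector of 1000 values rather than three vectors of 1000:

```r
# Base R only. `ref` and `sims` are hypothetical stand-ins for REF and the
# 1000 simulated draws; they are not taken from the question's data.
ref  <- c(1000, 2000, 3000)
sims <- rep(1, 1000)

# R recycles `ref` along `sims`: 1000, 2000, 3000, 1000, 2000, ...
# It also warns, because 1000 is not a multiple of 3 (suppressed here).
out <- suppressWarnings(ref * sims)

length(out)   # 1000
head(out, 6)  # 1000 2000 3000 1000 2000 3000
```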
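It is creating one list of length 1000 rather than 3 list-columns (of 1000 each), because it is recycling the values in the data.table (note how the general pattern of V1 is ~1900...1500...2000). In any case, there may be a more idiomatic data.table way to solve this, but does using Map() get closer to the result you want?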
set.seed(42)
DT <- data.table(MEAN = c(0.5,0.7,0.9),MIN = c(0.4,0.6,0.8),MAX = c(0.6,0.8,1),REF = rnorm(3,1000,200))
DT[, Sim_rtri := Map(function(w, x, y, z) w*(1+rtri(n = 1000,min = x,max = y,mode = z)), REF, MIN, MAX, MEAN)]
DT[, Mean_Sim_rtri := sapply(Sim_rtri, mean)]
DT[, Int_Sim_rtri := lapply(Sim_rtri, ci, method = "ETI",ci = .95)]
DT
MEAN MIN MAX REF Sim_rtri Mean_Sim_rtri Int_Sim_rtri
1: 0.5 0.4 0.6 1274.1917 1946.223,1849.996,1933.170,1940.845,1905.784,1943.204,... 1908.901 <bayestestR_eti>
2: 0.7 0.6 0.8 887.0604 1512.938,1530.315,1480.203,1542.298,1500.740,1513.961,... 1507.717 <bayestestR_eti>
3: 0.9 0.8 1.0 1072.6257 2055.113,2085.123,1991.335,2022.209,2010.288,1984.313,... 2038.466 <bayestestR_eti>
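If plain numeric bounds are more convenient than the objects stored in Int_Sim_rtri, one option is sketched below. It swaps ci() for base quantile(), on the assumption that a 95% equal-tailed interval can be taken as the 2.5% and 97.5% quantiles of each row's draws, so the numbers may differ slightly from what bayestestR reports. The column names CI_low and CI_high and the argument names ref, lo, hi and peak are chosen for illustration only.

```r
library(data.table)
library(EnvStats)

set.seed(42)
DT <- data.table(MEAN = c(0.5, 0.7, 0.9),
                 MIN  = c(0.4, 0.6, 0.8),
                 MAX  = c(0.6, 0.8, 1.0),
                 REF  = rnorm(3, 1000, 200))

# One vector of 1000 draws per row, stored as a list-column.
DT[, Sim_rtri := Map(function(ref, lo, hi, peak) {
  ref * (1 + rtri(n = 1000, min = lo, max = hi, mode = peak))
}, REF, MIN, MAX, MEAN)]

# Per-row mean of the simulated draws.
DT[, Mean_Sim_rtri := vapply(Sim_rtri, mean, numeric(1))]

# Assumption: equal-tailed 95% interval via quantile(), standing in for ci().
DT[, CI_low  := vapply(Sim_rtri, quantile, numeric(1), probs = 0.025, names = FALSE)]
DT[, CI_high := vapply(Sim_rtri, quantile, numeric(1), probs = 0.975, names = FALSE)]

DT[, .(MEAN, REF, Mean_Sim_rtri, CI_low, CI_high)]
```

Using vapply() with numeric(1) instead of sapply() makes each summary column fail loudly if a row ever returns something other than a single number.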