How to handle data.frame output in j when grouping by and using lapply (data.table)
I don't know how to handle data.frame output in j when using data.table. I hope the MRE is self-explanatory:
library(data.table)
library(TTR)
? BBands
data(ttrc)
dt <- as.data.table(ttrc)
dt$symbol <- "a"
dt$symbol[1:200] <- "b"
window_sizes <- c(5, 22)
new_cols <- expand.grid("bbands", c("dn", "mavg", "up", "pctB"), window_sizes)
new_cols <- paste(new_cols$Var1, new_cols$Var2, new_cols$Var3, sep = "_")
output_bbands <- dt[, (new_cols) := lapply(window_sizes, function(w) BBands(Close, n = w)), by = symbol]
The code returns an error:
Error in `[.data.table`(dt, , lapply(window_sizes, function(w) BBands(Close, :
All items in j=list(...) should be atomic vectors or lists. If you are trying something like j=list(.SD,newcol=mean(colA)) then use := by group instead (much quicker), or cbind or merge afterward.
The function returns 4 columns. I want to add all 4 columns to my dt.
You can use do.call and cbind to combine the list of per-window results into a single data frame and bind it to the original table:
cbind(dt, setNames(do.call(cbind.data.frame,
lapply(window_sizes, function(w) BBands(dt$Close, n = w))), new_cols))
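To make the do.call / setNames step easier to follow, here is a minimal base R sketch. The two small matrices are made-up stand-ins for what BBands returns for each window (an assumption for illustration, not real TTR output), and the column names are hypothetical.

```r
# Two made-up matrices standing in for the per-window results
mats <- list(
  cbind(dn = 1:3, up = 4:6),
  cbind(dn = 7:9, up = 10:12)
)

# cbind.data.frame turns every matrix column into a data frame column;
# the duplicated names "dn" and "up" are kept as they are
wide <- do.call(cbind.data.frame, mats)

# setNames replaces the duplicated names with unique ones
wide <- setNames(wide, c("dn_5", "up_5", "dn_22", "up_22"))

str(wide)
```

The result has one row per input row and one column per (statistic, window) pair, which is the shape cbind needs in order to attach it to the original table.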
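Based on the OP's code, BBands should be applied grouped by 'symbol', and the error occurs only because the output is a list of matrices. For example, if we do this on the whole data, the structure would be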
str(lapply(window_sizes, function(w) BBands(dt$Close, n = w)))
List of 2
$ : num [1:5550, 1:4] NA NA NA NA 3.07 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:4] "dn" "mavg" "up" "pctB"
$ : num [1:5550, 1:4] NA NA NA NA NA NA NA NA NA NA ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:4] "dn" "mavg" "up" "pctB"
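The distinction that matters here is that a matrix is a single atomic object with a dim attribute, while a data frame is a list with one element per column, which is the shape `:=` expects when several columns are assigned. A small base R sketch, where toy_bands is a hypothetical stand-in for BBands:

```r
# Hypothetical stand-in for BBands(): returns a matrix with named columns
toy_bands <- function(x, n) {
  cbind(dn = x - n, up = x + n)
}

m <- toy_bands(c(1, 2, 3), n = 1)

is.matrix(m)   # TRUE
is.list(m)     # FALSE: one atomic object, not a list of columns

# Converting to a data frame yields a list with one element per column
cols <- as.data.frame(m)
is.list(cols)  # TRUE
names(cols)    # "dn" "up"
```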
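We can make a small change to the OP's code: convert each matrix to a data frame and cbind the results, which keeps the data.table code similar and still groups by 'symbol'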
dt[, (new_cols) := do.call(cbind, lapply(window_sizes, function(w)
as.data.frame(BBands(Close, n = w)))), by = symbol]
head(dt)
Date Open High Low Close Volume symbol bbands_dn_5 bbands_mavg_5 bbands_up_5 bbands_pctB_5 bbands_dn_22 bbands_mavg_22 bbands_up_22 bbands_pctB_22
1: 1985-01-02 3.18 3.18 3.08 3.08 1870906 b NA NA NA NA NA NA NA NA
2: 1985-01-03 3.09 3.15 3.09 3.11 3099506 b NA NA NA NA NA NA NA NA
3: 1985-01-04 3.11 3.12 3.08 3.09 2274157 b NA NA NA NA NA NA NA NA
4: 1985-01-07 3.09 3.12 3.07 3.10 2086758 b NA NA NA NA NA NA NA NA
5: 1985-01-08 3.10 3.12 3.08 3.11 2166348 b 3.074676 3.098 3.121324 0.7572479 NA NA NA NA
6: 1985-01-09 3.12 3.17 3.10 3.16 3441798 b 3.065668 3.114 3.162332 0.9758734 NA NA NA NA
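For readers without TTR installed, the same pattern can be tried on a toy table. This is only a sketch under assumptions: toy_bands is a hypothetical replacement for BBands built on data.table's frollmean (available in reasonably recent data.table versions), and the table, window sizes, and column names are made up.

```r
library(data.table)

# Hypothetical stand-in for BBands(): a rolling mean plus/minus 1, as a matrix
toy_bands <- function(x, n) {
  avg <- frollmean(x, n)  # NA for the first n - 1 values
  cbind(dn = avg - 1, up = avg + 1)
}

toy <- data.table(symbol = rep(c("a", "b"), each = 4),
                  Close  = as.numeric(1:8))

windows  <- c(2, 3)
grid     <- expand.grid(c("dn", "up"), windows)
toy_cols <- paste("toy", grid$Var1, grid$Var2, sep = "_")

# Each matrix becomes a data frame, the data frames are bound column-wise,
# and := assigns the resulting list of columns within each symbol group
toy[, (toy_cols) := do.call(cbind, lapply(windows, function(w)
  as.data.frame(toy_bands(Close, n = w)))), by = symbol]

print(toy)
```

Because expand.grid varies its first argument fastest, toy_cols lists all statistics for window 2 before those for window 3, which matches the column order produced by the lapply over windows.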
Note: if we just run lapply on the whole column, the answer will be incorrect, because no grouping is in effect when BBands is applied.
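The effect of missing grouping can be seen on a small made-up table. This sketch uses data.table's frollmean as a simple rolling statistic in place of BBands (an assumption for illustration): without by, the window at the first row of group "b" also includes the last value of group "a".

```r
library(data.table)

toy <- data.table(symbol = rep(c("a", "b"), each = 3),
                  Close  = c(1, 2, 3, 10, 20, 30))

# Ungrouped: the window slides across the a/b boundary
toy[, ma_all := frollmean(Close, 2)]

# Grouped: each symbol starts with its own NA warm-up period
toy[, ma_grp := frollmean(Close, 2), by = symbol]

print(toy)
# Row 4 (first row of "b"): ma_all is 6.5, the mean of 3 and 10,
# while ma_grp is NA because the window restarts inside the group
```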