How to handle data.frame output in j when grouping with by and using lapply (data.table)

I don't know how to handle data.frame output in j when using data.table. I hope the MRE is self-explanatory:

  library(data.table)
  library(TTR)
  ? BBands
  data(ttrc)
  dt <- as.data.table(ttrc)
  dt$symbol <- "a"
  dt$symbol[1:200] <- "b"
  window_sizes <- c(5, 22)
  new_cols <- expand.grid("bbands", c("dn", "mavg", "up", "pctB"), window_sizes)
  new_cols <- paste(new_cols$Var1, new_cols$Var2, new_cols$Var3, sep = "_")
  output_bbands <- dt[, (new_cols) := lapply(window_sizes, function(w) BBands(Close, n = w)), by = symbol]

The code returns an error:

Error in `[.data.table`(dt, , lapply(window_sizes, function(w) BBands(Close,  : 
  All items in j=list(...) should be atomic vectors or lists. If you are trying something like j=list(.SD,newcol=mean(colA)) then use := by group instead (much quicker), or cbind or merge afterward.

BBands returns 4 columns for each window size. I want to add all of these columns to my dt.

You can use do.call with cbind to combine the list of data frames into a single data frame and bind it to the original data:

cbind(dt, setNames(do.call(cbind.data.frame, 
      lapply(window_sizes, function(w) BBands(dt$Close, n = w))), new_cols))

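Based on the OP's code, BBands should be applied grouped by 'symbol', and the error occurs only because the output is a list of matrices. For example, if we run it on the whole data, the structure is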

str(lapply(window_sizes, function(w) BBands(dt$Close, n = w)))
List of 2
 $ : num [1:5550, 1:4] NA NA NA NA 3.07 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr [1:4] "dn" "mavg" "up" "pctB"
 $ : num [1:5550, 1:4] NA NA NA NA NA NA NA NA NA NA ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr [1:4] "dn" "mavg" "up" "pctB"
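
The shape problem can be reproduced without TTR. The following is a minimal sketch in base R only; `toy_bands` is a hypothetical stand-in for BBands (it is not part of any package and does not compute real Bollinger Bands), used here only because it also returns a 4-column numeric matrix.

```r
# Hypothetical stand-in for BBands: returns a 4-column numeric matrix.
# Only the shape of the output matters for this illustration.
toy_bands <- function(x, n) {
  cbind(dn = x - n, mavg = x, up = x + n, pctB = x / n)
}

x <- c(1, 2, 3)
res <- lapply(c(5, 22), function(w) toy_bands(x, n = w))

# Each element carries a dim attribute: it is a matrix, not a list of columns.
is.matrix(res[[1]])   # TRUE
is.list(res[[1]])     # FALSE

# as.data.frame() turns each matrix into a list of column vectors, and
# cbind() on data frames joins them into one data frame of 2 * 4 columns.
flat <- do.call(cbind, lapply(res, as.data.frame))
is.list(flat)         # TRUE
ncol(flat)            # 8
```

A data frame is a list of equal-length columns, which is why converting each matrix first gives j something it can assign column by column.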

We can make a small change to the OP's code, keeping the data.table code similar and grouping by 'symbol', by using cbind:

dt[, (new_cols) := do.call(cbind, lapply(window_sizes, function(w) 
           as.data.frame(BBands(Close, n = w)))), by = symbol] 
head(dt)
          Date Open High  Low Close  Volume symbol bbands_dn_5 bbands_mavg_5 bbands_up_5 bbands_pctB_5 bbands_dn_22 bbands_mavg_22 bbands_up_22 bbands_pctB_22
 1: 1985-01-02 3.18 3.18 3.08  3.08 1870906      b          NA            NA          NA            NA           NA             NA           NA             NA
 2: 1985-01-03 3.09 3.15 3.09  3.11 3099506      b          NA            NA          NA            NA           NA             NA           NA             NA
 3: 1985-01-04 3.11 3.12 3.08  3.09 2274157      b          NA            NA          NA            NA           NA             NA           NA             NA
 4: 1985-01-07 3.09 3.12 3.07  3.10 2086758      b          NA            NA          NA            NA           NA             NA           NA             NA
 5: 1985-01-08 3.10 3.12 3.08  3.11 2166348      b    3.074676         3.098    3.121324     0.7572479           NA             NA           NA             NA
 6: 1985-01-09 3.12 3.17 3.10  3.16 3441798      b    3.065668         3.114    3.162332     0.9758734           NA             NA           NA             NA
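
As a variation on the same idea (not taken from the answers above), the list of data frames can also be flattened one level with `unlist(..., recursive = FALSE)` instead of `do.call(cbind, ...)`. This is a sketch under assumptions: `toy_bands`, `toy`, `windows` and `cols` are hypothetical names, and `toy_bands` is a stand-in that returns a 2-column matrix so the example runs without TTR.

```r
library(data.table)

# Hypothetical stand-in for BBands: a 2-column matrix, one row per input value.
toy_bands <- function(x, n) cbind(lo = x - n, hi = x + n)

toy <- data.table(symbol = c("b", "b", "a", "a"), Close = c(1, 2, 10, 20))
windows <- c(5, 22)
cols <- c("lo_5", "hi_5", "lo_22", "hi_22")

# Each matrix becomes a data frame (a list of columns); the non-recursive
# unlist() then flattens the list of data frames into one flat list of
# four column vectors, which := assigns per group.
toy[, (cols) := unlist(
  lapply(windows, function(w) as.data.frame(toy_bands(Close, n = w))),
  recursive = FALSE
), by = symbol]

toy
```

Either form hands `:=` a flat list with one vector per new column, so the choice between them is mostly a matter of taste.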

Note: if we just run lapply on the whole column, the answer will be incorrect, because no grouping is in effect when BBands is applied.
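
The effect of the missing grouping is easy to see on a toy table. This sketch assumes a data.table version that provides `frollmean` (1.12.0 or later) and uses a 2-row rolling mean in place of BBands; the table `toy` and the column names `ma_all` and `ma_grp` are made up for the illustration.

```r
library(data.table)

toy <- data.table(symbol = c("b", "b", "b", "a", "a", "a"),
                  Close  = c(1, 2, 3, 10, 20, 30))

# Ungrouped: the rolling window runs straight across the b/a boundary,
# so row 4 averages the last "b" value with the first "a" value.
toy[, ma_all := frollmean(Close, 2)]

# Grouped: the window restarts for every symbol,
# so the first row of each group is NA.
toy[, ma_grp := frollmean(Close, 2), by = symbol]

toy
```

Row 4 is where the two columns differ: `ma_all` is 6.5 (the mean of 3 and 10), while `ma_grp` is NA because symbol "a" has no earlier row of its own.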