How can I subset an array using the logical row of a data.table?

Question

I basically want to do something like this:

library(data.table)

all_factors <- c('f1', 'f2', 'f3', 'f4', 'f5', 'f6')
factor_perms <- do.call(CJ, replicate(length(all_factors), c(T, F), FALSE))
for (j in 2:nrow(factor_perms)) {
  factors <- all_factors[factor_perms[j, ]]  # this line raises the error below
}

I get: Error in all_factors[factor_perms[j, ]] : invalid subscript type 'list'. How can I convert the row to an array, i.e. drop the data.table's column names?

Answer 1

Because factor_perms is a data.table, each row is still a list, so you need unlist to turn it into a logical vector, e.g.,

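library(data.table)

all_factors <- c("f1", "f2", "f3", "f4", "f5", "f6")
# all TRUE/FALSE combinations, one column per factor
factor_perms <- do.call(CJ, replicate(length(all_factors), c(T, F), FALSE))
# one list slot per row, skipping row 1 (all FALSE)
factors <- vector(mode = "list", nrow(factor_perms) - 1)
for (j in 2:nrow(factor_perms)) {
  # unlist() turns the one-row data.table into a logical vector
  factors[[j - 1]] <- all_factors[unlist(factor_perms[j, ])]
}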

which gives

> head(factors)
[[1]]
[1] "f6"

[[2]]
[1] "f5"

[[3]]
[1] "f5" "f6"

[[4]]
[1] "f4"

[[5]]
[1] "f4" "f6"

[[6]]
[1] "f4" "f5"

Answer 2

If you're interested, you can drop the for loop and do it in one step:

wh <- which(as.matrix(factor_perms), arr.ind = TRUE)
factors <- split(all_factors[wh[,2]], wh[,1])

head(factors)
# $`2`
# [1] "f6"
# $`3`
# [1] "f5"
# $`4`
# [1] "f5" "f6"
# $`5`
# [1] "f4"
# $`6`
# [1] "f4" "f6"
# $`7`
# [1] "f4" "f5"

Note that wh is in column-major order,

head(wh)
#      row col
# [1,]  33   1
# [2,]  34   1
# [3,]  35   1
# [4,]  36   1
# [5,]  37   1
# [6,]  38   1

The split step groups by the distinct row values and returns the groups sorted by those values, so the result ends up in row order.
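
The which/split combination may be easier to follow on a tiny matrix. The sketch below uses only base R; the matrix m and the vector labels are made-up stand-ins for as.matrix(factor_perms) and all_factors.

```r
# A small logical matrix standing in for as.matrix(factor_perms)
m <- matrix(c(FALSE, TRUE, TRUE,    # column 1
              TRUE,  FALSE, TRUE),  # column 2
            nrow = 3)
labels <- c("a", "b")  # hypothetical labels, one per column

# (row, col) index pairs of the TRUE cells, listed column by column
wh <- which(m, arr.ind = TRUE)
wh

# Group the column labels by row index; split() orders the groups by row
groups <- split(labels[wh[, "col"]], wh[, "row"])
groups
```

Rows with no TRUE cell simply do not appear in the result, which is why the output above starts at `2` rather than `1`.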

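Which one is better depends on your needs: depending on your familiarity with R, one may be easier to read (and therefore maintain) than the other. With this data, the non-for-loop version is roughly 60 times faster than the loop (about 6670 versus 109 iterations per second). Admittedly, benchmarking code that takes microseconds is a silly (inefficient) way to pass the time, but if your data is much larger, this may be an advantage. In the output below, r2 appears to be the non-loop version and ThomasIsCoding the for loop from the other answer.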

  expression          min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result memory      time     gc      
  <bch:expr>     <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list> <list>      <list>   <list>  
1 r2              116.3us  125.8us     6670.   22.26KB     4.44  3003     2      450ms <NULL> <Rprofmem[~ <bch:tm~ <tibble~
2 ThomasIsCoding   8.46ms   8.95ms      109.    1.02MB     2.09    52     1      478ms <NULL> <Rprofmem[~ <bch:tm~ <tibble~
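
The benchmark call itself is not shown above. A comparison along these lines could be reproduced roughly as follows; this is a sketch that assumes the bench package is installed and that the two labels map to the two answers as the timings suggest. The function bodies are reconstructed from the answers, not taken from the original benchmark.

```r
library(data.table)
library(bench)

all_factors <- c("f1", "f2", "f3", "f4", "f5", "f6")
factor_perms <- do.call(CJ, replicate(length(all_factors), c(TRUE, FALSE), FALSE))

# Assumed to correspond to the non-loop answer
r2 <- function() {
  wh <- which(as.matrix(factor_perms), arr.ind = TRUE)
  split(all_factors[wh[, 2]], wh[, 1])
}

# Assumed to correspond to the for-loop answer
ThomasIsCoding <- function() {
  factors <- vector(mode = "list", nrow(factor_perms) - 1)
  for (j in 2:nrow(factor_perms)) {
    factors[[j - 1]] <- all_factors[unlist(factor_perms[j, ])]
  }
  factors
}

# check = FALSE because one result is a named list and the other is not
res <- bench::mark(r2 = r2(), ThomasIsCoding = ThomasIsCoding(), check = FALSE)
res
```

Absolute timings will differ by machine and package version, so only the relative difference is meaningful.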

r

subset

data.table