Computing Mahalanobis distance by group using data.table (part II)

This is a follow-up to a question I asked earlier. My sample data and code are:

library(data.table)
library(StatMatch)
as.data.table(mtcars)[, tryCatch(mahalanobis.dist(mpg[vs == 0], mpg[vs == 1]),
                                 error = function(e) as.numeric(NA)), keyby = carb]

    carb        V1
 1:    2 1.0416378
 2:    2 1.6264169
 3:    2 1.6812399
 4:    2 0.9502661
 5:    2 0.2923896
 6:    2 0.7492482
 7:    2 1.3340273
 8:    2 1.3888504
 9:    2 0.6578765
10:    2 0.5847791

... (remaining rows omitted)
    carb        V1

The code above returns all the values in a single column. However, I would like the output in the format below, if possible.
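For context, here is a minimal base R sketch of why a matrix ends up as one long column once its dimensions are dropped, and how it can be rebuilt. The 2 x 3 matrix `m` is made up for illustration and merely stands in for one group's distance matrix.

```r
# A small 2 x 3 matrix standing in for one group's distance matrix
m <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2)

# Dropping the dimensions walks down the columns (column-major order)
flat <- as.vector(m)

# Flattening the transpose instead walks along the rows of m
flat_rows <- as.vector(t(m))

# Either vector can be turned back into the original matrix,
# provided the number of rows or columns is known
rebuilt_a <- matrix(flat, nrow = 2)
rebuilt_b <- matrix(flat_rows, ncol = 3, byrow = TRUE)
```

The catch in the grouped case is that each group has its own number of rows and columns, so the shape has to be fixed or recorded before the values are flattened.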

How can I convert the output table to the following format?

     +-----------------------------------------------------------------+
     | carb          x1          x2         x3          x4          x5 |
     |-----------------------------------------------------------------|
  1. |    2   1.0416378    1.626417    1.68124   0.9502661   0.2923896 |
  2. |    2   0.7492482    1.334027    1.38885   0.6578765   0.5847791 |
  3. |    2   2.1380986    2.722878   2.777701   2.0467269   0.8040713 |
  4. |    2   2.1380986    2.722878   2.777701   2.0467269   0.8040713 |
  5. |    2   0.4934074    1.078186    1.13301   0.4020356     0.84062 |
     |-----------------------------------------------------------------|
  6. |    3          NA          NA         NA          NA          NA |
  7. |    4   0.4602308   0.8181881         NA          NA          NA |
  8. |    4   0.4602308   0.8181881         NA          NA          NA |
  9. |    4   1.2528505   0.8948932         NA          NA          NA |
 10. |    4   2.2500173     1.89206         NA          NA          NA |
     |-----------------------------------------------------------------|
 11. |    4   2.2500173     1.89206         NA          NA          NA |
 12. |    4    1.150577   0.7926197         NA          NA          NA |
 13. |    4   1.5085343    1.150577         NA          NA          NA |
 14. |    4   0.8693248   0.5113676         NA          NA          NA |
 15. |    6          NA          NA         NA          NA          NA |
     |-----------------------------------------------------------------|
 16. |    8          NA          NA         NA          NA          NA |
     +-----------------------------------------------------------------+

Explanation: for carb 2, the Mahalanobis distances are as follows:

           1        2        3         4         5
1 1.0416378 1.626417 1.681240 0.9502661 0.2923896
2 0.7492482 1.334027 1.388850 0.6578765 0.5847791
3 2.1380986 2.722878 2.777701 2.0467269 0.8040713
4 2.1380986 2.722878 2.777701 2.0467269 0.8040713
5 0.4934074 1.078186 1.133010 0.4020356 0.8406200

For carb 4: 
          1         2
1 0.4602308 0.8181881
2 0.4602308 0.8181881
3 1.2528505 0.8948932
4 2.2500173 1.8920600
5 2.2500173 1.8920600
6 1.1505770 0.7926197
7 1.5085343 1.1505770
8 0.8693248 0.5113676

For carb 3, carb 6 and carb 8, the Mahalanobis distance cannot be computed, so all columns are NA.
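A quick way to see which groups can produce distances at all is to count the cars on each side of `vs` within every `carb` group. This is only a sketch; the column names `n_vs0`, `n_vs1` and `both_present` are made up for illustration.

```r
library(data.table)

# Count the vs == 0 and vs == 1 cars within each carb group
counts <- as.data.table(mtcars)[, .(n_vs0 = sum(vs == 0),
                                    n_vs1 = sum(vs == 1)), keyby = carb]

# A distance matrix needs at least one car on each side
counts[, both_present := n_vs0 > 0 & n_vs1 > 0]

print(counts)
```

In `mtcars` only carb 2 and carb 4 have cars on both sides. Carb 1 has no `vs == 0` cars, so a grouped call that includes it will also produce an NA row for that group.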

I can use lapply together with rbindlist, as follows:

rbindlist(lapply(unique(mtcars$carb), function(i) with(mtcars,
  data.frame(tryCatch(mahalanobis.dist(mpg[vs == 0 & carb == i],
                                       mpg[vs == 1 & carb == i]),
                      error = function(e) as.numeric(NA))))),
  fill = TRUE)[, -c(6, 7, 8), with = FALSE]
           X1        X2        X3        X4        X5
 1: 1.0416378 0.7492482 2.1380986 2.1380986 0.4934074
 2: 1.6264169 1.3340273 2.7228777 2.7228777 1.0781865
 3: 1.6812399 1.3888504 2.7777008 2.7777008 1.1330095
 4: 0.9502661 0.6578765 2.0467269 2.0467269 0.4020356
 5: 0.2923896 0.5847791 0.8040713 0.8040713 0.8406200
 6:        NA        NA        NA        NA        NA
 7: 0.4602308 0.8181881        NA        NA        NA
 8: 0.4602308 0.8181881        NA        NA        NA
 9: 1.2528505 0.8948932        NA        NA        NA
10: 2.2500173 1.8920600        NA        NA        NA
11: 2.2500173 1.8920600        NA        NA        NA
12: 1.1505770 0.7926197        NA        NA        NA
13: 1.5085343 1.1505770        NA        NA        NA
14: 0.8693248 0.5113676        NA        NA        NA
15:        NA        NA        NA        NA        NA
16:        NA        NA        NA        NA        NA

I am looking for a solution that does not use lapply.

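You can make the tryCatch block always return a value of the right size, and then rebuild the matrix afterwards. Note that the result starts with an extra row of NA, which comes from carb = 1.
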
res <- as.data.table(mtcars)[,tryCatch({
    mat <- mahalanobis.dist(mpg[vs == 0], mpg[vs == 1])
    t(cbind(mat, matrix(NA, nrow=nrow(mat), ncol=5-ncol(mat))))  # add in NA values to fill out columns
   }, error=function(e) rep(as.numeric(NA), 5)), keyby = carb]   # return 5-vector on error

matrix(res[[2]], ncol = 5, byrow = TRUE)                         # rebuild matrix
#            [,1]      [,2]      [,3]      [,4]      [,5]
#  [1,]        NA        NA        NA        NA        NA
#  [2,] 1.0416378 0.7492482 2.1380986 2.1380986 0.4934074
#  [3,] 1.6264169 1.3340273 2.7228777 2.7228777 1.0781865
#  [4,] 1.6812399 1.3888504 2.7777008 2.7777008 1.1330095
#  [5,] 0.9502661 0.6578765 2.0467269 2.0467269 0.4020356
#  [6,] 0.2923896 0.5847791 0.8040713 0.8040713 0.8406200
#  [7,]        NA        NA        NA        NA        NA
#  [8,] 0.4602308 0.8181881        NA        NA        NA
#  [9,] 0.4602308 0.8181881        NA        NA        NA
# [10,] 1.2528505 0.8948932        NA        NA        NA
# [11,] 2.2500173 1.8920600        NA        NA        NA
# [12,] 2.2500173 1.8920600        NA        NA        NA
# [13,] 1.1505770 0.7926197        NA        NA        NA
# [14,] 1.5085343 1.1505770        NA        NA        NA
# [15,] 0.8693248 0.5113676        NA        NA        NA
# [16,]        NA        NA        NA        NA        NA
# [17,]        NA        NA        NA        NA        NA
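
As a variation on the same idea, the sketch below avoids the hard-coded 5 and keeps the carb column by returning one column per distance straight from j. It rests on some assumptions: `mdist1d` is a hypothetical stand-in for `mahalanobis.dist` on a single variable (absolute difference divided by the standard deviation of the two samples combined), used here so the example runs without StatMatch, and `k` is a made-up name for the widest matrix among the groups. Swapping `mahalanobis.dist` back in should work the same way, though that is untested here.

```r
library(data.table)

# Hypothetical stand-in for StatMatch::mahalanobis.dist() on one variable:
# absolute difference scaled by the standard deviation of x and y combined.
mdist1d <- function(x, y) {
  if (length(x) == 0L || length(y) == 0L) stop("empty group")
  abs(outer(x, y, "-")) / sd(c(x, y))
}

dt <- as.data.table(mtcars)

# Widest distance matrix among the groups where both vs levels occur
k <- dt[, if (any(vs == 0) && any(vs == 1)) sum(vs == 1), by = carb][, max(V1)]

res <- dt[, {
  # On error, fall back to a 1 x 0 matrix so the padding yields one NA row
  mat <- tryCatch(mdist1d(mpg[vs == 0], mpg[vs == 1]),
                  error = function(e) matrix(NA_real_, nrow = 1L, ncol = 0L))
  # Pad with NA columns up to k, then return one column per distance
  as.data.table(cbind(mat, matrix(NA_real_, nrow(mat), k - ncol(mat))))
}, keyby = carb]
```

Here `res` has the columns carb and V1 to V5, with one row per row of each group's distance matrix and a single NA row for every group where the distance cannot be computed.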