Pivot all columns wider (except ID columns) using pivot_wider() in R/tidyverse

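I am trying to reshape a data frame from long to wide in R. I want to use pivot_wider() to widen all columns except those that uniquely identify observations. Here is a minimal working example: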

library("tidyr")

set.seed(12345)

sampleSize <- 10
timepoints <- 3
raters <- 2

data_long <- data.frame(ID = rep(1:sampleSize, each = timepoints * raters),
                        time = rep(1:timepoints, each = raters, times = sampleSize),
                        rater = rep(c("a","b"), times = sampleSize * timepoints),
                        v1 = sample.int(99, sampleSize * timepoints * raters, replace = TRUE),
                        v2 = sample.int(99, sampleSize * timepoints * raters, replace = TRUE),
                        v3 = sample.int(99, sampleSize * timepoints * raters, replace = TRUE),
                        v100 = sample.int(99, sampleSize * timepoints * raters, replace = TRUE),
                        vA = sample.int(99, sampleSize * timepoints * raters, replace = TRUE),
                        vB = sample.int(99, sampleSize * timepoints * raters, replace = TRUE),
                        vC = sample.int(99, sampleSize * timepoints * raters, replace = TRUE),
                        vZZ = sample.int(99, sampleSize * timepoints * raters, replace = TRUE))

The data look like this:

> tibble(data_long)
# A tibble: 60 x 11
      ID  time rater    v1    v2    v3  v100    vA    vB    vC   vZZ
   <int> <int> <chr> <int> <int> <int> <int> <int> <int> <int> <int>
 1     1     1 a        14    56    30    75    66    22     8    73
 2     1     1 b        90    44    99     8    36    72     1    78
 3     1     2 a        92    35    93    46     4    68    39    52
 4     1     2 b        51    91    50    67    43    72    99    74
 5     1     3 a        80    34    31    31    21    52     7    23
 6     1     3 b        24    86    25    86    20    43    74    89
 7     2     1 a        58    51    48    60     6    56    66    37
 8     2     1 b        96    95    76     1    78     2    65     3
 9     2     2 a        88    26    92    86     7    37    84    15
10     2     2 b        93    55    25    62    27    39    73    85
# ... with 50 more rows

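In this example, three columns uniquely identify each observation: ID, time, and rater. I want to widen every other column by rater (that is, every column except ID and time). My expected output is: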

# A tibble: 30 x 18
      ID  time  v1_a  v1_b  v2_a  v2_b  v3_a  v3_b v100_a v100_b  vA_a  vA_b  vB_a  vB_b  vC_a  vC_b vZZ_a vZZ_b
   <int> <int> <int> <int> <int> <int> <int> <int>  <int>  <int> <int> <int> <int> <int> <int> <int> <int> <int>
 1     1     1    14    90    56    44    30    99     75      8    66    36    22    72     8     1    73    78
 2     1     2    92    51    35    91    93    50     46     67     4    43    68    72    39    99    52    74
 3     1     3    80    24    34    86    31    25     31     86    21    20    52    43     7    74    23    89
 4     2     1    58    96    51    95    48    76     60      1     6    78    56     2    66    65    37     3
 5     2     2    88    93    26    55    92    25     86     62     7    27    37    39    84    73    15    85
 6     2     3    75     2    23    55    28     8     66     74    65    92    58    10    91    65     7    44
 7     3     1    86    94     7    87    78    85     38     87    36    49    89    83    33    34    32    38
 8     3     2    10    75    12    15    21    18     56     77    54    17    61    92    18    50    98    27
 9     3     3    38    81    46    90    20    47     88     15    33    95    66    19    12    27    84    52
10     4     1    32    38    88    68    77    71     10     81    21    54    33    16    90    41    29    72
# ... with 20 more rows

I can widen any given set of columns with the following syntax:

data_long %>% 
  pivot_wider(names_from = rater, values_from = c(v1, v2))

So I can widen all columns by typing every one of them into the vector by hand:

data_long %>% 
  pivot_wider(names_from = rater, values_from = c(v1, v2, v3, v100, vA, vB, vC, vZZ))

However, this becomes unwieldy when there are many columns. Another option is to widen the columns by specifying a range:

data_long %>% 
  pivot_wider(names_from = rater, values_from = v1:vZZ)

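However, this does not work well when the columns to widen are not contiguous, for example when the ID columns are scattered throughout the data frame (although multiple ranges can be specified).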
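For illustration, the multiple-ranges workaround could look roughly like the sketch below. The toy data frame and its column layout are hypothetical, chosen only so that an ID column (time) sits between the value columns; the ranges are combined with c() inside values_from.

```r
library(tidyr)

# Hypothetical toy data: `time` and `rater` sit between the value columns
toy <- data.frame(ID    = c(1, 1, 2, 2),
                  v1    = 1:4,
                  v2    = 5:8,
                  time  = c(1, 1, 1, 1),
                  rater = c("a", "b", "a", "b"),
                  vA    = 9:12,
                  vB    = 13:16)

# Combine two column ranges with c(); ID and time are left as identifiers
toy_wide <- pivot_wider(toy,
                        names_from  = rater,
                        values_from = c(v1:v2, vA:vB))

names(toy_wide)
```

This works, but every additional block of ID columns needs another range, which is what I would like to avoid.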

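Is there a way to use pivot_wider() to widen ALL columns except those I specify with id_cols (i.e. ID and time) as uniquely identifying each observation? I would like a solution that scales to many columns, so I do not want to list the variable names or the ranges of variables to widen.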

Since we know the first 3 columns should stay fixed, use - on those column names in values_from:

library(dplyr)
library(tidyr)
data_long %>% 
   pivot_wider(names_from = rater, values_from = -names(.)[1:3])

Or, if we have already created an object holding the ID columns:

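id_cols <- c("ID", "time")
data_long %>%
    # exclude the ID columns and the names_from column, so that rater itself is not widened as a value
    pivot_wider(names_from = rater, values_from = -all_of(c(id_cols, "rater")))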
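As a quick sanity check, here is a minimal sketch on a small, hypothetical data frame in which the ID columns are deliberately scattered among the value columns. The names toy, id_vars, and toy_wide are made up for illustration. The sketch assumes that any column not excluded from values_from, including the names_from column, would be treated as a value column, so rater is excluded alongside the ID columns.

```r
library(tidyr)

# Hypothetical toy data: ID columns are not contiguous
toy <- data.frame(v1    = c(14, 90, 92, 51),
                  ID    = c(1, 1, 1, 1),
                  rater = c("a", "b", "a", "b"),
                  v2    = c(56, 44, 35, 91),
                  time  = c(1, 1, 2, 2))

# Columns that identify an observation once rater has been spread out
id_vars <- c("ID", "time")

# Widen everything except the ID columns and the names_from column
toy_wide <- pivot_wider(toy,
                        names_from  = rater,
                        values_from = -all_of(c(id_vars, "rater")))

names(toy_wide)
```

Because the selection is by name rather than by position, it should not matter where the ID columns sit in the data frame.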