Pivot from long to wide using dplyr pivot_wider?

I have repeated glucose measurements in long format, as shown below:

mydata <- 
  structure(list(
    ID = c(4, 12, 24, 24, 24, 24, 24, 43, 50, 51, 52, 61, 67, 81, 82, 83, 88, 93, 93, 94, 100, 103, 105, 106, 107, 115, 117, 130, 130, 130, 130, 130, 130, 132, 136, 157, 173, 180, 194, 196, 230, 244, 245, 269, 288, 304, 316, 318, 334, 338, 338, 367, 378, 380), 
    date = structure(c(15330, 15476, 17641, 17664, 17664, 17670, 17673, 18696, 18194, 16036, 16428, 16210, 16211, 17667, 16329, 17961, 18535, 16834, 18088, 18571, 16449, 18213, 18003, 17976, 16862, 17842, 18019, 17339, 18513, 18629, 18699, 18700, 18700, 18423, 17184, 17487, 16736, 18780, 16876, 16895, 17163, 17443, 18291, 18493, 18213, 17947, 18452, 17919, 18129, 18152, 18794, 18507, 18640, 18654), 
                     class = "Date"), 
    name = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), 
                     .Label = "gluc", 
                     class = "factor"), 
    value = c(5.6, 5.5, 6.5, 7.6, 7.7, 7.8, 7.4, 4.3, 4.7, 5.1, 4.3, 5.2, 5.1, 5.8, 10, 5.2, 8.7, 4.5, 6.1, 4.6, 6, 5.8, 5.9, 5.5, 5.3, 5.9, 10.1, 6.4, 21.2, 5.1, 5.9, 7.4, NA, 8, 9.5, 4.6, 7, 8.1, 5.5, 7, 5, 6.2, 4.9, 4.8, 8.3, 6, 5.5, 6.8, 6.1, 4.8, 6.3, 5.7, 6.2, 13.7)), 
    row.names = c(NA, -54L), 
    class = c("tbl_df", "tbl", "data.frame"))


head(mydata)

# A tibble: 6 x 4
     ID date       name  value
  <dbl> <date>     <fct> <dbl>
1     4 2011-12-22 gluc    5.6
2    12 2012-05-16 gluc    5.5
3    24 2018-04-20 gluc    6.5
4    24 2018-05-13 gluc    7.6
5    24 2018-05-13 gluc    7.7
6    24 2018-05-19 gluc    7.8

I am trying to convert it to wide format. I have tried:

# First try
lab_gluc_wide <- 
  pivot_wider(
    data=mydata, 
    names_from=name, 
    values_from=value, 
    id_cols=c(ID, date))

# Second try
lab_gluc_wide <- 
  pivot_wider(
    data=mydata, 
    names_from=name, 
    values_from=c(value, date), 
    id_cols=ID)

But both produce warning messages:

1: Values are not uniquely identified; output will contain list-cols.
* Use `values_fn = list` to suppress this warning.
* Use `values_fn = length` to identify where the duplicates arise
* Use `values_fn = {summary_fun}` to summarise duplicates 
2: Values are not uniquely identified; output will contain list-cols.
* Use `values_fn = list` to suppress this warning.
* Use `values_fn = length` to identify where the duplicates arise
* Use `values_fn = {summary_fun}` to summarise duplicates 

What I am looking for is one row per patient, with multiple columns for each glucose measurement/date.

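The problem is that neither ID alone nor ID plus date uniquely identifies a row: patient 24, for example, has two measurements on 2018-05-13. pivot_wider() therefore finds several values competing for the single gluc cell and falls back to list-columns. If you reshape the data to wide format, you need a per-patient counter to tell the measurements apart, and you must either reshape the date column as well or drop it. In my example I drop the date column.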
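Before reshaping, it can help to confirm where the duplicates sit. This is only a minimal sketch: it uses a few rows copied from the question's data (patients 24 and 93), and the object names sample_rows and dupes are made up for illustration.

```r
library(dplyr)

# Rows for patients 24 and 93, copied from the question's data.
# Dates are given as the day counts stored in the original dput() output.
sample_rows <- tibble(
  ID    = c(24, 24, 24, 24, 24, 93, 93),
  date  = as.Date(c(17641, 17664, 17664, 17670, 17673, 16834, 18088),
                  origin = "1970-01-01"),
  name  = "gluc",
  value = c(6.5, 7.6, 7.7, 7.8, 7.4, 4.5, 6.1)
)

# Count rows per ID/date pair and keep only the pairs that occur more than once
dupes <- sample_rows %>%
  count(ID, date) %>%
  filter(n > 1)

dupes
```

On these rows the only repeated ID/date pair is patient 24 on 2018-05-13, which is why id_cols = c(ID, date) still triggers the warning.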

library(tidyverse)

mydata %>%
  group_by(ID) %>%
  mutate(ID_ID = row_number()) %>%  # 1, 2, 3, ... within each patient
  ungroup() %>%
  pivot_wider(id_cols = ID,
              names_from = c(name, ID_ID),  # builds gluc_1, gluc_2, ...
              values_from = value)

This gives:

# A tibble: 43 x 7
      ID gluc_1 gluc_2 gluc_3 gluc_4 gluc_5 gluc_6
   <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
 1     4    5.6   NA     NA     NA     NA       NA
 2    12    5.5   NA     NA     NA     NA       NA
 3    24    6.5    7.6    7.7    7.8    7.4     NA
 4    43    4.3   NA     NA     NA     NA       NA
 5    50    4.7   NA     NA     NA     NA       NA
 6    51    5.1   NA     NA     NA     NA       NA
 7    52    4.3   NA     NA     NA     NA       NA
 8    61    5.2   NA     NA     NA     NA       NA
 9    67    5.1   NA     NA     NA     NA       NA
10    81    5.8   NA     NA     NA     NA       NA
# ... with 33 more rows
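
If the dates should be kept rather than dropped, the same counter can be used to spread both columns at once. The following is a sketch under assumptions, not the only way to do it: it runs on a few rows copied from the question's data, and the names sample_rows, visit and wide_with_dates are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Rows for patients 24 and 93, copied from the question's data.
# Dates are given as the day counts stored in the original dput() output.
sample_rows <- tibble(
  ID    = c(24, 24, 24, 24, 24, 93, 93),
  date  = as.Date(c(17641, 17664, 17664, 17670, 17673, 16834, 18088),
                  origin = "1970-01-01"),
  name  = "gluc",
  value = c(6.5, 7.6, 7.7, 7.8, 7.4, 4.5, 6.1)
)

wide_with_dates <- sample_rows %>%
  arrange(ID, date) %>%              # number measurements in date order
  group_by(ID) %>%
  mutate(visit = row_number()) %>%   # hypothetical per-patient counter
  ungroup() %>%
  pivot_wider(id_cols = ID,
              names_from = c(name, visit),
              values_from = c(value, date))  # spread both columns

wide_with_dates
```

With more than one values_from column, pivot_wider() prefixes each new column with the source column name, so the result has value_gluc_1, value_gluc_2, ... followed by date_gluc_1, date_gluc_2, ..., one row per patient, with NA where a patient has fewer measurements.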