How to turn a tibble with a comma-delimited column into tidy form

I have the following tibble:


df <- tibble::tribble(
  ~Sample_name, ~CRT,      ~SR,      ~`Bcells,DendriticCells,Macrophage`,
  "S1",          0.079,  0.592,      "0.077,0.483,0.555",
  "S2",          0.082,  0.549,      "0.075,0.268,0.120"
)

df
#> # A tibble: 2 × 4
#>   Sample_name   CRT    SR `Bcells,DendriticCells,Macrophage`
#>         <chr> <dbl> <dbl>                              <chr>
#> 1          S1 0.079 0.592                  0.077,0.483,0.555
#> 2          S2 0.082 0.549                  0.075,0.268,0.120

Note that the fourth column is comma-delimited, in both its name and its values. How can I convert df into this tidy form:

Sample_name CRT   SR       Score     Celltype
S1          0.079 0.592    0.077     Bcells 
S1          0.079 0.592    0.483     DendriticCells
S1          0.079 0.592    0.555     Macrophage
S2          0.082 0.549    0.075     Bcells
S2          0.082 0.549    0.268     DendriticCells
S2          0.082 0.549    0.120     Macrophage

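We can use separate from tidyr to split the column on the commas, then gather to reshape the result into long form. Note that score comes back as character here; passing convert = TRUE to separate would make it numeric:

library(tidyr)

df %>%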
    separate(col = `Bcells,DendriticCells,Macrophage`,
             into = strsplit('Bcells,DendriticCells,Macrophage', ',')[[1]],
             sep = ',') %>%
    gather(Celltype, score, Bcells:Macrophage)
# # A tibble: 6 × 5
#   Sample_name   CRT    SR       Celltype score
# <chr> <dbl> <dbl>          <chr> <chr>
# 1          S1 0.079 0.592         Bcells 0.077
# 2          S2 0.082 0.549         Bcells 0.075
# 3          S1 0.079 0.592 DendriticCells 0.483
# 4          S2 0.082 0.549 DendriticCells 0.268
# 5          S1 0.079 0.592     Macrophage 0.555
# 6          S2 0.082 0.549     Macrophage 0.120

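Without hard-coding the column names (separate_ and gather_ are the standard-evaluation variants, which are now deprecated in tidyr):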

cn <- colnames(df)[ncol(df)]
df %>%
    separate_(col = cn, into = strsplit(cn, ',')[[1]],  sep = ',') %>%
    gather_('Celltype', 'score', strsplit(cn, ',')[[1]])
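
Since the underscore-suffixed verbs above are deprecated, here is a rough sketch of the same idea with current tidyr verbs. It assumes tidyr 1.0.0 or later (for pivot_longer); the names wide, parts, packed, and tidy are made up for illustration, and the packed column is renamed first only so that separate can refer to it by a plain name:

```r
library(tidyr)

df <- tibble::tribble(
  ~Sample_name, ~CRT,  ~SR,   ~`Bcells,DendriticCells,Macrophage`,
  "S1",         0.079, 0.592, "0.077,0.483,0.555",
  "S2",         0.082, 0.549, "0.075,0.268,0.120"
)

# The header of the last column doubles as the list of new column names.
cn <- colnames(df)[ncol(df)]
parts <- strsplit(cn, ",", fixed = TRUE)[[1]]

# Work on a copy and give the packed column a plain temporary name
# ("packed" is an arbitrary, hypothetical name).
wide <- df
names(wide)[ncol(wide)] <- "packed"

tidy <- wide %>%
  # Split the values on commas; convert = TRUE turns the pieces into numbers.
  separate(packed, into = parts, sep = ",", convert = TRUE) %>%
  # Stack the new columns into Celltype/Score pairs, one row per pair.
  pivot_longer(cols = all_of(parts), names_to = "Celltype", values_to = "Score")

tidy
```

Unlike gather, pivot_longer keeps the rows of each sample together, so the result should come out in the order shown in the question (S1 rows first, then S2), with Score as a numeric column.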