How to turn a tibble with a comma-delimited column into tidy form
I have the following tibble:
df <- tibble::tribble(
  ~Sample_name, ~CRT, ~SR, ~`Bcells,DendriticCells,Macrophage`,
  "S1", 0.079, 0.592, "0.077,0.483,0.555",
  "S2", 0.082, 0.549, "0.075,0.268,0.120"
)
df
#> # A tibble: 2 × 4
#>   Sample_name   CRT    SR `Bcells,DendriticCells,Macrophage`
#>   <chr>       <dbl> <dbl> <chr>
#> 1 S1          0.079 0.592 0.077,0.483,0.555
#> 2 S2          0.082 0.549 0.075,0.268,0.120
Note that the last column is comma-delimited, in both its name and its values. How can I convert df into this tidy form:
Sample_name   CRT    SR Score Celltype
S1          0.079 0.592 0.077 Bcells
S1          0.079 0.592 0.483 DendriticCells
S1          0.079 0.592 0.555 Macrophage
S2          0.082 0.549 0.075 Bcells
S2          0.082 0.549 0.268 DendriticCells
S2          0.082 0.549 0.120 Macrophage
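We can use separate() to split the column into one column per cell type, then gather() to reshape the result into long form: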
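library(tidyr)  # provides separate(), gather() and the %>% pipe

df %>%
  # split the comma-delimited column into one column per cell type
  separate(col = `Bcells,DendriticCells,Macrophage`,
           into = strsplit('Bcells,DendriticCells,Macrophage', ',')[[1]],
           sep = ',') %>%
  # stack the cell-type columns into Celltype/score pairs
  gather(Celltype, score, Bcells:Macrophage)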
# # A tibble: 6 × 5
#   Sample_name   CRT    SR Celltype       score
#   <chr>       <dbl> <dbl> <chr>          <chr>
# 1 S1          0.079 0.592 Bcells         0.077
# 2 S2          0.082 0.549 Bcells         0.075
# 3 S1          0.079 0.592 DendriticCells 0.483
# 4 S2          0.082 0.549 DendriticCells 0.268
# 5 S1          0.079 0.592 Macrophage     0.555
# 6 S2          0.082 0.549 Macrophage     0.120
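Note that score comes out as <chr> here, because separate() leaves the split pieces as strings. If numeric scores are wanted, one option is separate()'s convert = TRUE argument, which runs type conversion on the new columns. A minimal sketch, with the cell-type names typed out for readability (tidy_df is just an illustrative name):

```r
library(tidyr)

df <- tibble::tribble(
  ~Sample_name, ~CRT, ~SR, ~`Bcells,DendriticCells,Macrophage`,
  "S1", 0.079, 0.592, "0.077,0.483,0.555",
  "S2", 0.082, 0.549, "0.075,0.268,0.120"
)

tidy_df <- df %>%
  # convert = TRUE turns the split strings into numbers where possible
  separate(col = `Bcells,DendriticCells,Macrophage`,
           into = c("Bcells", "DendriticCells", "Macrophage"),
           sep = ",", convert = TRUE) %>%
  # stack the (now numeric) cell-type columns into Celltype/score pairs
  gather(Celltype, score, Bcells:Macrophage)
```

The rows and columns are the same as in the output above; only the type of score should change from character to double.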
Without hard-coding the column names:
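# the comma-delimited column is assumed to be the last one
cn <- colnames(df)[ncol(df)]
df %>%
  # separate_() and gather_() are the standard-evaluation variants (deprecated in current tidyr)
  separate_(col = cn, into = strsplit(cn, ',')[[1]], sep = ',') %>%
  gather_('Celltype', 'score', strsplit(cn, ',')[[1]])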
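The underscore variants are deprecated in current tidyr releases, so here is a sketch of the same non-hard-coded idea with newer verbs. It assumes tidyr >= 1.3.0 (for separate_wider_delim()) and that the comma-delimited column is the last one; cells and tidy_df are just illustrative names:

```r
library(tidyr)

df <- tibble::tribble(
  ~Sample_name, ~CRT, ~SR, ~`Bcells,DendriticCells,Macrophage`,
  "S1", 0.079, 0.592, "0.077,0.483,0.555",
  "S2", 0.082, 0.549, "0.075,0.268,0.120"
)

# name of the comma-delimited column (assumed to be the last one)
cn <- colnames(df)[ncol(df)]
# cell-type names recovered from that column name
cells <- strsplit(cn, ",", fixed = TRUE)[[1]]

tidy_df <- df %>%
  # split the values into one column per cell type
  separate_wider_delim(all_of(cn), delim = ",", names = cells) %>%
  # reshape to long form and convert the scores to numeric on the way
  pivot_longer(all_of(cells), names_to = "Celltype", values_to = "Score",
               values_transform = list(Score = as.numeric))
```

pivot_longer() keeps the rows grouped by sample (S1 first, then S2), which matches the row order asked for in the question. The column order is Sample_name, CRT, SR, Celltype, Score, so Celltype and Score would still need swapping if the exact layout matters.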