Pivot longer with multiple data points in a single column

I have a data frame in which a single column holds a varying number of data points:

library(tidyverse)


df <- tribble(~id,  ~data,
              "A", "a;b;c",
              "B",   "e;f")

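I would like one row per data point, splitting the contents of the data column and spreading them across rows. This code gives the expected result, but it is clumsy: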

df %>%
  separate(data,
           into = paste0("dat_",1:5),
           sep = ";",
           fill = "right") %>%
  pivot_longer(starts_with("dat_"),
               names_to = "data_number",
               names_pattern = "dat_(\\d+)") %>%
  filter(!is.na(value))

#> # A tibble: 5 x 3
#>   id    data_number value
#>   <chr> <chr>       <chr>
#> 1 A     1           a    
#> 2 A     2           b    
#> 3 A     3           c    
#> 4 B     1           e    
#> 5 B     2           f

A tidyverse solution is preferred.

Here is one approach, using rowid() from data.table to number the values within each id:

library(dplyr)
library(tidyr)
library(data.table)  # provides rowid()

df %>% 
      separate_rows(data, sep = ";") %>%            # one row per ";"-separated value
      mutate(data_number = rowid(id), .before = 2)  # running index within each id

Output:

# A tibble: 5 x 3
  id    data_number data 
  <chr>       <int> <chr>
1 A               1 a    
2 A               2 b    
3 A               3 c    
4 B               1 e    
5 B               2 f    

Alternatively, if the index column is not needed, separate_rows() alone is enough:

library(dplyr)
library(tidyr)
df %>% 
    separate_rows(data)

Output:

# A tibble: 5 x 2
  id    data 
  <chr> <chr>
1 A     a    
2 A     b    
3 A     c    
4 B     e    
5 B     f  
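
If the index is wanted without loading data.table, a minimal sketch using only dplyr and tidyr could number the rows within each id via group_by() and row_number(). The name result is only for illustration, and the sketch assumes each id appears on a single row of the input; otherwise the numbering continues across those rows.

```r
library(dplyr)
library(tidyr)

df <- tribble(~id,  ~data,
              "A", "a;b;c",
              "B",   "e;f")

result <- df %>%
  separate_rows(data, sep = ";") %>%                       # one row per ";"-separated value
  group_by(id) %>%
  mutate(data_number = row_number(), .before = data) %>%   # position within each id
  ungroup()

result
```

This should yield the same three columns (id, data_number, data) as the rowid() version.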

Using str_split and unnest:

library(tidyverse)

df %>%
  mutate(data = str_split(data, ';'), 
         data_number = map(data, seq_along)) %>%
  unnest(c(data, data_number))

#  id    data  data_number
#  <chr> <chr>       <int>
#1 A     a               1
#2 A     b               2
#3 A     c               3
#4 B     e               1
#5 B     f               2
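
In recent tidyr releases separate_rows() is marked as superseded in favour of separate_longer_delim(). A rough sketch of the same idea with the newer function follows; it assumes tidyr >= 1.3.0 and dplyr >= 1.1.0 (for the .by argument of mutate()), and result_delim is just an illustrative name.

```r
library(dplyr)   # assumes dplyr >= 1.1.0 for mutate(.by = )
library(tidyr)   # assumes tidyr >= 1.3.0 for separate_longer_delim()

df <- tribble(~id,  ~data,
              "A", "a;b;c",
              "B",   "e;f")

result_delim <- df %>%
  separate_longer_delim(data, delim = ";") %>%                 # split on a literal ";"
  mutate(data_number = row_number(), .by = id, .after = id)    # index within each id

result_delim
```

Unlike separate_rows(), the delim here is a fixed string rather than a regular expression, so no escaping is needed for characters such as "." or "|".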