Using pivot_longer with multiple column classes

Question

I have a dataset with the structure below (survey respondents were asked many questions), and I want to reshape it from wide to long:

library(tidyverse)
df_wide <-
  tribble(
    ~resp_id, ~question_1_info, ~question_1_answer, ~question_2_info, ~question_2_answer,
    1, "What is your eye color?", 1, "What is your hair color?", 2,
    2, "Are you over 6 ft tall?", 1, "", NA,
    3, "What is your hair color?", 0, "Are you under 40?", 1
  )

This is the output I want:

df_long <- 
  tribble(
    ~resp_id, ~question_number, ~question_text, ~question_answer,
    1, 1, "What is your eye color?", 1,
    1, 2, "What is your hair color?", 2,
    
    2, 1, "Are you over 6 ft tall?", 1,
    2, 2, "", NA,
    
    3, 1, "What is your hair color?", 0,
    3, 2, "Are you under 40?", 1
  )

I'm having trouble getting columns of different classes to work together. This is what I tried:

df_wide %>%
  pivot_longer(
    cols = !resp_id,
    names_to = c("question_number"),
    names_prefix = "question_",
    values_to = c("question_text", "question_answer")
  )

I can't work out the right combination of names_to, names_prefix, and values_to.

Answer 1

We can use names_pattern after rearranging the substrings in the column names, so that the question number comes last:

library(dplyr)
library(tidyr)
library(stringr)
df_wide %>%
  # rename the columns by moving the digits to the end of the name
  # "_(\\d+)(_.*)" - captures the digits (\\d+) after the _
  # and the rest of the characters (_.*)
  # replace with the backreferences of the captured groups, rearranged (\\2_\\1),
  # e.g. question_1_info becomes question_info_1
  # (backslashes must be doubled inside an R string)
  rename_with(~ str_replace(., "_(\\d+)(_.*)", "\\2_\\1"), -resp_id) %>%
  # ".value" turns the first captured group into the output column names
  pivot_longer(cols = -resp_id, names_to = c(".value", "question_number"),
               names_pattern = "(.*)_(\\d+$)")

Output:

# A tibble: 6 × 4
  resp_id question_number question_info              question_answer
    <dbl> <chr>           <chr>                                <dbl>
1       1 1               "What is your eye color?"                1
2       1 2               "What is your hair color?"               2
3       2 1               "Are you over 6 ft tall?"                1
4       2 2               ""                                      NA
5       3 1               "What is your hair color?"               0
6       3 2               "Are you under 40?"                      1
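
Note that in the output above, question_number is a character column and the text column is named question_info, whereas the desired output in the question has a numeric question_number and a column named question_text. A possible variation, offered as a sketch rather than as part of the original answer, skips the renaming step by letting names_pattern capture the number and the suffix directly, converts the number with names_transform (this assumes tidyr 1.1.0 or later), and renames the value columns afterwards:

```r
library(tibble)
library(tidyr)
library(dplyr)

# Same data as in the question
df_wide <- tribble(
  ~resp_id, ~question_1_info, ~question_1_answer, ~question_2_info, ~question_2_answer,
  1, "What is your eye color?", 1, "What is your hair color?", 2,
  2, "Are you over 6 ft tall?", 1, "", NA,
  3, "What is your hair color?", 0, "Are you under 40?", 1
)

df_long <- df_wide %>%
  pivot_longer(
    cols = -resp_id,
    # first capture group -> question_number; second -> output column names
    names_to = c("question_number", ".value"),
    names_pattern = "question_(\\d+)_(.*)",
    # assumption: tidyr >= 1.1.0, where names_transform is available
    names_transform = list(question_number = as.integer)
  ) %>%
  # the ".value" columns come out as "info" and "answer"; rename to the requested names
  rename(question_text = info, question_answer = answer)

df_long
```

Because ".value" keeps each suffix as its own column, the character question text and the numeric answers are never forced into a single values_to column, which is what the attempt in the question ran into.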
