Convert wide table into long table selecting columns with "starts_with"
I have a table that looks like this:
Z1 | R1 | Z2 | R2 | ... | Z100 | R100 |
---|---|---|---|---|---|---|
1246 | 1 | 2986 | 3 | ... | 3163 | 4 |
2734 | 5 | 1066 | 7 | ... | 2645 | 8 |
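It is a wide table, and I want to convert it into a long table like this:
Z | Time in ms | R | Reaction |
---|---|---|---|
Z1 | 1246 | R1 | 1 |
Z1 | 2734 | R1 | 5 |
Z2 | 2986 | R2 | 3 |
Z2 | 1066 | R2 | 7 |
... | ... | ... | ... |
Z100 | 3163 | R100 | 4 |
Z100 | 2645 | R100 | 8 |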
I tried this:
data_time_config_long <- data_time_config %>%
  gather(key = "Z", value = "Time in ms", select(data_time_config, starts_with('Z'))) %>%
  gather(key = "R", value = "Reaction", select(data_time_config, starts_with('R')))
I get this error:
Error: Must subset columns with a valid subscript vector. x Subscript has the wrong type `tbl_df< Z1 : double Z2 : double Z3 : double Z4 : double Z5 : double Z6 : double Z7 : double Z8 : double Z9 : double Z10 : double Z11 : double Z12 : double Z13 : double Z14 : double Z15 : double Z16 : double Z17 : double Z18 : double Z19 : double Z20 : double Z21 : double Z22 : double Z23 : double Z24 : double Z25 : double Z26 : double Z27 : double Z28 : double Z29 : double Z30 : double Z31 : double Z32 : double Z33 : double Z34 : double Z35 : double Z36 : double Z37 : double Z38 : double Z39 : double Z40 : double Z41 : double Z42 : double Z43 : double Z44 : double Z45 : double Z46 : double Z47 : double Z48 : double Z49 : double Z50 : double Z51 : double Z52 : double Z53 : double Z54 : double Z55 : double Z56 : double Z57 : double Z58 : double Z59 : double Z60
What am I doing wrong?
I don't think this problem can be solved directly with tidyr::gather(), which has been superseded; tidyr::pivot_longer() should be used instead. My approach looks like this:
library(tidyverse)

# dummy data
df <- data.frame(Z1 = c(1246, 2734), R1 = c(1, 5),
                 Z2 = c(2986, 1066), R2 = c(3, 7),
                 Z100 = c(3163, 2645), R100 = c(4, 8))

# intermediate data.frame
idf <- df %>%
  # add row numbers, as we need them to keep the order
  dplyr::mutate(rn = dplyr::row_number()) %>%
  # use pivot_longer() rather than gather()
  tidyr::pivot_longer(-rn, names_to = "colu", values_to = "vals") %>%
  # extract the number from the former column names, as we also need it to keep the order
  dplyr::mutate(nr = readr::parse_number(colu))

# keep the Z rows and join the R rows by row number and the numeric part of the column name
idf %>%
  dplyr::filter(stringr::str_detect(colu, "Z")) %>%
  dplyr::left_join(idf %>%
                     dplyr::filter(stringr::str_detect(colu, "R")),
                   by = c("rn", "nr")) %>%
  # order to get the exact output you are looking for
  dplyr::arrange(nr) %>%
  # select and rename to get the exact output you are looking for
  dplyr::select(Z = colu.x, `Time in ms` = vals.x, R = colu.y, Reaction = vals.y)
# A tibble: 6 x 4
  Z     `Time in ms` R     Reaction
  <chr>        <dbl> <chr>    <dbl>
1 Z1            1246 R1           1
2 Z1            2734 R1           5
3 Z2            2986 R2           3
4 Z2            1066 R2           7
5 Z100          3163 R100         4
6 Z100          2645 R100         8
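As for the error itself: it appears to come from handing gather() a whole data frame (the result of select()) where it expects a column selection. Below is a minimal sketch, assuming the same dummy df as above, that passes the selection helper directly. The names z_long and both_long are only illustrative. It also shows why two chained reshapes do not give the paired result: the second one crosses every Z row with every R column.

```r
library(tidyr)

# same dummy data as above
df <- data.frame(Z1 = c(1246, 2734), R1 = c(1, 5),
                 Z2 = c(2986, 1066), R2 = c(3, 7),
                 Z100 = c(3163, 2645), R100 = c(4, 8))

# pass the tidyselect helper itself, not a data frame produced by select()
z_long <- pivot_longer(df, cols = starts_with("Z"),
                       names_to = "Z", values_to = "Time in ms")

# chaining a second reshape runs without error, but it pairs each Z value
# with all three R columns instead of only the matching one
both_long <- pivot_longer(z_long, cols = starts_with("R"),
                          names_to = "R", values_to = "Reaction")

nrow(z_long)
nrow(both_long)
```

So the selection syntax can be fixed, but matching Z1 with R1, Z2 with R2 and so on still needs a join or a ".value"-style reshape.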
Another way:
df <- data.frame(Z1 = c(1246, 2734), R1 = c(1, 5),
                 Z2 = c(2986, 1066), R2 = c(3, 7),
                 Z100 = c(3163, 2645), R100 = c(4, 8))

library(dplyr)

df |>
  # ".value" turns the letter prefix (Z / R) into value columns; the digits go to Ind
  tidyr::pivot_longer(everything(), names_to = c(".value", "Ind"), names_pattern = "(.)(\\d.*)") |>
  rename(`Time in ms` = Z, Reaction = R) |>
  # rebuild the original column names from the index
  mutate(Z = paste0("Z", Ind), R = paste0("R", Ind)) |>
  select(Z, `Time in ms`, R, Reaction)
# A tibble: 6 x 4
  Z     `Time in ms` R     Reaction
  <chr>        <dbl> <chr>    <dbl>
1 Z1            1246 R1           1
2 Z2            2986 R2           3
3 Z100          3163 R100         4
4 Z1            2734 R1           5
5 Z2            1066 R2           7
6 Z100          2645 R100         8
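Note that this output is ordered by original row, not grouped by Z1, Z2, ..., Z100 as in the question. A possible variation, assuming that grouping matters: keep a row number, convert the index to an integer with names_transform, and sort on both. The helper column rn and the result name long are my own additions here.

```r
library(dplyr)
library(tidyr)

df <- data.frame(Z1 = c(1246, 2734), R1 = c(1, 5),
                 Z2 = c(2986, 1066), R2 = c(3, 7),
                 Z100 = c(3163, 2645), R100 = c(4, 8))

long <- df |>
  # remember the original row so ties within an index keep their order
  mutate(rn = row_number()) |>
  pivot_longer(-rn,
               names_to = c(".value", "Ind"),
               names_pattern = "(.)(\\d+)",
               # store the index as an integer so 100 sorts after 2
               names_transform = list(Ind = as.integer)) |>
  arrange(Ind, rn) |>
  rename(`Time in ms` = Z, Reaction = R) |>
  mutate(Z = paste0("Z", Ind), R = paste0("R", Ind)) |>
  select(Z, `Time in ms`, R, Reaction)

long
```

Sorting on the integer index rather than the character one avoids "100" landing between "1" and "2".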