Merge Data Frames and Expand R Column Values to Column Headers
I have two R data frames, shown below.
Data frame 1:
identifier | ef_posterior | position_no |
---|---|---|
11111 | 0.260 | 1 |
11111 | 0.0822 | 2 |
11111 | 0.00797 | 3 |
11111 | 0.04 | 4 |
11111 | 0.245 | 5 |
11111 | 0.432 | 6 |
11112 | 0.342 | 1 |
11112 | 0.453 | 2 |
11112 | 0.0032 | 3 |
11112 | 0.241 | 5 |
11112 | 0.0422 | 6 |
11112 | 0.311 | 4 |
Data frame 2:
study_identifier | %LVEF |
---|---|
11111 | 62 |
11112 | 76 |
I want to merge these two data frames and rearrange them as shown below. Here, study_identifier and identifier are the same thing (just different column names).
identifier | pos_1 | pos_2 | pos_3 | pos_4 | pos_5 | pos_6 | %LVEF |
---|---|---|---|---|---|---|---|
11111 | 0.260 | 0.0822 | 0.00797 | 0.04 | 0.245 | 0.432 | 62 |
11112 | 0.342 | 0.453 | 0.0032 | 0.311 | 0.241 | 0.0422 | 76 |
How can I do this? Any help is much appreciated!
Do it like this:
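library(dplyr)
library(tidyr)
df1 %>%
  # Turn 1, 2, ... into "position_1", "position_2", ... for use as column names
  mutate(position_no = paste0("position_", position_no)) %>%
  # One row per identifier, one column per position
  pivot_wider(id_cols = identifier, names_from = position_no, values_from = ef_posterior) %>%
  # Attach LVEF; the key column is named differently in each data frame
  left_join(df2, by = c("identifier" = "study_identifier"))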
# A tibble: 2 x 8
  identifier position_1 position_2 position_3 position_4 position_5 position_6  LVEF
       <int>      <dbl>      <dbl>      <dbl>      <dbl>      <dbl>      <dbl> <int>
1      11111      0.26      0.0822    0.00797      0.04       0.245     0.432     62
2      11112      0.342     0.453     0.0032       0.311      0.241     0.0422    76
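If the goal is to reproduce the column names from the question exactly (`pos_1` ... `pos_6` and `%LVEF`), one possible variant is sketched below. It assumes tidyr 1.1.0 or later (for `names_sort`), rebuilds the sample data inline so that it runs on its own, and uses `names_prefix` in place of the `mutate()` step. The variable name `result` and the final `rename()` are illustrative additions, not part of the answer above.

```r
library(dplyr)
library(tidyr)

# Same sample values as in the question, built inline so the sketch is self-contained
df1 <- data.frame(
  identifier   = rep(c(11111L, 11112L), each = 6),
  ef_posterior = c(0.260, 0.0822, 0.00797, 0.04, 0.245, 0.432,
                   0.342, 0.453, 0.0032, 0.241, 0.0422, 0.311),
  position_no  = c(1:6, 1L, 2L, 3L, 5L, 6L, 4L)
)
df2 <- data.frame(study_identifier = c(11111L, 11112L), LVEF = c(62L, 76L))

result <- df1 %>%
  pivot_wider(
    id_cols      = identifier,
    names_from   = position_no,
    values_from  = ef_posterior,
    names_prefix = "pos_",  # builds pos_1, pos_2, ... without a mutate() step
    names_sort   = TRUE     # order the new columns by position_no
  ) %>%
  left_join(df2, by = c("identifier" = "study_identifier")) %>%
  rename(`%LVEF` = LVEF)    # non-syntactic name, so it needs backticks

names(result)
```

Without `names_sort = TRUE` the new columns follow the order in which the positions first appear in `df1`, which happens to be 1 to 6 for this data but is not guaranteed in general.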
Data used:
df1 <- read.table(text = "identifier ef_posterior position_no
1 11111 0.260 1
2 11111 0.0822 2
3 11111 0.00797 3
4 11111 0.04 4
5 11111 0.245 5
6 11111 0.432 6
7 11112 0.342 1
8 11112 0.453 2
9 11112 0.0032 3
10 11112 0.241 5
11 11112 0.0422 6
12 11112 0.311 4", header = TRUE)
df2 <- read.table(text = "study_identifier LVEF
1 11111 62
2 11112 76", header = TRUE)
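A note on why the leading 1, 2, ... in the text do not end up as a column: when the header line has one fewer field than the data rows, `read.table` uses the first column as row names. A minimal check on a two-row excerpt of the data (the name `small` is just for illustration):

```r
# Two-row excerpt of df1; the header has 3 fields, each data row has 4
small <- read.table(text = "identifier ef_posterior position_no
1 11111 0.260 1
2 11111 0.0822 2", header = TRUE)

names(small)     # only the three named columns
rownames(small)  # the leading field of each row
str(small)       # identifier and position_no are parsed as integers
```

This is also why `identifier` shows up as `<int>` in the printed tibble.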
Note, in light of the comment below:
# If the identifier in df2 is a factor, for example:
df2$study_identifier <- factor(df2$study_identifier)
# then convert it back to numeric inside the join so the key types match
df1 %>%
  mutate(position_no = paste0("position_", position_no)) %>%
  pivot_wider(id_cols = identifier, names_from = position_no, values_from = ef_posterior) %>%
  left_join(
    df2 %>% mutate(study_identifier = as.numeric(as.character(study_identifier))),
    by = c("identifier" = "study_identifier")
  )
# A tibble: 2 x 8
  identifier position_1 position_2 position_3 position_4 position_5 position_6  LVEF
       <dbl>      <dbl>      <dbl>      <dbl>      <dbl>      <dbl>      <dbl> <int>
1      11111      0.26      0.0822    0.00797      0.04       0.245     0.432     62
2      11112      0.342     0.453     0.0032       0.311      0.241     0.0422    76
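The `as.character()` step in that conversion matters: calling `as.numeric()` directly on a factor returns the internal level codes rather than the identifiers, so the join would silently match nothing. A small sketch with a made-up factor `f` (the names `codes` and `values` are illustrative):

```r
# Identifiers stored as a factor, as in the note above
f <- factor(c(11111, 11112))

codes  <- as.numeric(f)                # level codes: 1 2
values <- as.numeric(as.character(f))  # original identifiers: 11111 11112

codes
values
```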