R: Use data frame names for columns after/before applying purrr reduce
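I have already checked a related question, but unfortunately it does not fit my more complex data.
Original data:
I have a list named Total.Scores containing 11 data frames named 2000-2020, each holding annual data from the years 2000 to 2020. Each data frame has a different number of rows but always 12 columns: ID, Category, Score.1-9, and Year.
Example data: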
library(dplyr)  # full_join() and %>%
library(purrr)  # reduce()
Total.Scores <- list(
  "2020" = data.frame(ID = c("A2_101", "B3_102", "LO_103", "TT_101"),
                      Category = c("blue", "red", "green", "red"),
                      Score.1 = c(1, 2, 3, 0),
                      Score.2 = c(3, 4, 5, 2),
                      Score.3 = c(0, 0, 1, 1),
                      Year = c(2020, 2020, 2020, 2020)),
  "2019" = data.frame(ID = c("A2_101", "B3_102", "LO_103"),
                      Category = c("blue", "red", "green"),
                      Score.1 = c(1, 2, 3),
                      Score.2 = c(3, 4, 5),
                      Score.3 = c(0, 0, 1),
                      Year = c(2019, 2019, 2019)),
  "2018" = data.frame(ID = c("A2_101", "B3_102", "LO_103", "TT_201", "AA_345"),
                      Category = c("blue", "red", "green", "yellow", "purple"),
                      Score.1 = c(1, 2, 3, 3, 5),
                      Score.2 = c(3, 4, 5, 5, 3),
                      Score.3 = c(0, 0, 1, 3, 0),
                      Year = c(2018, 2018, 2018, 2018, 2018)),
  "2017" = data.frame(ID = c("A2_101", "B3_102", "LO_103", "TT_101"),
                      Category = c("blue", "red", "green", "red"),
                      Score.1 = c(1, 2, 3, 0),
                      Score.2 = c(3, 4, 5, 2),
                      Score.3 = c(0, 0, 1, 1),
                      Year = c(2017, 2017, 2017, 2017))
)
Merging the data:
I merge the data frames in the Total.Scores list into one new, large data frame, Total.Yearly.Scores, using full_join by ID and Category:
Total.Yearly.Scores <- Total.Scores %>% reduce(full_join, by = c("ID", "Category"))
Result:
# Total.Yearly.Scores
      ID Category Score.1.x Score.2.x Score.3.x Year.x Score.1.y Score.2.y Score.3.y Year.y Score.1.x.x Score.2.x.x Score.3.x.x Year.x.x
1 A2_101     blue         1         3         0   2020         1         3         0   2019           1           3           0     2018
2 B3_102      red         2         4         0   2020         2         4         0   2019           2           4           0     2018
3 LO_103    green         3         5         1   2020         3         5         1   2019           3           5           1     2018
4 TT_101      red         0         2         1   2020        NA        NA        NA     NA          NA          NA          NA       NA
5 TT_201   yellow        NA        NA        NA     NA        NA        NA        NA     NA           3           5           3     2018
6 AA_345   purple        NA        NA        NA     NA        NA        NA        NA     NA           5           3           0     2018
  Score.1.y.y Score.2.y.y Score.3.y.y Year.y.y
1           1           3           0     2017
2           2           4           0     2017
3           3           5           1     2017
4           0           2           1     2017
5          NA          NA          NA       NA
6          NA          NA          NA       NA
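For context, the .x and .y endings in this result come from the suffix argument of full_join(), whose default is c(".x", ".y"). The sketch below shows this on two small toy data frames; the names a and b and their values are made up for illustration and are not part of the real data.

```r
library(dplyr)

# Hypothetical toy data frames, not the real yearly data
a <- data.frame(ID = c("A", "B"), Score = c(1, 2))
b <- data.frame(ID = c("A", "C"), Score = c(3, 4))

# Default: colliding non-key columns get ".x" / ".y" appended
default_join <- full_join(a, b, by = "ID")
names(default_join)

# A custom suffix pair replaces ".x" / ".y" for this one join
custom_join <- full_join(a, b, by = "ID", suffix = c(" 2020", " 2019"))
names(custom_join)
```

A suffix is only applied when column names collide in that particular join, so inside reduce() with more than two data frames the resulting names depend on the join order. Renaming the columns up front is likely to be more predictable.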
Question:
How can I adjust my code so that the headers of the Score.1-9 and Year columns contain the data frame names 2000-2020?
For example, changing them from Score.1.x to Score.1 2020:
# Total.Yearly.Scores
      ID Category Score.1 2020 Score.2 2020 Score.3 2020 Year 2020 Score.1 2019 Score.2 2019 Score.3 2019 Year 2019 Score.1 2018 Score.2 2018 Score.3 2018 Year 2018
1 A2_101     blue            1            3            0      2020            1            3            0      2019            1            3            0      2018
2 B3_102      red            2            4            0      2020            2            4            0      2019            2            4            0      2018
3 LO_103    green            3            5            1      2020            3            5            1      2019            3            5            1      2018
4 TT_101      red            0            2            1      2020           NA           NA           NA        NA           NA           NA           NA        NA
5 TT_201   yellow           NA           NA           NA        NA           NA           NA           NA        NA            3            5            3      2018
6 AA_345   purple           NA           NA           NA        NA           NA           NA           NA        NA            5            3            0      2018
  Score.1 2017 Score.2 2017 Score.3 2017 Year 2017
1            1            3            0      2017
2            2            4            0      2017
3            3            5            1      2017
4            0            2            1      2017
5           NA           NA           NA        NA
6           NA           NA           NA        NA
Thanks in advance for your help!
Best regards, Thomas.
We can rename the columns before joining:
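library(dplyr)
library(purrr)
library(stringr)

Total.Scores %>%
  # imap() passes each data frame as .x and its list name (the year) as .y
  imap(~ {
    nm1 <- .y
    # append the year to every column except the join keys
    rename_at(.x, vars(-c("ID", "Category")), ~ str_c(., nm1, sep = ' '))
  }) %>%
  reduce(full_join, by = c("ID", "Category"))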
Output:
      ID Category Score.1 2020 Score.2 2020 Score.3 2020 Year 2020 Score.1 2019 Score.2 2019 Score.3 2019
1 A2_101     blue            1            3            0      2020            1            3            0
2 B3_102      red            2            4            0      2020            2            4            0
3 LO_103    green            3            5            1      2020            3            5            1
4 TT_101      red            0            2            1      2020           NA           NA           NA
5 TT_201   yellow           NA           NA           NA        NA           NA           NA           NA
6 AA_345   purple           NA           NA           NA        NA           NA           NA           NA
  Year 2019 Score.1 2018 Score.2 2018 Score.3 2018 Year 2018 Score.1 2017 Score.2 2017 Score.3 2017 Year 2017
1      2019            1            3            0      2018            1            3            0      2017
2      2019            2            4            0      2018            2            4            0      2017
3      2019            3            5            1      2018            3            5            1      2017
4        NA           NA           NA           NA        NA            0            2            1      2017
5        NA            3            5            3      2018           NA           NA           NA        NA
6        NA            5            3            0      2018           NA           NA           NA        NA
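As a possible variant, the same rename-then-join idea can be written with rename_with(), which dplyr (1.0.0 and later) offers in place of the superseded rename_at(). This is only a sketch: scores is a hypothetical two-year stand-in for Total.Scores, and the names yearly, df, and yr are made up for illustration.

```r
library(dplyr)
library(purrr)

# Hypothetical two-year subset standing in for Total.Scores
scores <- list(
  "2020" = data.frame(ID = c("A2_101", "B3_102"),
                      Category = c("blue", "red"),
                      Score.1 = c(1, 2),
                      Year = c(2020, 2020)),
  "2019" = data.frame(ID = c("A2_101", "LO_103"),
                      Category = c("blue", "green"),
                      Score.1 = c(1, 3),
                      Year = c(2019, 2019))
)

yearly <- scores %>%
  # imap() hands over each data frame together with its list name (the year)
  imap(function(df, yr) {
    # append the year to every column except the two join keys
    rename_with(df, ~ paste(.x, yr), .cols = -c(ID, Category))
  }) %>%
  reduce(full_join, by = c("ID", "Category"))

names(yearly)
```

Because every non-key column already carries its year before the join, no names collide and full_join() never needs to add .x or .y suffixes.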