Converting a Vertical Data Frame to a Horizontal One - R
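Below is a small example of the kind of data I'm working with. The real dataset has about 8 million rows, with about 450 stat names in the Stat_Name column for each date from August 31, 2019 to January 10, 2020. All I need is a function that converts df1 below into df2. I bet this is fairly simple, and I think melt() might be able to do it, but I'm not sure. Thanks in advance for all your help!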
rm(list = ls())
df1 <- data.frame(Team_Code = c(728, 728, 728, 728),
                  Coef_Name = c('Team_728', 'Team_728', 'Team_728', 'Team_728'),
                  Year = 2021,
                  Date = c('8/31/2021', '8/31/2021', '9/1/2021', '9/1/2021'),
                  Stat_Name = c('Points', 'OppPoints', 'Points', 'OppPoints'),
                  Ridge_Reg_Coef = c(20, 15, 22, 16),
                  Adj_Stat_Value = c(21.5, 14, 20.5, 17))
df2 <- data.frame(Team_Code = c(728, 728),
                  Coef_Name = c('Team_728', 'Team_728'),
                  Year = 2021,
                  Date = c('8/31/2021', '9/1/2021'),
                  Points = c(21.5, 20.5),
                  OppPoints = c(14, 17))
A tidyverse solution:
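library(dplyr)
library(tidyr)

df1 %>%
  # Drop Ridge_Reg_Coef; if kept, it would act as an extra id column and split each date into separate rows
  select(-Ridge_Reg_Coef) %>%
  # Pivot to a wider format: Stat_Name supplies the new column names, Adj_Stat_Value the values
  pivot_wider(names_from = Stat_Name, values_from = Adj_Stat_Value)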
# A tibble: 2 x 6
  Team_Code Coef_Name  Year Date      Points OppPoints
      <dbl> <chr>     <dbl> <chr>      <dbl>     <dbl>
1       728 Team_728   2021 8/31/2021   21.5        14
2       728 Team_728   2021 9/1/2021    20.5        17
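Since the question mentions melt(): melt() goes the other way (wide to long). Its long-to-wide counterpart is dcast(). With around 8 million rows, the data.table version of dcast() may be worth trying. The block below is only a sketch on the question's sample data, assuming the data.table package is installed; the names dt and wide are illustrative and not from the original post.

```r
library(data.table)

# Sample data from the question
df1 <- data.frame(Team_Code = c(728, 728, 728, 728),
                  Coef_Name = c('Team_728', 'Team_728', 'Team_728', 'Team_728'),
                  Year = 2021,
                  Date = c('8/31/2021', '8/31/2021', '9/1/2021', '9/1/2021'),
                  Stat_Name = c('Points', 'OppPoints', 'Points', 'OppPoints'),
                  Ridge_Reg_Coef = c(20, 15, 22, 16),
                  Adj_Stat_Value = c(21.5, 14, 20.5, 17))

# dt and wide are illustrative names (assumption, not from the post)
dt <- as.data.table(df1)

# Left of ~ : the columns that identify one output row.
# Right of ~: the column whose values become the new column names.
# Ridge_Reg_Coef is not named in the formula, so it is left out of the result.
wide <- dcast(dt, Team_Code + Coef_Name + Year + Date ~ Stat_Name,
              value.var = "Adj_Stat_Value")

# Put the stat columns in the same order as df2
setcolorder(wide, c("Team_Code", "Coef_Name", "Year", "Date",
                    "Points", "OppPoints"))
print(wide)
```

If any Team_Code/Date/Stat_Name combination appears more than once in the real data, dcast() falls back to counting rows unless you pass fun.aggregate, so check for duplicates first.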