Pivoting Long Data to Wide Data with Counts and Percentages in R
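I need to reshape a dataset from long to wide format so that it shows both counts and relative percentages. Here is some dummy data similar to my situation: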
library(tidyr)  # provides pivot_wider() and re-exports the %>% pipe

df <- tibble::tribble(
  ~YEAR, ~Volunteers, ~retained, ~n, ~Rel.Percentage,
  2016, "LA", "N", 51, "7%",
  2016, "LA", "Y", 685, "93%",
  2017, "Victorville", "N", 12, "16%",
  2017, "Victorville", "Y", 66, "84%",
  2018, "Inland Empire", "N", 33, "13%",
  2018, "Inland Empire", "Y", 227, "87%",
  2019, "Kern County", "N", 5, "7%",
  2019, "Kern County", "Y", 69, "93%",
  2020, "Military", "N", 61, "20%",
  2020, "Military", "Y", 243, "80%",
  2017, "LA", "N", 59, "7%",
  2017, "LA", "Y", 645, "93%",
  2016, "Victorville", "N", 15, "16%",
  2016, "Victorville", "Y", 64, "84%",
  2019, "Inland Empire", "N", 32, "13%",
  2019, "Inland Empire", "Y", 221, "87%",
  2017, "Kern County", "N", 7, "7%",
  2017, "Kern County", "Y", 73, "93%",
  2016, "Military", "N", 63, "20%",
  2016, "Military", "Y", 241, "80%"
)
wide.test <- df %>%
  pivot_wider(names_from = YEAR, values_from = Rel.Percentage)
This gives a staggered table full of NAs, but I want the counts and relative percentages shown side by side.
new.wide <- df[, !(names(df) %in% c("n"))] %>%
  pivot_wider(names_from = YEAR, values_from = Rel.Percentage)
This gives me a tidier table of percentages, but it does not show n.
I also tried:
newer.wide <- df %>%
  pivot_wider(names_from = YEAR, values_from = c(Rel.Percentage, n))
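The problem with this solution is that the n values now sit in their own set of columns, which makes the table largely unreadable. I would like to keep the counts together with their relative percentages and, if possible, show the underlying count in parentheses next to the relative percentage.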
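Process your data before pivoting. Create a new variable with something like paste0(df$n, " (", df$Rel.Percentage, ")") (Rel.Percentage already includes the % sign, so none is added here), then use that variable in the values_from argument.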
Edit: to make this more of a complete example:
# Paste n and the percentage together into a new character column, "fixed"
df$fixed <- paste0(df$n, " (", df$Rel.Percentage, ")")
# Pivot to wide format with one column per year, dropping the now-redundant
# n and Rel.Percentage columns for a cleaner look
df.wide <- df[, !(names(df) %in% c("n", "Rel.Percentage"))] %>%
  pivot_wider(names_from = YEAR, values_from = fixed)
The result can then be exported to a LaTeX table for easy report writing:
library(xtable)
# type and file are arguments of print(), not of xtable()
print(xtable(df.wide), type = "latex", file = "df_wide.tex")
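As a variation on the steps above, the combining and pivoting can be written as one pipeline. This is only a minimal sketch, assuming dplyr is available alongside tidyr; the names df_small and wide_small are hypothetical, and the data is a four-row subset of the question's dummy data. It passes id_cols to pivot_wider() so that n and Rel.Percentage do not have to be dropped by hand.

```r
library(dplyr)
library(tidyr)

# Hypothetical four-row subset of the question's dummy data
df_small <- tibble::tribble(
  ~YEAR, ~Volunteers, ~retained, ~n, ~Rel.Percentage,
  2016, "LA", "N", 51, "7%",
  2016, "LA", "Y", 685, "93%",
  2017, "LA", "N", 59, "7%",
  2017, "LA", "Y", 645, "93%"
)

wide_small <- df_small %>%
  # Combine count and percentage into one display string, e.g. "51 (7%)"
  mutate(fixed = paste0(n, " (", Rel.Percentage, ")")) %>%
  # id_cols names the columns that identify a row; columns not named in
  # id_cols, names_from or values_from (here n and Rel.Percentage) are dropped
  pivot_wider(
    id_cols = c(Volunteers, retained),
    names_from = YEAR,
    values_from = fixed
  )

print(wide_small)
```

With this subset the result has one row per Volunteers/retained pair and one column per year, each cell holding the count followed by its percentage in parentheses. Leaving n in the data without id_cols is what produces the staggered NA table described in the question, because every distinct n value is then treated as part of the row identifier.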