How to apply count() to all variables using the map/apply family of functions in R?
I am trying to create a user-defined function that shows the frequency counts for each variable of a data frame.
df:
dummy_df <- data.frame(gender_vector = c("Male", "Female", "Female", "Male", "Male"),
color_vector = c('blue', 'red', 'green', 'white', 'black')
)
dummy_df
gender_vector color_vector
1 Male blue
2 Female red
3 Female green
4 Male white
5 Male black
Running the count for a single variable:
dummy_df %>%
count(gender_vector) %>%
as_tibble() %>%  # as.tibble() is deprecated in favour of as_tibble()
ggplot(aes(x = n, y = gender_vector, fill = gender_vector)) +
geom_col(show.legend = FALSE)
Issue: when I wrap this in a function, it fails:
var_freq_plot_fn <- function(df, selected_var){
df %>%
select_if(is.character) %>%
count(selected_var) %>%
as_tibble() %>%
ggplot(aes(x = n, y = selected_var, fill = selected_var)) +
geom_col() +
theme(legend.position = "none")
}
map(dummy_df, var_freq_plot_fn)
Error: Error in UseMethod("tbl_vars") : no applicable method for 'tbl_vars' applied to an object of class "character"
I thought using a tibble instead of a data frame would solve this, but I was wrong.
I am still not clear on why R data types cause problems once the code is put inside a function.
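Things behave differently inside a function, especially when you refer to columns by name rather than by value. Note also that map(dummy_df, var_freq_plot_fn) iterates over the columns of dummy_df, so df receives a character vector rather than a data frame, which is what the tbl_vars error is reporting. Try: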
library(dplyr)
library(ggplot2)
var_freq_plot_fn <- function(df, selected_var){
df %>%
count(.data[[selected_var]]) %>%
ggplot(aes(x = n, y = .data[[selected_var]], fill = .data[[selected_var]])) +
geom_col() +
theme(legend.position = "none")
}
plot_list <- purrr::map(names(dummy_df), var_freq_plot_fn, df = dummy_df)
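To see why the original call failed and this one works, it may help to inspect what purrr::map() actually hands to the function in each case. The following is a minimal sketch, not part of the original answer; the names passed_columns and passed_names are made up for illustration, and stringsAsFactors = FALSE is set explicitly so the result does not depend on the R version.

```r
library(purrr)

dummy_df <- data.frame(
  gender_vector = c("Male", "Female", "Female", "Male", "Male"),
  color_vector  = c("blue", "red", "green", "white", "black"),
  stringsAsFactors = FALSE
)

# Mapping over the data frame itself: each element is a whole column,
# i.e. a character vector, which dplyr verbs cannot treat as a table.
passed_columns <- map(dummy_df, class)

# Mapping over the column names: each element is a single string,
# which can then be looked up with .data[[...]] inside the function.
passed_names <- map(names(dummy_df), identity)
```

In the first case the function's df argument would receive a character vector, which matches the class named in the error message; in the second case the data frame is supplied separately through df = dummy_df.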
Another way to write the function:
var_freq_plot_fn <- function(df){
purrr::map(df %>% select_if(is.character) %>% names, ~
df %>%
count(.data[[.x]]) %>%
ggplot(aes(x = n, y = .data[[.x]], fill = .data[[.x]])) +
geom_col() +
theme(legend.position = "none"))
}
var_freq_plot_fn(dummy_df)
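Both versions return an unnamed list of plots. If it is more convenient to pull a plot out by variable name, one option is to name the input vector first. This is only a sketch under that assumption; char_vars is a hypothetical helper name, and select(where(is.character)) is used here as the current equivalent of select_if(is.character).

```r
library(dplyr)
library(ggplot2)
library(purrr)

dummy_df <- data.frame(
  gender_vector = c("Male", "Female", "Female", "Male", "Male"),
  color_vector  = c("blue", "red", "green", "white", "black"),
  stringsAsFactors = FALSE
)

# Same function as in the answer: count one column and draw a bar chart
var_freq_plot_fn <- function(df, selected_var) {
  df %>%
    count(.data[[selected_var]]) %>%
    ggplot(aes(x = n, y = .data[[selected_var]], fill = .data[[selected_var]])) +
    geom_col() +
    theme(legend.position = "none")
}

# Names of the character columns only
char_vars <- dummy_df %>% select(where(is.character)) %>% names()

# set_names() names each element after itself, so map() returns a named list
plot_list <- map(set_names(char_vars), var_freq_plot_fn, df = dummy_df)
```

A single plot can then be displayed with plot_list$gender_vector or plot_list[["color_vector"]].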