R: ggplot2, how to annotate summary statistics on each panel of a panel plot
How can I use ggplot2 in R to add a text annotation of the standard deviation (e.g. sd = sd_value) to each panel of the following plot?
library(ggplot2)
library(datasets)
data(mtcars)
ggplot(data = mtcars, aes(x = hp)) +
  geom_dotplot(binwidth = 1) +
  geom_density() +
  facet_grid(. ~ cyl) +
  theme_bw()
I would like to post an image of the plot, but I don't have enough reputation.
I think "geom_text" or "annotate" might be useful, but I'm not sure how to use them.
If you want the text label to differ from facet to facet, you will need geom_text. If you want the same text to appear in every facet, you can use annotate.
library(ggplot2)
p <- ggplot(data = mtcars, aes(x = hp)) +
  geom_dotplot(binwidth = 1) +
  geom_density() +
  facet_grid(. ~ cyl)
# One row per facet; the cyl column tells ggplot2 which panel gets which label
mylabels <- data.frame(cyl = c(4, 6, 8),
                       label = c("first label", "second label different", "and another"))
p + geom_text(x = 200, y = 0.75, aes(label = label), data = mylabels)
### compare that to this way with annotate
p + annotate("text", x = 200, y = 0.75, label = "same label everywhere")
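Now, if what you really want in this example is the standard deviation of hp within each cyl group, I would probably compute it first with dplyr and then add it to the plot with geom_text, like this: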
library(ggplot2)
library(dplyr)
df.sd.hp <- mtcars %>%
  group_by(cyl) %>%
  summarise(hp.sd = round(sd(hp), 2))
ggplot(data = mtcars, aes(x = hp)) +
  geom_dotplot(binwidth = 1) +
  geom_density() +
  facet_grid(. ~ cyl) +
  geom_text(x = 200, y = 0.75,
            aes(label = paste0("SD: ", hp.sd)),
            data = df.sd.hp)
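A possible variation, offered only as a sketch: the fixed position x = 200, y = 0.75 suits this particular data, but for other variables it may land outside the panel. One common workaround is to pin the label to a panel corner with infinite coordinates and nudge it inward with hjust and vjust. The version below also computes the statistic with base R's aggregate() instead of dplyr. The names sd_by_cyl and p_sd are illustrative and not part of the answer above.

```r
library(ggplot2)

# Work on a fresh copy of the data set so earlier edits to mtcars do not matter
cars <- datasets::mtcars

# Standard deviation of hp within each cyl group, using base R
sd_by_cyl <- aggregate(hp ~ cyl, data = cars, FUN = sd)
sd_by_cyl$label <- paste0("sd = ", round(sd_by_cyl$hp, 2))

p_sd <- ggplot(cars, aes(x = hp)) +
  geom_dotplot(binwidth = 1) +
  geom_density() +
  facet_grid(. ~ cyl) +
  # Inf/Inf places the text at the top-right corner of every panel;
  # hjust/vjust above 1 pull it back inside the panel border
  geom_text(data = sd_by_cyl,
            aes(x = Inf, y = Inf, label = label),
            hjust = 1.1, vjust = 1.5,
            inherit.aes = FALSE) +
  theme_bw()

print(p_sd)
```

Because sd_by_cyl has one row per cyl value and a cyl column, facet_grid routes each label to its matching panel, exactly as with the mylabels data frame.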
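I prefer how the plot looks when the statistic appears in the facet label itself. I wrote the following script, which lets you choose between displaying the standard deviation, mean, or count. Essentially, it computes the summary statistic and then merges it with the category name, so that each label has the form CATEGORY (SUMMARY STAT = VALUE).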
#' Function will update the name with the statistic of your choice
#' Requires the plyr package to be installed (called via plyr::, not attached)
AddNameStat <- function(df, category, count_col, stat = c("sd", "mean", "count"), dp = 0) {
  # Resolve stat to a single value; defaults to "sd" when not supplied
  stat <- match.arg(stat)
  # Create temporary data frame for analysis
  temp <- data.frame(ref = df[[category]], comp = df[[count_col]])
  # Aggregate the variables and calculate statistics
  agg_stats <- plyr::ddply(temp, "ref", plyr::summarize,
                           sd = sd(comp),
                           mean = mean(comp),
                           count = length(comp))
  # Dictionary used to replace stat name with correct symbol for plot
  labelName <- plyr::mapvalues(stat, from = c("sd", "mean", "count"),
                               to = c("\u03C3", "x", "n"), warn_missing = FALSE)
  # Updates the name based on the selected variable
  agg_stats$join <- paste0(agg_stats$ref, " \n (", labelName, " = ",
                           round(agg_stats[[stat]], dp), ")")
  # Map the names
  name_map <- setNames(agg_stats$join, as.character(agg_stats$ref))
  return(name_map[as.character(df[[category]])])
}
Applying this script to your original question:
library(ggplot2)
library(datasets)
data(mtcars)
# Update the variable name
mtcars$cyl <- AddNameStat(mtcars, "cyl", "hp", stat = "sd")
ggplot(data = mtcars, aes(x = hp)) +
  geom_dotplot(binwidth = 1) +
  geom_density() +
  facet_grid(. ~ cyl) +
  theme_bw()
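The script should be easy to change to include other summary statistics. I'm also sure parts of it could be rewritten to make it a bit cleaner!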
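One way it might be made cleaner, sketched here under my own assumptions rather than as the author's method: instead of overwriting mtcars$cyl with the label text, the labels can be passed to facet_grid through ggplot2's labeller() with a named character vector as a lookup table, which leaves the data untouched. The names sd_vals and facet_labels are illustrative.

```r
library(ggplot2)

# Fresh copy of the data, in case mtcars$cyl was overwritten earlier
cars <- datasets::mtcars

# Standard deviation of hp for each cyl value; names are "4", "6", "8"
sd_vals <- tapply(cars$hp, cars$cyl, sd)

# Lookup table: facet value -> label text shown in the strip
facet_labels <- setNames(
  paste0(names(sd_vals), "\n(\u03C3 = ", round(sd_vals, 0), ")"),
  names(sd_vals)
)

p_lab <- ggplot(cars, aes(x = hp)) +
  geom_dotplot(binwidth = 1) +
  geom_density() +
  # labeller() maps each cyl value to its entry in facet_labels
  facet_grid(. ~ cyl, labeller = labeller(cyl = facet_labels)) +
  theme_bw()

print(p_lab)
```

The trade-off is that the statistic is computed outside the helper, so switching to the mean or count means changing the function passed to tapply and the symbol in paste0.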