How to automatically assign boxplot chart limits from a given dataset
I am using this data set to construct the following chart.
In this chart, I set the limits of the Y axis manually:
coord_cartesian(ylim=c(-20,80))
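Now I would like to assign these limits in code (rather than manually), so that for any new dataset the chart gets limits that nicely capture the quartile boxes and their whiskers, while ignoring the outliers drawn as grey dots on the chart. Thanks!
P.S. The dataset imf.inflation: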
      country_iso_code year  value
X2003              AFG 2003 35.663
X2004              AFG 2004 16.358
X2005              AFG 2005 10.569
X2006              AFG 2006  6.785
X2007              AFG 2007  8.681
X2008              AFG 2008 26.419
My code:
library(dplyr)
library(ggplot2)

r <- .005
# Keep, per year, only the values inside the [r, 1 - r] quantile range.
# (Comparing against the length-2 vector c(r, 1 - r) would recycle it row by row.)
imf.inflation_trcd <- imf.inflation %>%
  group_by(year) %>%
  filter(value >= quantile(value, r), value <= quantile(value, 1 - r))
p1 <- ggplot() +
  geom_boxplot(data = imf.inflation, aes(x = year, y = value, group = year),
               outlier.shape = 1, outlier.color = "grey") +
  geom_smooth(data = imf.inflation_trcd, aes(x = year, y = value),
              method = "loess", se = TRUE, color = "orange") +
  coord_cartesian(ylim = c(-20, 80)) +
  labs(x = "", y = "Average Yearly Inflation (% year-on-year)",
       title = "Distribution of Inflation*",
       subtitle = "Over 1980-2020 and across IMF member countries",
       caption = paste0("* Smoothing line is an estimation using LOESS method and based upon truncated dataset (",
                        (1 - r) * 100, "% percentile). Source: IMF.")) +
  theme(axis.title.y = element_text(size = 9),
        legend.title = element_blank(),
        legend.background = element_blank(), legend.position = "right",
        plot.caption = element_text(hjust = 0, size = 8))
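Strategy: (1) Run boxplot() (from base R) with the appropriate formula and save the result. (2) Extract the first and last (fifth) rows of the $stats element of the result, which are the bottom/top of the whiskers, and find their minimum and maximum (see the details on the return value in ?boxplot). (3) Put those values into your coord_cartesian() call.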
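library(ggplot2)

dd <- read.csv("Inflation.csv")
# boxplot() returns its statistics invisibly; $stats has one column per year
bb <- boxplot(value ~ year, data = dd)
# Row 1 holds the lower whisker ends, row 5 the upper whisker ends
br <- c(min(bb$stats[1, ]), max(bb$stats[5, ]))
ggplot(dd, aes(year, value, group = year)) +
  geom_boxplot() +
  coord_cartesian(ylim = br)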
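If you would rather not draw a base R plot just to get the numbers, the same idea can be sketched with boxplot.stats(), which computes the whisker ends for one group at a time. The helper name whisker_limits, its pad argument and the toy data below are made up for illustration and are not part of the original answer. Treat it as a rough sketch: ggplot2 computes its hinges with quantile() rather than fivenum(), so its whiskers may differ marginally from these values, which is one reason to allow a little padding.

```r
# Hypothetical helper: y-limits that span every group's whiskers and ignore outliers.
whisker_limits <- function(values, groups, pad = 0.05) {
  # boxplot.stats()$stats is: lower whisker, lower hinge, median,
  # upper hinge, upper whisker (1.5 * IQR rule, as in boxplot()).
  stats <- vapply(split(values, groups),
                  function(v) grDevices::boxplot.stats(v)$stats,
                  numeric(5))
  lower <- min(stats[1, ], na.rm = TRUE)
  upper <- max(stats[5, ], na.rm = TRUE)
  margin <- pad * (upper - lower)  # optional breathing room around the whiskers
  c(lower - margin, upper + margin)
}

# Toy data (not the IMF dataset): 100 is an outlier in the 2001 group.
toy <- data.frame(
  year  = rep(c(2000, 2001), each = 5),
  value = c(1, 2, 3, 4, 5, 2, 3, 4, 5, 100)
)
lims <- whisker_limits(toy$value, toy$year, pad = 0)
print(lims)  # 1 5: the outlier at 100 does not stretch the limits
```

The resulting vector can be passed straight to the plot as coord_cartesian(ylim = lims), in place of the hard-coded c(-20, 80).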