Add horizontal quantile lines to scatter plot ggplot2 R
I have the data below:
eg_data <- data.frame(
  period = c(sample(c("1 + 2"), 1000, replace = TRUE)),
  max_sales = c(sample(c(1:10), 1000, replace = TRUE, prob =
    c(.05, .10, .15, .25, .25, .10, .05, .02, .02, .01))))
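I want to make a scatter (actually jitter) plot and add horizontal lines at various points on the y-axis. I would like to be able to customize the percentiles at which the lines are drawn, but for now something like R's summary function would work fine.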
summary(eg_data$max_sales)
The code for my jitter plot is below. It runs and produces the chart, but I keep getting the error message:
Each group consists of only one observation. Do you need to adjust the
group aesthetic?
jitter <- (
(ggplot(data = eg_data, aes(x=period, y=max_sales, group = 1)) +
geom_jitter(stat = "identity", width = .15, color = "blue", alpha = .4)) +
scale_y_continuous(breaks= seq(0,12, by=1)) +
geom_line(stat = 'summary', fun.y = "quantile", fun.args=list(probs=0.1)) +
ggtitle("Distribution of Sales by Period") + xlab("Period") + ylab("Sales") +
theme(plot.title = element_text(color = "black", size = 14, face = "bold", hjust = 0.5),
axis.title.x = element_text(color = "black", size = 12, face = "bold"),
axis.title.y = element_text(color = "black", size = 12, face = "bold")) +
labs(fill = "Period") )
jitter
I tried looking at this question:
ggplot2 line chart gives "geom_path: Each group consist of only one observation. Do you need to adjust the group aesthetic?"
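It suggests making all variables numeric. My period variable is a character and I would like to keep it that way, but even when I convert it to numeric I still get the error.
Any help would be appreciated. Thank you!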
Instead of geom_line, what you want is geom_hline. In particular, replace the geom_line layer with
stat_summary(fun.y = "quantile", fun.args = list(probs = c(0.1, 0.2)),
geom = "hline", aes(yintercept = ..y..))
which draws horizontal lines at the 10% and 20% quantiles. Indeed,
quantile(eg_data$max_sales, c(0.1, 0.2))
# 10% 20%
# 2 3
It also gets rid of the warning you were getting.
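For readers on a recent ggplot2, here is a minimal sketch of the same layer written with `fun` and `after_stat(y)`, which newer releases prefer over `fun.y` and `..y..` (both deprecated there). The seed and the `probs` vector are assumptions added only to make the example reproducible and easy to adjust; they are not part of the original answer.

```r
library(ggplot2)

set.seed(1)  # assumed seed, only for reproducibility
eg_data <- data.frame(
  period = rep("1 + 2", 1000),
  max_sales = sample(1:10, 1000, replace = TRUE,
                     prob = c(.05, .10, .15, .25, .25, .10, .05, .02, .02, .01))
)

probs <- c(0.1, 0.2)  # percentiles at which to draw lines (illustrative)

p <- ggplot(eg_data, aes(x = period, y = max_sales)) +
  geom_jitter(width = .15, color = "blue", alpha = .4) +
  # One summary row per requested quantile; each row becomes a horizontal line
  stat_summary(fun = "quantile", fun.args = list(probs = probs),
               geom = "hline", aes(yintercept = after_stat(y)))
p
```

Changing `probs` should be enough to move or add lines, since the quantiles are computed inside the layer.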
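I don't know whether this is the most elegant solution, but you can always compute the summary statistics elsewhere and put them on the plot. This also gives more control over what is going on (to my taste).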
hline_coordinates= data.frame(Quantile_Name=names(summary(eg_data$max_sales)),
quantile_values=as.numeric(summary(eg_data$max_sales)))
jitter <- (
(ggplot(data = eg_data, aes(x=period, y=max_sales)) + #removed group=1
geom_jitter(stat = "identity", width = .15, color = "blue", alpha = .4)) +
scale_y_continuous(breaks= seq(0,12, by=1)) +
geom_hline(data=hline_coordinates,aes(yintercept=quantile_values)) +
ggtitle("Distribution of Sales by Period") + xlab("Period") + ylab("Sales") +
theme(plot.title = element_text(color = "black", size = 14, face = "bold", hjust = 0.5),
axis.title.x = element_text(color = "black", size = 12, face = "bold"),
axis.title.y = element_text(color = "black", size = 12, face = "bold")) +
labs(fill = "Period") )
jitter
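Since the question asks for customizable percentiles, the same precompute-then-plot idea can be sketched with `quantile()` in place of `summary()`. The names `probs` and `hline_df`, the chosen percentiles, and the seed are hypothetical, picked only for illustration; mapping `linetype` to the label is one possible way to tell the lines apart.

```r
library(ggplot2)

set.seed(1)  # assumed seed, only for reproducibility
eg_data <- data.frame(
  period = rep("1 + 2", 1000),
  max_sales = sample(1:10, 1000, replace = TRUE,
                     prob = c(.05, .10, .15, .25, .25, .10, .05, .02, .02, .01))
)

probs <- c(0.10, 0.50, 0.90)  # hypothetical percentiles; edit as needed

# One row per percentile: a display label and the y value of the line
hline_df <- data.frame(
  label = paste0(probs * 100, "%"),
  value = as.numeric(quantile(eg_data$max_sales, probs = probs))
)

p <- ggplot(eg_data, aes(x = period, y = max_sales)) +
  geom_jitter(width = .15, color = "blue", alpha = .4) +
  # The line layer uses its own data frame, so it ignores the jittered points
  geom_hline(data = hline_df, aes(yintercept = value, linetype = label)) +
  scale_y_continuous(breaks = seq(0, 12, by = 1)) +
  labs(title = "Distribution of Sales by Period",
       x = "Period", y = "Sales", linetype = "Percentile")
p
```

Keeping the percentiles in a separate data frame also makes it straightforward to print or reuse them outside the plot.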