Boxplot with Summarized and Grouped Data in R
I have the following pre-summarized cost data:
MeanCost | Std | MedianCost | LowerIQR | UpperIQR | StatusGroup | AgeGroup |
---|---|---|---|---|---|---|
700 | 500 | 650 | 510 | 780 | Dead | Young |
800 | 600 | 810 | 666 | 1000 | Alive | Young |
500 | 200 | 657 | 450 | 890 | Comatose | Young |
300 | 400 | 560 | 467 | 670 | Dead | Old |
570 | 600 | 500 | 450 | 600 | Alive | Old |
555 | 500 | 677 | 475 | 780 | Comatose | Old |
333 | 455 | 300 | 200 | 400 | Dead | Middle |
678 | 256 | 600 | 445 | 787 | Alive | Middle |
1500 | 877 | 980 | 870 | 1200 | Comatose | Middle |
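I want to use this information to create a boxplot, similar to the figure below.
Each color represents a status group (blue = dead, red = alive, green = comatose).
Each cluster of boxes represents an age group (left cluster = young, middle cluster = middle, right cluster = old).
I know I don't have the minimum and maximum values, so whiskers are not needed.
I would like to do this in R; any help would be greatly appreciated! Thanks.
Here is the code I tried: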
dattest <- data.frame(
  Mean_Cost = c(700,800,500,300,570,555,333,678,1500),
  Std = c(500,600,200,400,600,500,455,256,877),
  Median_Cost = c(650,810,657,560,500,677,300,600,980),
  LowerIQR = c(510,666,450,467,450,475,200,445,870),
  UpperIQR = c(780,1000,890,670,600,780,400,787,1200),
  StatusGroup = c(1,2,3,1,2,3,1,2,3),
  AgeGroup = c(1,1,1,2,2,2,3,3,3))
where StatusGroup 1 = dead, 2 = alive, 3 = comatose,
and AgeGroup 1 = young, 2 = old, 3 = middle.
library(ggplot2)
ggplot(dattest, aes(xmin = AgeGroup-.25, xmax=AgeGroup+.25, ymin=LowerIQR, ymax=UpperIQR)) +
  geom_rect(fill="transparent", col = "blue") +
  geom_segment(aes(y=Median_Cost, yend=Median_Cost, x=AgeGroup-.25, xend=AgeGroup+.25), col="blue") +
  geom_point(mapping=aes(x = StatusGroup, y = Mean_Cost), col="red") +
  scale_x_continuous(breaks=1:3, labels=c("Young","Old","Middle")) +
  theme_classic()
This code definitely does not give me what I want.
Is this what you are trying to do?
library(tidyverse)

# Pre-summarized data, one row per StatusGroup x AgeGroup combination
df <- tibble::tribble(
  ~MeanCost, ~Std, ~MedianCost, ~LowerIQR, ~UpperIQR, ~StatusGroup, ~AgeGroup,
  700L, 500L, 650L, 510L, 780L, "Dead", "Young",
  800L, 600L, 810L, 666L, 1000L, "Alive", "Young",
  500L, 200L, 657L, 450L, 890L, "Comatose", "Young",
  300L, 400L, 560L, 467L, 670L, "Dead", "Old",
  570L, 600L, 500L, 450L, 600L, "Alive", "Old",
  555L, 500L, 677L, 475L, 780L, "Comatose", "Old",
  333L, 455L, 300L, 200L, 400L, "Dead", "Middle",
  678L, 256L, 600L, 445L, 787L, "Alive", "Middle",
  1500L, 877L, 980L, 870L, 1200L, "Comatose", "Middle"
)

df %>%
  # Order the age groups left to right as Young, Middle, Old
  mutate(AgeGroup = factor(AgeGroup, levels = c("Young", "Middle", "Old"))) %>%
  ggplot(aes(x = AgeGroup, fill = StatusGroup)) +
  # stat = "identity" draws the boxes from the supplied summary values;
  # the whiskers here are median +/- Std, not the minimum and maximum
  geom_boxplot(aes(
    lower = LowerIQR,
    upper = UpperIQR,
    middle = MedianCost,
    ymin = MedianCost - Std,
    ymax = MedianCost + Std),
    stat = "identity", width = 0.5)
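The whiskers above are drawn at median ± Std, which is only a stand-in, since the data has no minimum or maximum. For Comatose/Young they even land inside the box (457 to 857 against an IQR of 450 to 890). Because the question says whiskers are not needed, one possible whisker-free variant is `geom_crossbar()`, which draws only the box and a middle line. This is a minimal sketch, not the only way to do it; the data frame name `costs`, the object name `p_nowhisker`, and the dodge width of 0.6 are my own choices for illustration.

```r
library(ggplot2)

# Same summary values as in the question, rebuilt so this block runs on its own.
# The name `costs` is an assumption made for this example.
costs <- data.frame(
  MedianCost  = c(650, 810, 657, 560, 500, 677, 300, 600, 980),
  LowerIQR    = c(510, 666, 450, 467, 450, 475, 200, 445, 870),
  UpperIQR    = c(780, 1000, 890, 670, 600, 780, 400, 787, 1200),
  StatusGroup = rep(c("Dead", "Alive", "Comatose"), times = 3),
  AgeGroup    = rep(c("Young", "Old", "Middle"), each = 3)
)

# Order the clusters left to right as Young, Middle, Old
costs$AgeGroup <- factor(costs$AgeGroup, levels = c("Young", "Middle", "Old"))

# geom_crossbar draws a box from ymin to ymax with a line at y, and no whiskers
p_nowhisker <- ggplot(costs, aes(x = AgeGroup, fill = StatusGroup)) +
  geom_crossbar(
    aes(y = MedianCost, ymin = LowerIQR, ymax = UpperIQR),
    position = position_dodge(width = 0.6),  # side-by-side boxes per age group
    width = 0.5
  ) +
  labs(y = "Cost")

p_nowhisker
```

The box edges are the lower and upper IQR bounds and the inner line is the median, so the plot shows only the quantities that are actually in the table.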
Edit
To add an "x" at the mean, you can add a point layer and dodge its position so that each mark lines up with its box:
df %>%
  mutate(AgeGroup = factor(AgeGroup, levels = c("Young", "Middle", "Old"))) %>%
  ggplot(aes(x = AgeGroup, fill = StatusGroup)) +
  geom_boxplot(aes(
    lower = LowerIQR,
    upper = UpperIQR,
    middle = MedianCost,
    ymin = MedianCost - Std,
    ymax = MedianCost + Std),
    stat = "identity", width = 0.5) +
  geom_point(aes(y = MeanCost),
             position = position_dodge(width = 0.5),
             shape = 4)
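The question also asks for specific colors (blue = dead, red = alive, green = comatose), which the default fill palette will not give. One way to get them, sketched here under my own assumptions, is `scale_fill_manual()` with a named vector. The names `costs` and `p_colored` and the exact color strings are illustrative choices; the layers otherwise repeat the plot above.

```r
library(ggplot2)

# Same summary values as in the question, rebuilt so this block runs on its own.
# The name `costs` is an assumption made for this example.
costs <- data.frame(
  MeanCost    = c(700, 800, 500, 300, 570, 555, 333, 678, 1500),
  Std         = c(500, 600, 200, 400, 600, 500, 455, 256, 877),
  MedianCost  = c(650, 810, 657, 560, 500, 677, 300, 600, 980),
  LowerIQR    = c(510, 666, 450, 467, 450, 475, 200, 445, 870),
  UpperIQR    = c(780, 1000, 890, 670, 600, 780, 400, 787, 1200),
  StatusGroup = rep(c("Dead", "Alive", "Comatose"), times = 3),
  AgeGroup    = rep(c("Young", "Old", "Middle"), each = 3)
)

# Fix the order of clusters and of boxes within each cluster
costs$AgeGroup    <- factor(costs$AgeGroup, levels = c("Young", "Middle", "Old"))
costs$StatusGroup <- factor(costs$StatusGroup, levels = c("Dead", "Alive", "Comatose"))

p_colored <- ggplot(costs, aes(x = AgeGroup, fill = StatusGroup)) +
  geom_boxplot(
    aes(lower = LowerIQR, upper = UpperIQR, middle = MedianCost,
        ymin = MedianCost - Std, ymax = MedianCost + Std),
    stat = "identity", width = 0.5
  ) +
  # Mean marked with an "x", dodged to sit on the matching box
  geom_point(aes(y = MeanCost),
             position = position_dodge(width = 0.5), shape = 4) +
  # Named vector: each status level is matched to its color by name
  scale_fill_manual(values = c(Dead = "blue", Alive = "red", Comatose = "green"))

p_colored
```

Using a named vector keeps the color assignment independent of the factor level order, so reordering the boxes later does not swap the colors.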