Boxplot with Summarized and Grouped Data in R

I have the following pre-summarized cost data:

MeanCost Std MedianCost LowerIQR UpperIQR StatusGroup AgeGroup
700 500 650 510 780 Dead Young
800 600 810 666 1000 Alive Young
500 200 657 450 890 Comatose Young
300 400 560 467 670 Dead Old
570 600 500 450 600 Alive Old
555 500 677 475 780 Comatose Old
333 455 300 200 400 Dead Middle
678 256 600 445 787 Alive Middle
1500 877 980 870 1200 Comatose Middle

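I would like to create a box plot from this information, similar to the figure below. Each color represents a status group (blue = dead, red = alive, green = comatose), and each cluster of boxes represents an age group (left cluster = young, middle cluster = middle, right cluster = old).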

I know I don't have the minimum and maximum values, so whiskers are not needed.
I would like to write this in R; any help is much appreciated! Thanks.

Here is the code I have tried:

 dattest<- data.frame(
  Mean_Cost = c(700,800,500,300,570,555,333,678,1500), 
  Std = c(500,600,200,400,600,500,455,256,877), 
  Median_Cost = c(650,810,657,560,500,677,300,600,980), 
  LowerIQR = c(510,666,450,467,450,475,200,445,870), 
  UpperIQR = c(780,1000,890,670,600,780,400,787,1200), 
  StatusGroup = c(1,2,3,1,2,3,1,2,3),
  AgeGroup = c(1,1,1,2,2,2,3,3,3))

where for StatusGroup 1 = dead, 2 = alive, 3 = comatose
and for AgeGroup 1 = young, 2 = old, 3 = middle
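
As a side note, the numeric codes above can be turned into labelled factors so that ggplot2 shows the group names directly. This is only a minimal sketch in base R; the vector names `status_codes`, `age_codes`, `status`, and `age` are made up for illustration, and the label spellings follow the data table at the top of the question.

```r
# Hypothetical vectors holding the numeric codes from dattest
status_codes <- c(1, 2, 3, 1, 2, 3, 1, 2, 3)
age_codes    <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)

# 1 = dead, 2 = alive, 3 = comatose
status <- factor(status_codes, levels = 1:3,
                 labels = c("Dead", "Alive", "Comatose"))

# 1 = young, 2 = old, 3 = middle; the levels are listed in display
# order (Young, Middle, Old), so code 3 comes before code 2
age <- factor(age_codes, levels = c(1, 3, 2),
              labels = c("Young", "Middle", "Old"))
```

Listing the levels in display order is what puts "Middle" between "Young" and "Old" on the x axis, even though its code is 3.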

 ggplot(dattest, aes(xmin = AgeGroup-.25, xmax=AgeGroup+.25, ymin=LowerIQR, ymax=UpperIQR)) + 
    geom_rect(fill="transparent", col = "blue") + 
    geom_segment(aes(y=Median_Cost, yend=Median_Cost, x=AgeGroup-.25, xend=AgeGroup+.25), col="blue") + 
    geom_point(mapping=aes(x = StatusGroup, y = Mean_Cost), col="red") +
    scale_x_continuous(breaks=1:3, labels=c("Young","Old","Middle")) + 
    theme_classic()

And this code is definitely not what I want.

Is this what you are trying to do?

library(tidyverse)
df <- tibble::tribble(
  ~MeanCost, ~Std, ~MedianCost, ~LowerIQR, ~UpperIQR, ~StatusGroup, ~AgeGroup,
       700L, 500L,        650L,      510L,      780L,       "Dead",   "Young",
       800L, 600L,        810L,      666L,     1000L,      "Alive",   "Young",
       500L, 200L,        657L,      450L,      890L,   "Comatose",   "Young",
       300L, 400L,        560L,      467L,      670L,       "Dead",     "Old",
       570L, 600L,        500L,      450L,      600L,      "Alive",     "Old",
       555L, 500L,        677L,      475L,      780L,   "Comatose",     "Old",
       333L, 455L,        300L,      200L,      400L,       "Dead",  "Middle",
       678L, 256L,        600L,      445L,      787L,      "Alive",  "Middle",
      1500L, 877L,        980L,      870L,     1200L,   "Comatose",  "Middle"
  )

df %>% 
  mutate(AgeGroup = factor(AgeGroup, levels = c("Young", "Middle", "Old"))) %>% 
  ggplot(aes(x = AgeGroup, fill = StatusGroup)) +
  geom_boxplot(aes(
    lower = LowerIQR, 
    upper = UpperIQR, 
    middle = MedianCost, 
    ymin = MedianCost - Std, 
    ymax = MedianCost + Std),
    stat = "identity", width = 0.5)
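
Since the question says whiskers are not needed, one possible variation is to collapse them by setting `ymin` and `ymax` to the box edges. The following is a sketch under that assumption, not part of the code above: the data frame name `df_box` is made up for illustration, the values are copied from the question, and the fill colors follow the blue/red/green scheme the question describes.

```r
library(ggplot2)

# Summary values copied from the question (df_box is a hypothetical name)
df_box <- data.frame(
  MedianCost  = c(650, 810, 657, 560, 500, 677, 300, 600, 980),
  LowerIQR    = c(510, 666, 450, 467, 450, 475, 200, 445, 870),
  UpperIQR    = c(780, 1000, 890, 670, 600, 780, 400, 787, 1200),
  StatusGroup = rep(c("Dead", "Alive", "Comatose"), times = 3),
  AgeGroup    = factor(rep(c("Young", "Old", "Middle"), each = 3),
                       levels = c("Young", "Middle", "Old"))
)

# ymin/ymax equal to the box edges gives zero-length whiskers
p <- ggplot(df_box, aes(x = AgeGroup, fill = StatusGroup)) +
  geom_boxplot(aes(lower = LowerIQR, upper = UpperIQR, middle = MedianCost,
                   ymin = LowerIQR, ymax = UpperIQR),
               stat = "identity", width = 0.5) +
  # Colors as described in the question: blue = dead, red = alive, green = comatose
  scale_fill_manual(values = c(Dead = "blue", Alive = "red", Comatose = "green"))

p
```

With `stat = "identity"`, `geom_boxplot()` draws exactly the five values it is given, so repeating the quartiles for `ymin` and `ymax` leaves only the box and the median line.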

Edit

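To add an "x" at the mean, you can add a point layer and dodge it by the same width as the boxes so each mark lines up with its box: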

df %>% 
  mutate(AgeGroup = factor(AgeGroup, levels = c("Young", "Middle", "Old"))) %>% 
  ggplot(aes(x = AgeGroup, fill = StatusGroup)) +
  geom_boxplot(aes(
    lower = LowerIQR, 
    upper = UpperIQR, 
    middle = MedianCost, 
    ymin = MedianCost - Std, 
    ymax = MedianCost + Std),
    stat = "identity", width = 0.5) +
  geom_point(aes(y = MeanCost),
             position = position_dodge(width = 0.5),
             shape = 4)