Why is adding `position = "dodge"` to my `geom_bar` leading to values being displayed incorrectly?

I have a data frame:

df <- data.frame(human = c(1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5),
                 stage = c("A1", "A2", "A3", "A4", "A1", "A2", "A3", "A4", "A1", "A2", "A3", "A4", "A1", "A2", "A3", "A4", "A1", "A2", "A3", "A4", "A1", "A2", "A3", "A4", "A1", "A2", "A3", "A4", "A1", "A2", "A3", "A4", "A1", "A2", "A3", "A4", "A1", "A2", "A3", "A4", "A1", "A2", "A3", "A4", "A1", "A2", "A3", "A4", "A1", "A2", "A3", "A4", "A1", "A2", "A3", "A4", "A1", "A2", "A3", "A4", "A1", "A2", "A3", "A4", "A1", "A2", "A3", "A4", "A1", "A2", "A3", "A4", "A1", "A2", "A3", "A4", "A1", "A2", "A3", "A4"),
                 class = c(0,1,0,0,0,1,0,1,1,1,0,1,0,0,0,1,0,1,1,1,0,1,0,0,0,1,0,1,1,1,0,1,0,0,0,1,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0)
)

and I want a bar chart with each stage on the x-axis:

ggplot(df, aes(x = stage, y = class, fill = as.factor(human))) + geom_bar(stat = "identity") + scale_y_continuous(limits = c(0,15))

This looks fine, but I want the bars for each human placed side by side, so I added `position = "dodge"`:

ggplot(df, aes(x = stage, y = class, fill = as.factor(human))) + geom_bar(stat = "identity", position= "dodge") + scale_y_continuous(limits = c(0,15))

The columns are now side by side, but for some reason every bar shows class = 1.

That is because you are using `stat = "identity"`, so you have to count the rows yourself first.

library(tidyverse)
df %>%
  # keep only the class == 1 rows so that n matches the stacked bar height
  filter(class == 1) %>%
  count(stage, human) %>%
  ggplot(aes(x = stage, y = n, fill = as.factor(human))) + 
  geom_bar(stat = "identity", position = "dodge")

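This is because your "identities" are all 0 or 1. With `position = "dodge"` the rows for the same stage and human are no longer stacked but drawn over one another, so no bar is taller than 1. One way to handle this is to summarise your data before plotting. For example: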

library(tidyverse)

df %>% 
    group_by(human, stage) %>% 
    summarise(class = sum(class)) %>% 
    ggplot(aes(x = stage, y = class, fill = as.factor(human))) + 
    geom_bar(stat = "identity", position= "dodge")
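
To see why the dodged bars stop at 1, it may help to compare the largest value with the total in each human/stage group. The following is a minimal base R sketch on a made-up data frame (`toy` is hypothetical and is not the data from the question). It assumes that the dodged, unsummarised plot effectively shows the per-group maximum, while the stacked plot shows the per-group sum:

```r
# Hypothetical toy data: two humans, one stage, three 0/1 rows each
toy <- data.frame(
  human = c(1, 1, 1, 2, 2, 2),
  stage = "A1",
  class = c(1, 0, 1, 1, 1, 1)
)

# Height visible when the rows are overplotted (dodged, not summarised)
tallest <- aggregate(class ~ human + stage, data = toy, FUN = max)

# Height of the stacked bar, which is what summarising first reproduces
totals <- aggregate(class ~ human + stage, data = toy, FUN = sum)

print(tallest)  # class is 1 for both humans
print(totals)   # class is 2 for human 1 and 3 for human 2
```

Summarising to one row per human and stage, as in the answer above, removes the overplotting, so the dodged bars can show the totals.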

A solution that avoids the dplyr preprocessing by using `stat_summary`:

ggplot(df, aes(x = stage, 
               y = class, 
               fill = as.factor(human))) + 
  stat_summary(geom = "bar", 
               position = "dodge", 
               fun = "sum")  # `fun` replaces `fun.y`, deprecated since ggplot2 3.3.0