Define last bar of distribution plot as more than all values before

I want to plot the distribution of the total number of wins per player. The last bin on the x-axis should be a "more than all the values before" category.

Sample data:

game_data <- data.frame(player = c(1,2,3,4,5, 6), n_wins = c(1,8,2,3,6,4))

game_data
  player n_wins
1      1      1
2      2      8
3      3      2
4      4      3
5      5      6
6      6      4

The code below creates a category "NA", but I want it to be 5+ (= more than 5 wins).

game_data %>% group_by(player) %>% summarise(allwins = sum(n_wins)) %>%
  ggplot(aes(x = cut(allwins, breaks = seq(1,6, by = 1)), include.lowest=TRUE)) + 
  geom_bar(aes(y = (..count..)/sum(..count..))) + 
  scale_y_continuous(labels=scales::percent) +
  labs(title="Distribution of Wins", subtitle="", y="Fraction of Players", x="Number of Wins")

I don't just want to change the label; I also want the last category to be created automatically.

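You can do this by including +Inf as the last break. Note that no player has exactly 5 wins, so to keep that empty category on the axis you need scale_x_discrete with drop=FALSE: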
library(dplyr)
library(ggplot2)

game_data <- data.frame(player = c(1, 2, 3, 4, 5, 6), n_wins = c(1, 8, 2, 3, 6, 4))

# Open-ended last interval: (5, Inf] catches every total above 5
BR <- c(0:5, Inf)

game_data %>%
  group_by(player) %>%
  summarise(allwins = sum(n_wins)) %>%
  ggplot(aes(x = cut(allwins, breaks = BR, labels = c(1:5, "5+")))) +
  geom_bar(aes(y = after_stat(count / sum(count)))) +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Distribution of Wins", subtitle = "",
       y = "Fraction of Players", x = "Number of Wins") +
  # drop = FALSE keeps the empty "5" category on the axis
  scale_x_discrete(drop = FALSE)
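
To see why the code in the question produced an NA category and what the Inf break changes, cut() can be checked on its own, without plotting. This is a minimal sketch using only base R; the variable names (wins, binned_closed, binned_open, counts) are made up for illustration.

```r
wins <- c(1, 8, 2, 3, 6, 4)

# Breaks 1..6 give the intervals (1,2], ..., (5,6].
# Both 1 and 8 fall outside them, so cut() returns NA for those values.
binned_closed <- cut(wins, breaks = seq(1, 6, by = 1))

# Breaks 0..5 plus Inf: the last interval (5, Inf] absorbs every value above 5.
binned_open <- cut(wins, breaks = c(0:5, Inf), labels = c(1:5, "5+"))

# table() reports every factor level, including the empty "5" level.
counts <- table(binned_open, useNA = "ifany")
print(counts)
```

The empty "5" level still exists in the factor, but ggplot2 drops unused levels from a discrete axis by default, which is why scale_x_discrete(drop = FALSE) is needed in the plot.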

Maybe a minor point, but why summarise the data at all?