Define the last bar of a distribution plot as "more than all values before"
I want to plot the distribution of players' total number of wins, with the last bin on the x-axis acting as a "more than the values before" category.
Example data:
game_data <- data.frame(player = c(1,2,3,4,5, 6), n_wins = c(1,8,2,3,6,4))
game_data
  player n_wins
1      1      1
2      2      8
3      3      2
4      4      3
5      5      6
6      6      4
The code below creates an "NA" category, but I want it to be 5+ (= more than 5 wins).
game_data %>% group_by(player) %>% summarise(allwins = sum(n_wins)) %>%
ggplot(aes(x = cut(allwins, breaks = seq(1,6, by = 1)), include.lowest=TRUE)) +
geom_bar(aes(y = (..count..)/sum(..count..))) +
scale_y_continuous(labels=scales::percent) +
labs(title="Distribution of Wins", subtitle="", y="Fraction of Players", x="Number of Wins")
I don't just want to change the label; I also want the last category to be created automatically.
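You can do this by including +Inf as the last break. Note that no player in your data has exactly 5 wins, so that level is empty and would be dropped from the axis; to keep it, add drop=FALSE via scale_x_discrete: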
library(dplyr)
library(ggplot2)

game_data <- data.frame(player = c(1, 2, 3, 4, 5, 6), n_wins = c(1, 8, 2, 3, 6, 4))
# Bins (0,1], (1,2], ..., (4,5], (5,Inf]: the open-ended last bin catches everything above 5
BR <- c(0:5, +Inf)
game_data %>%
  group_by(player) %>% summarise(allwins = sum(n_wins)) %>%
  ggplot(aes(x = cut(allwins, breaks = BR, labels = c(1:5, "5+")))) +
  geom_bar(aes(y = (..count..)/sum(..count..))) +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Distribution of Wins", subtitle = "",
       y = "Fraction of Players", x = "Number of Wins") +
  scale_x_discrete(drop = FALSE)  # keep the empty "5" level on the axis
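To see what the binning does independently of the plot, here is a minimal base-R sketch. The helper name `cut_with_overflow` and its `cap` argument are hypothetical, introduced only for illustration; it assumes the values are positive integers (a value of 0 would fall outside the first bin and become NA).

```r
# Hypothetical helper (not part of the original answer): bin counts 1..cap
# individually and lump everything above `cap` into one open-ended level.
cut_with_overflow <- function(x, cap) {
  breaks <- c(0:cap, Inf)  # (0,1], (1,2], ..., (cap-1,cap], (cap,Inf]
  labels <- c(as.character(seq_len(cap)), paste0(cap, "+"))
  cut(x, breaks = breaks, labels = labels)
}

wins <- c(1, 8, 2, 3, 6, 4)  # n_wins from the example data
bins <- cut_with_overflow(wins, cap = 5)

# Frequency table of the bins; empty levels are kept because `bins` is a factor
table(bins)
```

With the example data, levels 1 to 4 each get one player, level 5 gets none, and the two players with 6 and 8 wins land in "5+". The empty level 5 is the one that `drop = FALSE` keeps on the axis in the plot above.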
Perhaps a minor point, but why summarise the data at all?