How to add percentages to bar chart facets in ggplot2 in R?

I want to use ggplot2 to draw a chart in which bar charts show the degree (bars) held by people in each urban/rural setting (facets). I have managed that part.

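Now I want to add, within each facet, the proportion of people holding each qualification. With the code below, what I get is the percentage of the whole population.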

How can I change the code so that the percentages are calculated within each facet?

Here is a 1,000-row sample of the dataset I am using: link

library(ggplot2)
library(scales)

# plot urban/rural by degree in facets
myplot <- ggplot(data = si, aes(DEGREE))
myplot <- myplot + geom_bar()
myplot <- myplot + labs(title = "Degree by Urban/Rural", y = "Percent", x = "DEGREE")
myplot <- myplot + geom_text(aes(y = ((..count..)/sum(..count..)), label = scales::percent((..count..)/sum(..count..))), stat = "count", vjust = -0.25)
myplot <- myplot + facet_wrap(~URBRURAL)
myplot <- myplot + theme(axis.text.x = element_text(angle = 20, hjust = 1))
myplot

I think this works:

library(ggplot2)

si <- read.csv('sampledata.csv', sep = ' ')
myplot <- ggplot(data = si, aes(DEGREE))
myplot <- myplot + geom_bar()
myplot <- myplot + labs(title = "Degree by Urban/Rural", y = "Percent", x = "DEGREE")
# divide each count by the total of its own facet (..PANEL..) instead of the grand total
myplot <- myplot + geom_text(
  aes(y = (..count..) / tapply(..count.., ..PANEL.., sum)[..PANEL..],
      label = scales::percent((..count..) / tapply(..count.., ..PANEL.., sum)[..PANEL..])),
  stat = "count", vjust = -0.25)
myplot <- myplot + facet_wrap(~URBRURAL)
myplot <- myplot + theme(axis.text.x = element_text(angle = 20, hjust = 1))
myplot
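
For newer ggplot2 releases (3.3.0 and later), where the `..count..` notation has been superseded by `after_stat()`, the same per-facet division might be sketched as follows. This is only an illustration under assumptions: `toy` is a made-up stand-in for `si` with the same column names, `ave()` replaces the `tapply(...)[...]` indexing, and the labels are placed at the bar tops rather than at the proportion value.

```r
library(ggplot2)

# Hypothetical toy data standing in for `si` (same column names as the question)
toy <- data.frame(
  URBRURAL = c(rep("Town", 4), rep("Farm", 6)),
  DEGREE   = c("None", "None", "Lowest", "University",
               "None", "Lowest", "Lowest", "Lowest", "University", "University")
)

p <- ggplot(toy, aes(DEGREE)) +
  geom_bar() +
  geom_text(
    aes(
      # put the label at the top of each bar (the y-axis stays in counts)
      y = after_stat(count),
      # share of each bar within its own facet: count / facet total
      label = after_stat(
        scales::percent(count / ave(count, PANEL, FUN = sum), accuracy = 0.1)
      )
    ),
    stat = "count", vjust = -0.25
  ) +
  facet_wrap(~URBRURAL)

p
```

`ave(count, PANEL, FUN = sum)` returns, for every bar, the total count of the facet that bar belongs to, so the labels in each facet add up to roughly 100%.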

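Note that the y-axis values are not percentages but raw counts, just as in the original plot; only the labels on the bars are percentages. See row 18 below: 45 is not a percentage but the actual count for that group in the sample data you provided, while the 15.7% on the same bar in the corresponding facet is the percentage.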

library(dplyr)
as.data.frame(si %>% group_by(URBRURAL, DEGREE) %>% summarise(n=n()))

                                   URBRURAL                                            DEGREE  n
1  Country village, other type of community Above higher secondary level, other qualification  6
2  Country village, other type of community                        Above lowest qualification 16
3  Country village, other type of community                        Higher secondary completed  9
4  Country village, other type of community                       Lowest formal qualification 31
5  Country village, other type of community                           No formal qualification 20
6  Country village, other type of community                       University degree completed  1
7               Farm or home in the country                        Above lowest qualification  1
8               Farm or home in the country                        Higher secondary completed  1
9               Farm or home in the country                       Lowest formal qualification  5
10              Farm or home in the country                           No formal qualification  1
11              Farm or home in the country                       University degree completed  1
12           Suburb, outskirt of a big city Above higher secondary level, other qualification 45
13           Suburb, outskirt of a big city                        Above lowest qualification 57
14           Suburb, outskirt of a big city                        Higher secondary completed 75
15           Suburb, outskirt of a big city                       Lowest formal qualification 48
16           Suburb, outskirt of a big city                           No formal qualification 23
17           Suburb, outskirt of a big city                       University degree completed 15
18                       Town or small city Above higher secondary level, other qualification 45
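
As a quick cross-check of counts against within-facet percentages, base R's `table()` and `prop.table()` can be used. The sketch below runs on a small made-up data frame (`toy` is hypothetical, not the real `si` data), so the numbers it prints are not those of the plot.

```r
# Hypothetical toy data standing in for `si` (same column names as the question)
toy <- data.frame(
  URBRURAL = c(rep("Town", 4), rep("Farm", 6)),
  DEGREE   = c("None", "None", "Lowest", "University",
               "None", "Lowest", "Lowest", "Lowest", "University", "University")
)

# raw counts: one row per URBRURAL value, one column per DEGREE value
counts <- table(toy$URBRURAL, toy$DEGREE)

# margin = 1 divides every cell by its row total, i.e. by the facet total
shares <- prop.table(counts, margin = 1)

counts
round(100 * shares, 1)
```

With `margin = 1` each row of `shares` sums to 1, which is the same normalisation the per-facet labels need.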

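You can always transform the data before plotting it to compute exactly what you want. I also added a few tweaks (labels on top of the bars, string wrapping on the x-axis, axis limits and labels).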

library(dplyr)
library(ggplot2)
library(stringr)

# `si` is the data frame from the question.
# tally() counts each URBRURAL/DEGREE combination and drops the DEGREE grouping,
# so sum(n) below is the total within each URBRURAL facet.
plot_data <- si %>% 
  group_by(URBRURAL, DEGREE) %>% 
  tally() %>% 
  mutate(percent = n / sum(n))

ggplot(plot_data, aes(x = DEGREE, y = percent)) +
  geom_bar(stat = "identity") +
  geom_text(aes(label = scales::percent(percent)), vjust = -0.5) +
  labs(title = "Degree by Urban/Rural", y = "Percent", x = "DEGREE") +
  scale_y_continuous(labels = scales::percent, limits = c(0, 1)) +
  scale_x_discrete(labels = function(x) str_wrap(x, 10)) +
  facet_wrap(~URBRURAL)
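
The reason `sum(n)` gives a per-facet total here is that `tally()` removes only the last grouping variable, leaving the result grouped by `URBRURAL`. A minimal sketch to confirm this, assuming a made-up data frame `toy` in place of the real data:

```r
library(dplyr)

# Hypothetical toy data standing in for the real data set
toy <- data.frame(
  URBRURAL = c(rep("Town", 4), rep("Farm", 6)),
  DEGREE   = c("None", "None", "Lowest", "University",
               "None", "Lowest", "Lowest", "Lowest", "University", "University")
)

counted <- toy %>%
  group_by(URBRURAL, DEGREE) %>%
  tally()

# grouping that is still in place after tally()
group_vars(counted)

# because the data are still grouped by URBRURAL, sum(n) is the facet total
# and the shares add up to 1 within each facet
check <- counted %>%
  mutate(percent = n / sum(n)) %>%
  summarise(total = sum(percent))

check
```

If the grouping had been dropped completely, `sum(n)` would be the grand total and the labels would again be percentages of the whole population.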