R: stacked percentage frequency histogram with percentages based on aggregated data

I believe my question is very similar to an earlier one. The only difference is that my aes fill is a factor with multiple levels. What I am after is a stacked percentage frequency histogram.

This is how far I have got:

set.seed(123)
n = 100

# Simulated data: one loan status and one Prosper score per loan
LoanStatus = sample(c('Chargedoff', 'Completed', 'Current', 'Defaulted', 'PastDue'), n, replace = TRUE)
ProsperScore = sample(1:11, n, replace = TRUE)

df = data.frame(ProsperScore, LoanStatus)

# Share of each LoanStatus within each ProsperScore (margin 1 = row-wise proportions)
probs = data.frame(prop.table(table(df), 1))
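To make the aggregation step easier to follow, here is a minimal sketch on a tiny made-up data set. The data frame `toy` and its columns `score` and `status` are hypothetical and only stand in for `df`; the calls themselves are base R.

```r
# Hypothetical toy data, not the loan data from the question
toy <- data.frame(
  score  = c(1, 1, 1, 2, 2),
  status = c("Current", "Current", "PastDue", "Current", "PastDue")
)

counts <- table(toy)                      # rows: score, columns: status
shares <- prop.table(counts, margin = 1)  # divide each row by its row total

row_totals <- rowSums(shares)             # each score's shares add up to 1
long <- as.data.frame(shares)             # long format: score, status, Freq
print(long)
```

The long format with a `Freq` column is the shape that `geom_bar(stat="identity")` expects, which is why `probs` is built this way.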

The code for the stacked bar chart could look like this:

library(ggplot2)

brks <- c(0, 0.25, 0.5, 0.75, 1)

ggplot(data=probs,aes(x=ProsperScore,y=Freq,fill=LoanStatus)) +
  geom_bar(stat="identity") +
  scale_y_continuous(breaks = brks, labels = scales::percent(brks)) +
  scale_x_discrete(breaks = c(3,6,9))
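As a possible alternative, ggplot2 can compute the within-bar proportions itself from the raw rows with `position = "fill"`, so no pre-aggregated `probs` table is needed. This is only a sketch under the assumption that the unaggregated data frame is available; the data below is re-simulated for the example and is not the original poster's data.

```r
library(ggplot2)

# Re-simulated raw data (an assumption for this sketch)
set.seed(123)
n <- 100
df <- data.frame(
  ProsperScore = sample(1:11, n, replace = TRUE),
  LoanStatus   = sample(c("Chargedoff", "Completed", "Current",
                          "Defaulted", "PastDue"), n, replace = TRUE)
)

# position = "fill" rescales every bar to a total height of 1
p <- ggplot(df, aes(x = factor(ProsperScore), fill = LoanStatus)) +
  geom_bar(position = "fill") +
  scale_y_continuous(labels = scales::percent) +
  labs(x = "ProsperScore", y = "Share of loans")
print(p)
```

The pre-aggregated approach is still the more convenient one when the percentages are also needed as text labels.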

Here is more complete code that shows how to add the percentages to the plot:

library(ggplot2)
library(dplyr)  # provides %>%, group_by() and mutate()

brks <- c(0, 0.25, 0.5, 0.75, 1)

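# Label position: the vertical midpoint of each segment within its bar
probs <- probs %>% dplyr::group_by(ProsperScore) %>%
  dplyr::mutate(pos=cumsum(Freq)-(Freq*0.5)) %>%
  dplyr::mutate(pos=ifelse(Freq==0,NA,pos))  # no label for empty segments

# Reverse the levels so the stacking order matches the cumulative sums above
# (ggplot2 2.2.0 and later put the first level at the top of the stack)
probs$LoanStatus <- factor(probs$LoanStatus, levels = rev(levels(probs$LoanStatus)))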

ggplot(data=probs,aes(x=ProsperScore,y=Freq,fill=LoanStatus)) +
  geom_bar(stat="identity") +
  scale_y_continuous(breaks = brks, labels = scales::percent(brks)) +
  scale_x_discrete(breaks = c(3,6,9)) +
  geom_text(data=probs, aes(x = ProsperScore, y = pos,
                                  label = paste0(round(100*Freq),"%")), size=2)
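If computing `pos` by hand feels fragile, a possible alternative is to let ggplot2 place the labels with `position_stack(vjust = 0.5)`, which centres each label in its segment and follows the same stacking order as the bars. This is a sketch, not the answer's original method; the data is re-simulated here so the block runs on its own.

```r
library(ggplot2)

# Re-simulated data and row-wise proportions (an assumption for this sketch)
set.seed(123)
n <- 100
df <- data.frame(
  ProsperScore = sample(1:11, n, replace = TRUE),
  LoanStatus   = sample(c("Chargedoff", "Completed", "Current",
                          "Defaulted", "PastDue"), n, replace = TRUE)
)
probs <- data.frame(prop.table(table(df), 1))

p <- ggplot(probs, aes(x = ProsperScore, y = Freq, fill = LoanStatus)) +
  geom_col() +
  # Empty segments get an empty label; vjust = 0.5 centres the text vertically
  geom_text(aes(label = ifelse(Freq > 0, paste0(round(100 * Freq), "%"), "")),
            position = position_stack(vjust = 0.5), size = 2) +
  scale_y_continuous(labels = scales::percent)
print(p)
```

With this variant there is no need to reverse the factor levels, because bars and labels are stacked by the same rule.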

To show the percentages only in the first column of the chart, add %>% dplyr::mutate(pos=ifelse(ProsperScore==1,pos,NA)) to the dplyr call.
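Put together, the dplyr call with that extra step might look like the sketch below. The result name `labelled` is hypothetical, and the data is re-simulated so the block is self-contained.

```r
library(dplyr)

# Re-simulated data and row-wise proportions (an assumption for this sketch)
set.seed(123)
n <- 100
df <- data.frame(
  ProsperScore = sample(1:11, n, replace = TRUE),
  LoanStatus   = sample(c("Chargedoff", "Completed", "Current",
                          "Defaulted", "PastDue"), n, replace = TRUE)
)
probs <- data.frame(prop.table(table(df), 1))

labelled <- probs %>%
  group_by(ProsperScore) %>%
  mutate(pos = cumsum(Freq) - 0.5 * Freq) %>%       # midpoint of each segment
  mutate(pos = ifelse(Freq == 0, NA, pos)) %>%      # skip empty segments
  # ProsperScore is a factor here; comparing it with 1 matches the level "1"
  mutate(pos = ifelse(ProsperScore == 1, pos, NA)) %>%
  ungroup()
```

Rows whose `pos` is NA are dropped by `geom_text()` with a warning, so only the first bar ends up labelled.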