Multigroup frequency with ggplot

I am trying to replicate this histogram in R.

Here is how to simulate my dataset:

    dft <- data.frame(
      menutype = sample(c(1, 2, 4, 5, 6, 8, 12), 120, replace = TRUE),
      Belief = sample(c(0, 1), 120, replace = TRUE),
      Choice = sample(c(0, 1), 120, replace = TRUE)
    )

Here is my code:

    library(ggplot2)
    library(dplyr)
    library(tidyr)
    library(MASS)


    # Recode the simulated data (dft); only level 1 of Belief and Choice is kept,
    # so every 0 becomes NA
    df <- data.frame(
      menutype = factor(dft$menutype, levels = c(1, 2, 4, 5, 6, 8, 12),
                        labels = c("GUILT", "SSB0", "SSB1", "FLEX0", "FLEX1", "STD", "FLEX01")),
      Belief = factor(dft$Belief, levels = 1, labels = "Believe Learn"),  # interested only in this condition
      Choice = factor(dft$Choice, levels = 1, labels = "Learn")           # same here
    )


    df1 <- rbind(na.omit(df %>%
                           count(Belief, menutype) %>%
                           group_by(menutype) %>% 
                           mutate(prop = n / sum(n))),
                 na.omit(df %>%
                           count(Choice, menutype) %>%
                           group_by(menutype) %>% 
                           mutate(prop = n / sum(n))))



    test <- paste(df1$Belief[1:6], paste(df1$Choice[7:13]))
    test[1:6] <- paste(df1$Belief[1:6])
    test[7:13] <- paste(df1$Choice[7:13])

    df1$combine <- paste(test)

    ggplot(data = df1, aes(menutype, prop, fill = combine)) + 
      labs(title = "Classification based on rank ordering\n", x = "", y = "Fraction of subjects", fill = "\n") +
      geom_bar(stat = "identity", position = "dodge")+
      theme_bw() +
      theme(legend.position = "bottom", plot.title = element_text(hjust = 0.5)) # centers the main title
    # + geom_text(aes(label = "ok"), vjust = -0.3, size = 3.5)

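The problem is that it more or less works: I am almost getting the graph I want, but it is a workaround and there are still some errors. For example, both STD bars show the same value (0.10), whereas they should be 0 and 0.10 as in the original graph.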

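What I would most like to do is have two separate data frames, one for menutype and Belief and another for menutype and Choice. Then, as I did above, I would compute the proportion of a specific modality of each of the latter variables within each menutype, and finally plot it as a histogram like the graph in the original study. I would also like to show the proportion as a fraction above each bar, but that is optional.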

Can someone help me with this? I am really struggling to get it to work.

Thanks in advance!

Edit: I think the problem is with fill =. I would like to specify, for each bar, the variable I want (e.g., fill = df2$Belief & df2$Choice), but I don't know how to proceed.

    library(tidyverse)

    set.seed(10)

    # example data frame
    df <- data.frame(
      menutype = sample(c(1, 2, 4, 5, 6, 8, 12), 120, replace = TRUE),
      Belief = sample(c(0, 1), 120, replace = TRUE),
      Choice = sample(c(0, 1), 120, replace = TRUE)
    )

    # calculate all metrics based on all variables you want to plot in a tidy way
    # (the result of count() is still grouped by Choice, so prop is a share within each Choice level)
    df_plot <- df %>%
      group_by(Choice) %>%
      count(menutype, Belief) %>%
      mutate(prop = n / sum(n),
             prop_text = paste0(n, "/", sum(n))) %>%
      ungroup()

    # barplots using one variable and split plots using another variable
    df_plot %>%
      mutate(Belief = factor(Belief),
             menutype = factor(menutype)) %>%
      ggplot(aes(menutype, prop, fill = Belief)) +
      geom_col(position = "dodge") +
      facet_wrap(~Choice, ncol = 1) +
      # dodge width 0.9 matches the default bar dodge, so labels sit over their bars
      geom_text(aes(label = prop_text), position = position_dodge(0.9), vjust = -0.5) +
      ylim(0, 0.2)
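
If the goal is instead one pair of bars per menutype (the share of Belief == 1 next to the share of Choice == 1, as the question describes), a long-format variant may be closer. The following is only a sketch under assumptions: it uses the simulated `dft` from the question, and the names `df_long`, `k`, `total` and `menu_labels` are made up for illustration. Because the shares are summarised rather than obtained by filtering and `na.omit()`, a share of exactly 0 stays in the data as its own row. That is a likely reason the STD bars looked identical in the question: when one group is missing for a menutype, the remaining dodged bar fills the whole slot.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

set.seed(10)

# Simulated data, as in the question
dft <- data.frame(
  menutype = sample(c(1, 2, 4, 5, 6, 8, 12), 120, replace = TRUE),
  Belief   = sample(c(0, 1), 120, replace = TRUE),
  Choice   = sample(c(0, 1), 120, replace = TRUE)
)

# Labels taken from the question's factor() call
menu_labels <- c("GUILT", "SSB0", "SSB1", "FLEX0", "FLEX1", "STD", "FLEX01")

# One row per (subject, variable), then the share of 1s within each menutype.
# Summarising keeps shares of 0 as explicit rows instead of dropping them.
df_long <- dft %>%
  mutate(menutype = factor(menutype, levels = c(1, 2, 4, 5, 6, 8, 12),
                           labels = menu_labels)) %>%
  pivot_longer(c(Belief, Choice), names_to = "variable", values_to = "value") %>%
  group_by(menutype, variable) %>%
  summarise(k = sum(value == 1), total = n(), .groups = "drop") %>%
  mutate(prop = k / total,
         prop_text = paste0(k, "/", total),
         variable = if_else(variable == "Belief", "Believe Learn", "Learn"))

# fill maps to the variable name, so each menutype gets one bar per variable
p <- ggplot(df_long, aes(menutype, prop, fill = variable)) +
  geom_col(position = position_dodge(width = 0.9)) +
  geom_text(aes(label = prop_text, group = variable),
            position = position_dodge(width = 0.9), vjust = -0.3, size = 3.5) +
  labs(title = "Classification based on rank ordering", x = "",
       y = "Fraction of subjects", fill = "") +
  theme_bw() +
  theme(legend.position = "bottom", plot.title = element_text(hjust = 0.5))

print(p)
```

Here `fill = variable` plays the role of the `fill = df2$Belief & df2$Choice` idea from the question's edit: both variables are stacked into one column first, and that column drives the fill.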