Order bars by difference between variables

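I want to plot a bar chart that shows the variables "HH_FIN_EX" and "ACT_IND_CON_EXP", with the countries ordered ascending by the variable diff. diff itself should not appear in the chart.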

library(eurostat)
library(tidyverse)

#getting the data
data1 <- get_eurostat("nama_10_gdp",time_format = "num")

#filtering
data_1_4 <- data1 %>% 
        filter(time=="2016", 
               na_item %in% c("B1GQ", "P31_S14_S15", "P41"), 
               geo %in% c("BE","BG","CZ","DK","DE","EE","IE","EL","ES","FR","HR","IT","CY","LV","LT","LU","HU","MT","NL","AT","PL","PT","RO","SI","SK","FI","SE","UK"), 
               unit=="CP_MEUR")%>% select(-unit, -time)

#transformations and calculations
data_1_4 <- data_1_4 %>% 
        spread(na_item, values)%>% 
        na.omit() %>% 
        mutate(HH_FIN_EX = P31_S14_S15/B1GQ, ACT_IND_CON_EXP=P41/B1GQ, diff=ACT_IND_CON_EXP-HH_FIN_EX) %>%
        gather(na_item, values,  2:7)%>%
        filter(na_item %in% c("HH_FIN_EX", "ACT_IND_CON_EXP", "diff")) 
#plotting
ggplot(data=data_1_4, aes(x=reorder(geo, values), y=values, fill=na_item))+
        geom_bar(stat="identity", position=position_dodge(), colour="black")+
        labs(title="", x="Countries", y="As percentage of GDP")

I would appreciate any suggestions on how to do this, since aes(x=reorder(geo, values[values=="diff"]) results in an error.

Is this what you are looking for?

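# Order geo by its diff value only: non-diff rows contribute 0 to the sum,
# so the default median over all three rows per country is not used
data_1_4 %>% mutate(Val = fct_reorder(geo, values * (na_item == "diff"), .fun = sum)) %>%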
      filter(na_item %in% c("HH_FIN_EX", "ACT_IND_CON_EXP")) %>%
       ggplot(aes(x=Val, y=values, fill=na_item)) +
        geom_bar(stat="identity", position=position_dodge(), colour="black") +
        labs(title="", x="Countries", y="As percentage of GDP")  
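
To see how fct_reorder picks the order, here is a small sketch on made-up numbers (the toy tibble, its country codes and its values are assumptions, not Eurostat data). By default fct_reorder summarises all rows of each level with the median, so ordering by diff alone needs the other rows excluded or zeroed out, for example:

```r
library(dplyr)
library(forcats)

# Made-up long-format data: three rows per country, one of them "diff"
toy <- tibble(
  geo     = rep(c("AA", "BB", "CC"), each = 3),
  na_item = rep(c("HH_FIN_EX", "ACT_IND_CON_EXP", "diff"), times = 3),
  values  = c(0.50, 0.70, 0.20,
              0.60, 0.65, 0.05,
              0.40, 0.50, 0.10)
)

# Default behaviour: levels ordered by the median of all values per country
by_median <- levels(fct_reorder(toy$geo, toy$values))

# Ordered by diff only: rows that are not "diff" add 0 to the per-country sum
by_diff <- levels(
  fct_reorder(toy$geo, toy$values * (toy$na_item == "diff"), .fun = sum)
)
```

On this toy data the two orderings differ, which is why the summary function matters here.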

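You can determine the order you want explicitly (it is stored in country_order below) and force the factor geo to have its levels in that order. Then run ggplot after filtering out the diff variable. So replace your call to ggplot with the following: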

country_order = (data_1_4 %>% filter(na_item == 'diff') %>% arrange(values))$geo
data_1_4$geo = factor(data_1_4$geo, country_order)
ggplot(data=filter(data_1_4, na_item != 'diff'), aes(x=geo, y=values, fill=na_item))+
  geom_bar(stat="identity", position=position_dodge(), colour="black")+
  labs(title="", x="Countries", y="As percentage of GDP") 

Doing this gave me a plot with the countries in that order.
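
The same idea can be checked without downloading anything. This is only a sketch on invented data (the toy tibble and the name plot_df are assumptions for illustration); it shows that the factor levels, not the row order, decide where each country lands on the x axis:

```r
library(dplyr)

# Made-up long-format data: three rows per country, one of them "diff"
toy <- tibble(
  geo     = rep(c("AA", "BB", "CC"), each = 3),
  na_item = rep(c("HH_FIN_EX", "ACT_IND_CON_EXP", "diff"), times = 3),
  values  = c(0.50, 0.70, 0.20,
              0.60, 0.65, 0.05,
              0.40, 0.50, 0.10)
)

# Countries sorted by ascending diff
country_order <- toy %>%
  filter(na_item == "diff") %>%
  arrange(values) %>%
  pull(geo)

# Fix the factor levels to that order, then drop the diff rows
plot_df <- toy %>%
  mutate(geo = factor(geo, levels = country_order)) %>%
  filter(na_item != "diff")

levels(plot_df$geo)
```

The levels survive the filter, so plot_df can go straight into ggplot with x = geo.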

First, when you use gather you should not include diff (your result column); doing so complicates things.
Change the line gather(na_item, values, 2:7) to gather(na_item, values, 2:6).
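
As a rough illustration of what leaving diff out of gather does, here is a sketch on invented wide data (the tibble wide and its numbers are assumptions standing in for the result of spread()). It names the gathered columns instead of using positions, which is a variation on the 2:6 range, not the same call:

```r
library(dplyr)
library(tidyr)

# Made-up wide data standing in for the output of spread()
wide <- tibble(
  geo         = c("AA", "BB", "CC"),
  B1GQ        = c(100, 200, 400),
  P31_S14_S15 = c(50, 120, 160),
  P41         = c(70, 130, 200)
) %>%
  mutate(HH_FIN_EX       = P31_S14_S15 / B1GQ,
         ACT_IND_CON_EXP = P41 / B1GQ,
         diff            = ACT_IND_CON_EXP - HH_FIN_EX)

# Gather only the two plotted ratios; diff stays as a column on every row
long <- wide %>%
  select(geo, HH_FIN_EX, ACT_IND_CON_EXP, diff) %>%
  gather(na_item, values, HH_FIN_EX, ACT_IND_CON_EXP)
```

Each country now has two rows, and both carry that country's diff, so it can be used for sorting without ever being a bar.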

You can use this code to compute the difference and sort the rows by it in descending order (using dplyr::arrange):

# data_1_4 here is the filtered data from the question, before the spread/gather step
plotData <- data_1_4 %>% 
        spread(na_item, values) %>% 
        na.omit() %>% 
        mutate(HH_FIN_EX = P31_S14_S15 / B1GQ, 
               ACT_IND_CON_EXP = P41 / B1GQ, 
               diff = ACT_IND_CON_EXP - HH_FIN_EX) %>%
        gather(na_item, values, 2:6) %>%
        filter(na_item %in% c("HH_FIN_EX", "ACT_IND_CON_EXP")) %>%
        arrange(desc(diff))

And plot it:

ggplot(plotData, aes(geo, values, fill = na_item))+
    geom_bar(stat = "identity", position = "dodge", color = "black") +
    labs(x = "Countries", 
         y = "As percentage of GDP") +
    scale_x_discrete(limits = unique(plotData$geo))  # each geo appears twice, so deduplicate
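
Once diff is kept as its own column, another option is to reorder inside aes() and skip scale_x_discrete altogether. This is only a sketch on invented data (the data frame long and its values are assumptions); base reorder() summarises with the mean by default, which is harmless here because diff is constant within a country:

```r
library(ggplot2)

# Made-up long data with diff kept as its own column
long <- data.frame(
  geo     = rep(c("AA", "BB", "CC"), times = 2),
  na_item = rep(c("HH_FIN_EX", "ACT_IND_CON_EXP"), each = 3),
  values  = c(0.50, 0.60, 0.40, 0.70, 0.65, 0.50),
  diff    = rep(c(0.20, 0.05, 0.10), times = 2)
)

# x axis ordered by ascending diff; diff itself is never drawn
p <- ggplot(long, aes(x = reorder(geo, diff), y = values, fill = na_item)) +
  geom_col(position = "dodge", colour = "black") +
  labs(x = "Countries", y = "As percentage of GDP")

# The level order the x axis will use
axis_order <- levels(reorder(long$geo, long$diff))
```

For descending order, reorder(geo, -diff) should do the same job.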