Descending order in ggplot bar_col

Here is the dataset:

# A tibble: 449 x 7
   `Country or Area` `Region 1`      Year   Rate MinCI MaxCI Average
   <chr>             <chr>           <chr> <dbl> <dbl> <dbl>   <dbl>
 1 Afghanistan       Southern Asia   2011    4.2   2.6   6.2    4.4 
 2 Afghanistan       Southern Asia   2016    5.5   3.4   8.1    5.75
 3 Aland Islands     Northern Europe NA     NA    NA    NA     NA   
 4 Albania           Southern Europe 2011   18.8  14.8  23     18.9 
 5 Albania           Southern Europe 2016   21.7  17    26.7   21.8 
 6 Algeria           Northern Africa 2011   24    19.9  28.4   24.2 
 7 Algeria           Northern Africa 2016   27.4  22.5  32.7   27.6 
 8 American Samoa    Polynesia       NA     NA    NA    NA     NA   
 9 Andorra           Southern Europe 2011   24.6  19.8  29.8   24.8 
10 Andorra           Southern Europe 2016   25.6  20.1  31.3   25.7

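I need to draw a bar chart with geom_col from the dataset above to compare the average obesity rate of each region. I also need the bars ordered from highest to lowest.

I have also calculated the average obesity rate, shown in the Average column above.

Below is the code I used to generate the ggplot, but I can't figure out how to order the bars from highest to lowest.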

region_plot <- ggplot(continent) + aes(x = continent$`Region 1`, y = continent$Average, fill = Average) +
  geom_col() +
  xlab("Region") + ylab("Average Obesity") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  ggtitle("Average obesity rate of each region")
region_plot

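Looking at the data, each region appears in several rows, so to show one average per region you have to compute it first and then plot it. You can do that with dplyr, using group_by() and summarise(). Your sample is limited, but in the real data the NA values should not be present. The code here uses the part of the data you shared, so watch the column names when you run it on the real data (here the region column is called Region, not `Region 1`). The reorder() function can arrange the bars. Here is the code: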

library(dplyr)
library(ggplot2)
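#Code
# Average per region, drop regions that have no data,
# then plot with the x-axis reordered by descending average
df %>% group_by(Region) %>%
  summarise(Avg = mean(Average, na.rm = TRUE)) %>%
  filter(!is.na(Avg)) %>%
  ggplot(aes(x = reorder(Region, -Avg), y = Avg, fill = Region)) +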
  geom_col() +
  xlab("Region") + ylab("Average Obesity") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  ggtitle("Average obesity rate of each region")
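To see what reorder() does to the axis order, here is a minimal base R sketch. The small `avg` table is a hypothetical stand-in for the summarised result, not part of the original code.

```r
# Hypothetical summary table standing in for the per-region averages
avg <- data.frame(
  Region = c("Southern Asia", "Southern Europe", "Northern Africa"),
  Avg    = c(5.075, 22.8, 25.9)
)

# Default factor levels are alphabetical; ggplot2 draws a discrete axis in level order
levels(factor(avg$Region))

# reorder() sorts the levels by the second argument in ascending order,
# so negating Avg gives a descending order
ordered_region <- reorder(avg$Region, -avg$Avg)
levels(ordered_region)
# "Northern Africa" "Southern Europe" "Southern Asia"
```

Because only the factor levels change, the data rows themselves can stay in any order.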

Output: a bar chart of the per-region averages, with the bars running from highest to lowest.

Some data used:

#Data
df <- structure(list(Region = c("Southern Asia", "Southern Asia", "Northern Europe", 
"Southern Europe", "Southern Europe", "Northern Africa", "Northern Africa", 
"Polynesia", "Southern Europe", "Southern Europe"), Year = c(2011L, 
2016L, NA, 2011L, 2016L, 2011L, 2016L, NA, 2011L, 2016L), Rate = c(4.2, 
5.5, NA, 18.8, 21.7, 24, 27.4, NA, 24.6, 25.6), MinCI = c(2.6, 
3.4, NA, 14.8, 17, 19.9, 22.5, NA, 19.8, 20.1), MaxCI = c(6.2, 
8.1, NA, 23, 26.7, 28.4, 32.7, NA, 29.8, 31.3), Average = c(4.4, 
5.75, NA, 18.9, 21.8, 24.2, 27.6, NA, 24.8, 25.7)), row.names = c(NA, 
-10L), class = "data.frame")
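As a quick sanity check on this sample, the per-region averages and their descending order can be reproduced in base R. This is only a sketch: the vectors `region` and `average` are copied from the Region and Average columns above so that the snippet runs on its own.

```r
# Region and Average columns copied from the sample data
region  <- c("Southern Asia", "Southern Asia", "Northern Europe",
             "Southern Europe", "Southern Europe", "Northern Africa",
             "Northern Africa", "Polynesia", "Southern Europe", "Southern Europe")
average <- c(4.4, 5.75, NA, 18.9, 21.8, 24.2, 27.6, NA, 24.8, 25.7)

# Mean per region; regions that contain only NA come back as NaN
region_means <- tapply(average, region, mean, na.rm = TRUE)

# Drop the empty regions and sort from highest to lowest
region_means <- sort(region_means[!is.na(region_means)], decreasing = TRUE)
region_means
```

The result lists Northern Africa first, then Southern Europe, then Southern Asia, which is the bar order the plot should show.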

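The problem can be solved by pre-processing the data and sorting the result by Average, then coercing `Region 1` to a factor whose levels follow that sorted order. ggplot2 draws a discrete x-axis in factor-level order, so the bars come out from highest to lowest.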

library(ggplot2)
library(dplyr)

continent %>%
  group_by(`Region 1`) %>%
  summarise(Average = mean(Average, na.rm = TRUE)) %>%
  arrange(desc(Average)) %>% 
  mutate(`Region 1` = factor(`Region 1`, levels = unique(`Region 1`))) %>%
  ggplot(aes(x = `Region 1`, y = Average, fill = Average)) +
  geom_col() +
  xlab("Region") + ylab("Average Obesity") +
  ggtitle("Average obesity rate of each region") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) -> region_plot

region_plot
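The step that matters in the pipeline above is `levels = unique(...)`. A minimal base R sketch of why, using a hypothetical vector `sorted_regions` that is assumed to be already sorted by descending average:

```r
# Hypothetical region names, already sorted by descending average
sorted_regions <- c("Northern Africa", "Southern Europe", "Southern Asia")

# Default: levels are sorted alphabetically, so the sorted order is lost
default_levels <- levels(factor(sorted_regions))

# Explicit levels in order of appearance: the sorted order is kept
kept_levels <- levels(factor(sorted_regions, levels = unique(sorted_regions)))

default_levels
kept_levels
```

Without the explicit levels, arrange() alone would not change the plot, because ggplot2 ignores row order for a discrete axis.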

Data

continent <- read.table(text = "
 'Country or Area' 'Region 1'      Year   Rate MinCI MaxCI Average
 1 Afghanistan       'Southern Asia'   2011    4.2   2.6   6.2    4.4 
 2 Afghanistan       'Southern Asia'   2016    5.5   3.4   8.1    5.75
 3 'Aland Islands'     'Northern Europe' NA     NA    NA    NA     NA   
 4 Albania           'Southern Europe' 2011   18.8  14.8  23     18.9 
 5 Albania           'Southern Europe' 2016   21.7  17    26.7   21.8 
 6 Algeria           'Northern Africa' 2011   24    19.9  28.4   24.2 
 7 Algeria           'Northern Africa' 2016   27.4  22.5  32.7   27.6 
 8 'American Samoa'    Polynesia       NA     NA    NA    NA     NA   
 9 Andorra           'Southern Europe' 2011   24.6  19.8  29.8   24.8 
10 Andorra           'Southern Europe' 2016   25.6  20.1  31.3   25.7
", header = TRUE, check.names = FALSE)