Ordering boxplots based on two factor variables (on the x-axis) with ggplot2
I have a large data frame with 3 variables: symbol, vaf, and Gene.function (link to df: https://www.dropbox.com/s/y6ykbzuy8x19psp/df_SO.txt?dl=0).
dim(df)
[1] 2021 3
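I am trying to create a figure with multiple boxplots, with symbol on the x-axis and the boxes arranged according to the variable Gene.function.
I don't care about the order on the x-axis as such, but I do want all genes (symbols) with the same category to sit next to each other, like in this example: [image]
The closest I have come is with the forcats library, but for some reason not all genes with the same Gene.function end up grouped together. This is the code I used: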
B <- df %>%
  mutate(symbol = fct_reorder(symbol, Gene.function)) %>%
  ggplot(aes(x = factor(symbol), y = vaf, fill = factor(Gene.function), color = factor(Gene.function))) +
  geom_boxplot() +
  scale_y_continuous(labels = function(x) paste0(x * 100, '%')) +
  xlab('') +
  ylab('') +
  ggtitle('VAF distribution') +
  guides(fill = 'none') +
  theme_classic() +
  theme(legend.position = "right",
        axis.text.x = element_text(angle = 90, size = 10, hjust = 0.5, vjust = 0.5, color = "black"),
        axis.text.y = element_text(size = 8, color = "black"),
        axis.title = element_text(size = 12),
        plot.title = element_text(size = 14, face = 'italic'))
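I think the problem is that none of the variables I am using is numeric; instead, both are factors (symbol and Gene.function). In fact, when running the code above, I get the following warnings: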
There were 24 warnings (use warnings() to see them)
Warning messages:
1: Problem with `mutate()` input `symbol`.
i argument is not numeric or logical: returning NA
i Input `symbol` is `fct_reorder(symbol, Gene.function)`.
2: Problem with `mutate()` input `symbol`.
i argument is not numeric or logical: returning NA
i Input `symbol` is `fct_reorder(symbol, Gene.function)`.
3: Problem with `mutate()` input `symbol`.
i argument is not numeric or logical: returning NA
i Input `symbol` is `fct_reorder(symbol, Gene.function)`.
4: Problem with `mutate()` input `symbol`.
i argument is not numeric or logical: returning NA
i Input `symbol` is `fct_reorder(symbol, Gene.function)`.
(...)
Can anyone give me a hint? Many thanks!
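A possible explanation for the warnings, offered as a guess rather than a diagnosis: fct_reorder() summarises its second argument within each level of the first, by default with median(). If Gene.function is stored as character (an assumption; the data were read from a text file), base R's median() returns the middle string for a group with an odd number of rows but falls back to mean() for a group with an even number, which warns and returns NA. The toy vectors below are hypothetical and only illustrate that base R behaviour:

```r
# Hypothetical category labels, not taken from the asker's df
odd_group  <- c("oncogene", "oncogene", "oncogene")
even_group <- c("oncogene", "oncogene")

# Odd count: median() picks the middle element, so a string comes back
med_odd <- median(odd_group)

# Even count: median() averages the two middle elements with mean(),
# which cannot handle character input and returns NA with a warning
med_even <- suppressWarnings(median(even_group))

print(med_odd)
print(med_even)
```

If that is what happens here, symbols with an even number of rows would get NA as their sort key, which would explain why only some genes end up grouped by category.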
You need to sort df by Gene.function and symbol first, then use the resulting order of symbol to build the correct factor level order:
library(ggplot2)
library(dplyr)

# Sort rows by category, then by gene, and keep each symbol once in that order
level_info <- df %>%
  arrange(Gene.function, symbol) %>%
  pull(symbol) %>%
  unique()

df %>%
  mutate(Gene.function = as.factor(Gene.function),
         # Use the sorted symbols as factor levels so the x-axis follows that order
         symbol = factor(symbol, levels = level_info)) %>%
  ggplot(aes(x = symbol, y = vaf, fill = Gene.function, color = Gene.function)) +
  geom_boxplot() +
  scale_y_continuous(labels = function(x) paste0(x * 100, '%')) +
  xlab('') +
  ylab('') +
  ggtitle('VAF distribution') +
  guides(fill = 'none') +
  theme_classic() +
  theme(legend.position = "right",
        axis.text.x = element_text(angle = 90, size = 10, hjust = 0.5, vjust = 0.5, color = "black"),
        axis.text.y = element_text(size = 8, color = "black"),
        axis.title = element_text(size = 12),
        plot.title = element_text(size = 14, face = 'italic'))
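The level-ordering idea can be checked on a toy data frame without loading ggplot2 or dplyr. This is a minimal base R sketch; the data frame toy, its gene names, categories, and VAF values are made up for illustration and are not taken from df:

```r
# Hypothetical toy data, standing in for the real df
toy <- data.frame(
  symbol = c("TP53", "KRAS", "ATM", "BRCA1", "TP53", "KRAS"),
  Gene.function = c("tumor suppressor", "oncogene", "DNA repair",
                    "DNA repair", "tumor suppressor", "oncogene"),
  vaf = c(0.31, 0.12, 0.45, 0.27, 0.22, 0.08),
  stringsAsFactors = FALSE
)

# Row order sorted by category first, then by symbol within a category
ord <- order(toy$Gene.function, toy$symbol)

# Each symbol once, in the order it appears after sorting
level_info <- unique(toy$symbol[ord])

# Factor levels now follow the category grouping
toy$symbol <- factor(toy$symbol, levels = level_info)

# Category of each level, in level order, to confirm the grouping
category_by_level <- toy$Gene.function[match(levels(toy$symbol), toy$symbol)]

print(levels(toy$symbol))
print(category_by_level)
```

Printing category_by_level is a quick way to confirm that equal categories are adjacent before plotting. A symbol that appears under more than one Gene.function would be placed by the first category it sorts under.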