Boxplot with ggplot2 in R
I have a data frame named cells, with cell types in rows and samples in columns. Here is a dput sample:
structure(c(8.10937548981953e-20, 0.095381661829093, 0.054868371418562,
0.0523687378840825, 0.0100173293159538, 0.0332395245437795, 3.37811149975583e-20,
0.048191378909587, 0.13314908462763, 0, 0.00612878313809124,
0, 0.00209460699328093, 0.205599458004829, 0.318048653115709,
4.21796249339787e-05, 0.00844407692255898, 0, 0.00613007026042523,
0.0300024082993193, 0.0405191646567986, 0.00654087887823056,
0.0111094954094255, 1.30617589099212e-19, 0.0398730537850546,
0.0390946117756341, 0.239413780024853, 2.07521807718399e-19,
0.00116980239850497, 0, 0, 0.00971921247886335, 0.0588607291731613,
3.8563512241696e-21, 0.00247621905821516, 0), .Dim = c(6L, 6L
), .Dimnames = list(c("Adipocytes", "B-cells", "Basophils", "CD4+ memory T-cells",
"CD4+ naive T-cells", "CD4+ T-cells"), c("Pt1", "Pt10", "Pt103",
"Pt106", "Pt11", "Pt17")))
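cells has 38 rows and 49 columns.
The target cell type I want to build a boxplot for is CD4+ memory T-cells.
In addition, there is another data frame named Metadata, which has a response column, Metadata$Benefit, recording whether each sample responded to a treatment. So for each sample in cells, I know whether it responded or not.
Goal: make a boxplot with Response/No response on the x axis and the values on the y axis, with the individual data points drawn on top of the boxes.
My code: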
cells %>%
ggplot( aes(x = as.factor(Metadata$Benefit), y = as.numeric(cells['CD4+ memory T-cells',]), fill= c('red','lightblue'))) +
geom_boxplot() +
scale_fill_viridis(discrete = TRUE, alpha=0.6) +
geom_jitter(color="black", size=0.4, alpha=0.9) +
theme_ipsum() +
theme(
legend.position="none",
plot.title = element_text(size=11)
) +
ggtitle("A boxplot with jitter") +
xlab("")
Error: Aesthetics must be either length 1 or the same as the data (38): x, y and fill
I don't understand what the problem is, because length(as.factor(Metadata$Benefit)) and length(as.numeric(cells['CD4+ memory T-cells',])) are both 49, so they are the same. What is going wrong?
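As noted in the comments, there are a few problems with your attempt here. You cannot take the x and y aesthetics from different data sources, so you should merge them first. ggplot2 checks the length of each aesthetic against the number of rows of the data you pass in (cells, 38 rows), which is why the error reports 38 even though both of your vectors have length 49. The fill vector has length 2, so it fails the same check, and the colours belong in a scale rather than in aes(). Merging also means transposing the original data first (I use t() while it is still a matrix), so that each sample is a row. To demonstrate, I made up some response metadata.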
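The number in the error message can be checked directly in base R. This is only a sketch: cells_demo and benefit_demo are hypothetical stand-ins with the dimensions given in the question (38 cell types, 49 samples), not the real data.

```r
# Hypothetical stand-in for the real data: 38 cell types x 49 samples
cells_demo <- matrix(
  runif(38 * 49), nrow = 38, ncol = 49,
  dimnames = list(paste0("celltype", 1:38), paste0("Pt", 1:49))
)
# Hypothetical response labels, one per sample
benefit_demo <- rep(c("Response", "No response"), length.out = 49)

nrow(cells_demo)                    # 38: the row count ggplot() treats as "the data"
length(cells_demo["celltype4", ])   # 49: one value per sample
length(benefit_demo)                # 49: one label per sample

# After transposing, each row is a sample, so the lengths line up
nrow(t(cells_demo)) == length(benefit_demo)
```

The two vectors agree with each other but not with the 38 rows of the data, which is the comparison that fails. The reprex that follows does the transposition and the join on the six-sample dput.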
library(tidyverse)
d <- structure(c(8.10937548981953e-20, 0.095381661829093, 0.054868371418562, 0.0523687378840825, 0.0100173293159538, 0.0332395245437795, 3.37811149975583e-20, 0.048191378909587, 0.13314908462763, 0, 0.00612878313809124, 0, 0.00209460699328093, 0.205599458004829, 0.318048653115709, 4.21796249339787e-05, 0.00844407692255898, 0, 0.00613007026042523, 0.0300024082993193, 0.0405191646567986, 0.00654087887823056, 0.0111094954094255, 1.30617589099212e-19, 0.0398730537850546, 0.0390946117756341, 0.239413780024853, 2.07521807718399e-19, 0.00116980239850497, 0, 0, 0.00971921247886335, 0.0588607291731613, 3.8563512241696e-21, 0.00247621905821516, 0), .Dim = c(6L, 6L), .Dimnames = list(c("Adipocytes", "B-cells", "Basophils", "CD4+ memory T-cells", "CD4+ naive T-cells", "CD4+ T-cells"), c("Pt1", "Pt10", "Pt103", "Pt106", "Pt11", "Pt17")))
metadata <- data.frame(
  sample = c("Pt1", "Pt10", "Pt103", "Pt106", "Pt11", "Pt17"),
  response = c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE)  # made-up responses
)
d %>%
  t() %>%                            # transpose: one row per sample
  as.data.frame() %>%
  rownames_to_column("sample") %>%   # sample IDs become the join key
  right_join(metadata, .) %>%        # attach the response to each sample
  ggplot(aes(x = response, y = `CD4+ memory T-cells`)) +
  geom_boxplot(aes(fill = response)) +
  scale_fill_manual(values = c('red','lightblue'))
#> Joining, by = "sample"
Created on 2022-02-16 by the reprex package (v2.0.1)
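The question also asked for the individual data points on top of the boxes, which the reprex above does not draw. One possible way to add them is a geom_jitter() layer on the merged data. The sketch below is self-contained and rests on assumptions: plot_data is a hypothetical, already-merged data frame, its values are rounded from the CD4+ memory T-cells row of the dput sample, and the response labels are made up as in the reprex.

```r
library(ggplot2)

# Hypothetical merged data: one row per sample.
# Values are rounded from the dput sample; the responses are made up.
plot_data <- data.frame(
  sample   = c("Pt1", "Pt10", "Pt103", "Pt106", "Pt11", "Pt17"),
  value    = c(0.0524, 0, 0.0000422, 0.00654, 0, 0),
  response = c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE)
)

p <- ggplot(plot_data, aes(x = response, y = value)) +
  # Hide the boxplot's own outlier points so they are not drawn twice
  geom_boxplot(aes(fill = response), outlier.shape = NA) +
  # Spread points horizontally only; height = 0 keeps the y values exact
  geom_jitter(width = 0.15, height = 0, size = 1.5, alpha = 0.9) +
  scale_fill_manual(values = c("FALSE" = "red", "TRUE" = "lightblue")) +
  theme(legend.position = "none") +
  ylab("CD4+ memory T-cells")
p
```

Naming the colours by level ("FALSE", "TRUE") is a design choice that keeps the mapping explicit instead of relying on level order. The same two layers can be appended to the piped version above in place of the single geom_boxplot() call.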