Boxplot with ggplot2 in R

I have a data frame named cells with cell types in rows and samples in columns. Here is a dput sample:

structure(c(8.10937548981953e-20, 0.095381661829093, 0.054868371418562, 
0.0523687378840825, 0.0100173293159538, 0.0332395245437795, 3.37811149975583e-20, 
0.048191378909587, 0.13314908462763, 0, 0.00612878313809124, 
0, 0.00209460699328093, 0.205599458004829, 0.318048653115709, 
4.21796249339787e-05, 0.00844407692255898, 0, 0.00613007026042523, 
0.0300024082993193, 0.0405191646567986, 0.00654087887823056, 
0.0111094954094255, 1.30617589099212e-19, 0.0398730537850546, 
0.0390946117756341, 0.239413780024853, 2.07521807718399e-19, 
0.00116980239850497, 0, 0, 0.00971921247886335, 0.0588607291731613, 
3.8563512241696e-21, 0.00247621905821516, 0), .Dim = c(6L, 6L
), .Dimnames = list(c("Adipocytes", "B-cells", "Basophils", "CD4+ memory T-cells", 
"CD4+ naive T-cells", "CD4+ T-cells"), c("Pt1", "Pt10", "Pt103", 
"Pt106", "Pt11", "Pt17")))

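cells has 38 rows and 49 columns. The cell type I want to build a boxplot for is: CD4+ memory T-cells

There is also another data frame named Metadata, which has a response column, Metadata$Benefit, indicating whether each sample responded to a certain treatment. So for each sample in cells, I can tell whether it responded.

Goal: make a boxplot with Response/No response on the x axis and the values on the y axis, with the individual data points drawn on top of the boxes.

My code: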

cells %>%
  ggplot( aes(x = as.factor(Metadata$Benefit), y = as.numeric(cells['CD4+ memory T-cells',]), fill= c('red','lightblue'))) +
  geom_boxplot() +
  scale_fill_viridis(discrete = TRUE, alpha=0.6) +
  geom_jitter(color="black", size=0.4, alpha=0.9) +
  theme_ipsum() +
  theme(
    legend.position="none",
    plot.title = element_text(size=11)
  ) +
  ggtitle("A boxplot with jitter") +
  xlab("")

Error: Aesthetics must be either length 1 or the same as the data (38): x, y and fill

I don't understand what the problem is, because length(as.factor(Metadata$Benefit)) and length(as.numeric(cells['CD4+ memory T-cells',])) are both 49. They are the same, so what is the problem?

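As noted in the comments, there are a few issues with your attempt here. The 38 in the error message is the number of rows in cells, which ggplot treats as the number of observations, whereas your x and y vectors have length 49 and fill has length 2. You can't take the x and y aesthetics from different data sources, so you should merge them first. That also means transposing the original data before merging (I used t() while it is still a matrix), so that each sample is a row. To demonstrate this, I made up some response metadata.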
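The length mismatch behind the error can be checked directly. This is a minimal base R sketch, assuming a stand-in data frame with the same orientation as cells (cell types in rows, samples in columns). The name toy and its values are illustrative only and are not taken from the real data.

```r
# Stand-in for `cells`: 3 cell types (rows) x 4 samples (columns).
# Hypothetical values, used only to show the shapes involved.
toy <- as.data.frame(matrix(
  c(0.10, 0.20, 0.30,
    0.05, 0.00, 0.15,
    0.02, 0.01, 0.40,
    0.07, 0.03, 0.25),
  nrow = 3,
  dimnames = list(c("B-cells", "CD4+ memory T-cells", "Basophils"),
                  c("Pt1", "Pt10", "Pt103", "Pt106"))
))

# ggplot treats each ROW of `data` as one observation.
n_obs <- nrow(toy)   # 3 here, 38 in the question

# The y vector built in the question has one value per COLUMN (sample).
y <- as.numeric(toy["CD4+ memory T-cells", ])
n_y <- length(y)     # 4 here, 49 in the question

# Transposing makes each sample a row, so the two counts agree.
toy_t <- as.data.frame(t(toy))
n_obs_t <- nrow(toy_t)
```

With the data in this transposed shape, the cell type becomes an ordinary column that can be joined to the per-sample metadata, which is what the full example does next.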

library(tidyverse)

d <- structure(c(8.10937548981953e-20, 0.095381661829093, 0.054868371418562, 0.0523687378840825, 0.0100173293159538, 0.0332395245437795, 3.37811149975583e-20, 0.048191378909587, 0.13314908462763, 0, 0.00612878313809124, 0, 0.00209460699328093, 0.205599458004829, 0.318048653115709, 4.21796249339787e-05, 0.00844407692255898, 0, 0.00613007026042523, 0.0300024082993193, 0.0405191646567986, 0.00654087887823056, 0.0111094954094255, 1.30617589099212e-19, 0.0398730537850546, 0.0390946117756341, 0.239413780024853, 2.07521807718399e-19, 0.00116980239850497, 0, 0, 0.00971921247886335, 0.0588607291731613, 3.8563512241696e-21, 0.00247621905821516, 0), .Dim = c(6L, 6L), .Dimnames = list(c("Adipocytes", "B-cells", "Basophils", "CD4+ memory T-cells", "CD4+ naive T-cells", "CD4+ T-cells"), c("Pt1", "Pt10", "Pt103", "Pt106", "Pt11", "Pt17")))

metadata <- data.frame(sample = c("Pt1", "Pt10", "Pt103", "Pt106", "Pt11", "Pt17"), response = c(T, T, T, F, F, F))

d %>% 
  t() %>% 
  as.data.frame() %>% 
  rownames_to_column("sample") %>% 
  right_join(metadata, .) %>% 
  ggplot(aes(x = response, y = `CD4+ memory T-cells`)) +
  geom_boxplot(aes(fill = response)) +
  scale_fill_manual(values = c('red','lightblue'))
#> Joining, by = "sample"

Created on 2022-02-16 by the reprex package (v2.0.1)
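The question also asked for the individual data points on top of the boxes, which the plot above does not draw. One possible way to add them is sketched below. It assumes the same made-up response labels as above and uses only the CD4+ memory T-cells values from the dput excerpt. The name plot_data, the axis labels, and the jitter settings are illustrative choices, not part of the original code.

```r
library(ggplot2)

# CD4+ memory T-cells values for the six samples in the dput excerpt,
# paired with the made-up response labels (not real metadata).
plot_data <- data.frame(
  sample   = c("Pt1", "Pt10", "Pt103", "Pt106", "Pt11", "Pt17"),
  value    = c(0.0523687378840825, 0, 4.21796249339787e-05,
               0.00654087887823056, 2.07521807718399e-19,
               3.8563512241696e-21),
  response = c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE)
)

p <- ggplot(plot_data, aes(x = response, y = value)) +
  # Hide the boxplot's own outlier markers so no point is drawn twice
  geom_boxplot(aes(fill = response), outlier.shape = NA) +
  # One point per sample; jitter horizontally only so y values stay exact
  geom_jitter(width = 0.15, height = 0, size = 1.5, alpha = 0.9) +
  scale_fill_manual(values = c("FALSE" = "red", "TRUE" = "lightblue")) +
  scale_x_discrete(labels = c("FALSE" = "No response", "TRUE" = "Response")) +
  labs(x = NULL, y = "CD4+ memory T-cells") +
  theme(legend.position = "none")

print(p)
```

Setting outlier.shape = NA matters once geom_jitter() is added, because otherwise outliers appear both as boxplot markers and as jittered points. With the real data, plot_data would be replaced by the joined data frame and the made-up response column by Metadata$Benefit.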