Highlight points on a grouped box plot

This is a different question, but it follows on from an earlier one:

Updated

My dataset looks like this:

Term  Name   True   Result  Gender
T1    Name1  True   4       F
T2    Name2  False  6       F
T3    Name3  True   5.5     M
T3    Name4  False  4.6     M

Test dataset:

dataset_test <- structure(list(Term = c("T1", "T1", "T1", "T1", "T1", "T1", "T2", 
"T2", "T2", "T2", "T2", "T2", "T2", "T3", "T3", "T3", "T3", "T3", 
"T3", "T3"), Name = c("Name1", "Name2", "Name3", "Name4", "Name5", 
"Name6", "Name5", "Name6", "Name7", "Name8", "Name9", "Name10", 
"Name11", "Name12", "Name13", "Name14", "Name15", "Name16", "Name17", 
"Name18"), TRUE. = c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, TRUE, 
TRUE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, 
FALSE, TRUE, TRUE), Result = c(4, 5, 6, 4, 5, 6, 5.5, 4.6, 5.5, 
4.6, 5, 5.2, 6, 5.5, 4, 5.5, 4.8, 5, 5, 4.4), Gender = c("F", 
"F", "F", "M", "M", "M", "F", "F", "F", "F", "M", "M", "M", "F", 
"F", "F", "F", "M", "M", "M")), class = "data.frame", row.names = c(NA, 
-20L))

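Below I have a box plot grouped by gender. I want to highlight points inside the correct gender's box, i.e. each point needs to line up with the gender of its True record.

Credit for the solution goes to chemdork123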

library(dplyr)    # %>%, group_by(), filter()
library(ggplot2)

dataset_test %>% 
  group_by(Term) %>% 
  filter(any(TRUE.)) %>%                    # keep terms with at least one True record
  ggplot(aes(x = Term, y = Result, fill = Gender)) + 
  scale_fill_brewer(palette = "Blues") +
  geom_boxplot(position=position_dodge(0.8))+
  geom_point(                               # add the highlight points
    data=subset(dataset_test, TRUE. == TRUE), 
    aes(x=Term, y=Result), position=position_dodge(0.8),
    color="blue", size=4, show.legend = FALSE) +
  ggtitle("Distribution of results by term") +
  xlab("Term") + ylab("Result")

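The position dodge now works perfectly when both genders have True records in a term. It breaks when only one gender does, and that is the main use case for this visualization.

The code above produces this:

Thanks again for any help.

You're probably close: you need to use position_dodge in the geom_point() call. To make sure the points line up with the box plots, you should also explicitly set the same position_dodge width for the box plot geom. I've also included show.legend=FALSE for geom_point() here, since you probably don't want the blue dot showing up in the legend as it does in your example: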

dataset %>% 
  group_by(Term) %>% 
  filter(any(TRUE.)) %>%
  ggplot(aes(x = Term, y = Result, fill = Gender)) + 
  scale_fill_brewer(palette = "Blues") +
  geom_boxplot(position=position_dodge(0.8))+
  geom_point(                               # add the highlight points
    data=subset(dataset, TRUE. == TRUE), 
    aes(x=Term, y=Result), position=position_dodge(0.8),
    color="blue", size=4, show.legend = FALSE) +
  ggtitle("Distribution of results by term") +
  xlab("Term") + ylab("Result")
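
For the case raised in the update, where only one gender in a term has True records, position_dodge() sees a single group at that x position in the points layer and centers it instead of placing it over its box. One possible workaround, sketched here under assumptions rather than taken from the thread: keep every row in the points layer and set Result to NA for the rows that should not be highlighted, so each Term and Gender combination still takes part in the dodge. The toy data frame and the name highlight are made up for this illustration, and the filter(any(TRUE.)) step is left out for brevity.

```r
library(ggplot2)

# Hypothetical toy data: in T1 only the "F" rows are TRUE, which is the failing case.
toy <- data.frame(
  Term   = rep(c("T1", "T2"), each = 6),
  Gender = rep(c("F", "M"), each = 3, times = 2),
  TRUE.  = c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE,
             TRUE, FALSE, TRUE, TRUE, TRUE, FALSE),
  Result = c(4, 5, 6, 4, 5, 6, 5.5, 4.6, 5.5, 5, 5.2, 6)
)

# Keep all rows so both genders exist in every term of the points layer,
# but blank out Result where the record is not TRUE.
highlight <- transform(toy, Result = ifelse(TRUE., Result, NA_real_))

p <- ggplot(toy, aes(x = Term, y = Result, fill = Gender)) +
  scale_fill_brewer(palette = "Blues") +
  geom_boxplot(position = position_dodge(0.8)) +
  geom_point(
    data = highlight,                 # same groups as the box plot layer
    position = position_dodge(0.8),   # same dodge width as the boxes
    color = "blue", size = 4,
    na.rm = TRUE,                     # silently skip the blanked rows when drawing
    show.legend = FALSE
  )

print(p)
```

Because the blanked rows are only dropped when the layer is drawn, they still count as groups when the dodge positions are computed, which is what keeps the T1 points over the F box.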