R: Subset factor levels that co-occur with two levels from another factor

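I have a data frame made up of multiple columns. I want to subset it to only the rows where a level of one factor co-occurs with more than one level of another factor. With the simplified example data below, I would be left with only the first two rows, i.e. GeneA, GeneA and TissueA, TissueB.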

A <- c("GeneA","GeneA","GeneB","GeneB","GeneC","GeneC")
B <- c("TissueA","TissueB","TissueA","TissueA","TissueA","TissueA")
df <- data.frame(Gene = A, Tissue = B)

Thanks in advance.

Here is one idea. Use Gene to define the groups, then within each group check whether Tissue has more than one unique value.

library(dplyr)
# Keep only the genes that appear with at least two distinct tissues
group_by(df, Gene) %>% 
  filter(n_distinct(Tissue) >= 2)

   Gene  Tissue 
  <fct> <fct>  
1 GeneA TissueA
2 GeneA TissueB
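
If dplyr is not available, the same grouping idea can probably be sketched in base R. This is only an illustrative alternative, not part of the answer above; the names `counts`, `keep` and `result` are made up for the example, and the data frame is rebuilt so the snippet runs on its own.

```r
# Rebuild the example data from the question.
df <- data.frame(
  Gene   = c("GeneA", "GeneA", "GeneB", "GeneB", "GeneC", "GeneC"),
  Tissue = c("TissueA", "TissueB", "TissueA", "TissueA", "TissueA", "TissueA")
)

# Count the distinct Tissue values observed for each Gene.
counts <- tapply(df$Tissue, df$Gene, function(x) length(unique(x)))

# Names of the genes seen in at least two distinct tissues (hypothetical helper variable).
keep <- names(counts)[counts >= 2]

# Keep only the rows belonging to those genes.
result <- df[df$Gene %in% keep, ]
```

With the example data, `result` should contain just the two GeneA rows. Changing the `>= 2` threshold adjusts how many distinct tissues a gene must appear in.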