Deleting a set of rows using group_by in R

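In the 10-row data frame below, I have sightings of either whales or vessels, and the sightings are grouped by ScanID.

Using the dplyr library, I am trying to find a way to delete the scans that contain no whales at all, which in this case would be scans 2 and 5.

I think group_by would be useful, but I'm not sure how to proceed from there.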

whales <- data.frame(
  rubbing.beach = c('whale', 'vessel', 'vessel', 'vessel', 'whale', 'whale', 'whale', 'vessel', 'vessel', 'whale'),
  ScanID = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 6)
)
X Target ScanID
1 whale 1
2 vessel 1
3 vessel 2
4 vessel 2
5 whale 3
6 whale 3
7 whale 4
8 vessel 4
9 vessel 5
10 whale 6

This should leave me with the following output:

X Target ScanID
1 whale 1
2 vessel 1
3 whale 3
4 whale 3
5 whale 4
6 vessel 4
7 whale 6

group_by is indeed what you need so that each ScanID is considered separately, and filter specifies which rows to keep:

whales <- read.table(text =
'X  Target  ScanID
1   whale   1
2   vessel  1
3   vessel  2
4   vessel  2
5   whale   3
6   whale   3
7   whale   4
8   vessel  4
9   vessel  5
10  whale   6', header = TRUE)

library(dplyr)
whales %>%
  group_by(ScanID) %>%
  filter("whale" %in% Target)
# # A tibble: 7 × 3
# # Groups:   ScanID [4]
#       X Target ScanID
#   <int> <chr>   <int>
# 1     1 whale       1
# 2     2 vessel      1
# 3     5 whale       3
# 4     6 whale       3
# 5     7 whale       4
# 6     8 vessel      4
# 7    10 whale       6
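
As a rough sketch of why this works: within each group, the test on Target collapses to a single TRUE or FALSE, which filter then recycles across every row of that group, so a scan is kept or dropped as a whole. The variant below writes the same test as any(Target == "whale") and adds ungroup() so the result no longer carries the grouping. The name whale_scans is only illustrative, and the data frame is rebuilt inline (with the sighting column named Target, as in the printed table) so the snippet runs on its own.

```r
library(dplyr)

# Same data as in the question, with the sighting column named Target
whales <- data.frame(
  X = 1:10,
  Target = c("whale", "vessel", "vessel", "vessel", "whale",
             "whale", "whale", "vessel", "vessel", "whale"),
  ScanID = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 6)
)

whale_scans <- whales %>%
  group_by(ScanID) %>%
  # any() collapses the per-row comparison to one TRUE/FALSE per scan
  filter(any(Target == "whale")) %>%
  # drop the grouping so later steps operate on the whole table again
  ungroup()

whale_scans
```

If later steps should still work per scan, the ungroup() call can simply be left out.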

I think you can do this without group_by: extract every ScanID that has a row with rubbing.beach == "whale" and use that set in subset.
subset(whales, ScanID %in% unique(ScanID[rubbing.beach == "whale"]))

#  rubbing.beach ScanID
#1         whale      1
#2        vessel      1
#3         whale      3
#4         whale      3
#5         whale      4
#6        vessel      4
#7         whale      6

In dplyr, we can use filter:

library(dplyr)

whales %>% filter(ScanID %in% unique(ScanID[rubbing.beach == "whale"]))
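
Both versions rest on the same two steps, which can be sketched separately in base R: first collect the ScanID values that have at least one whale row, then keep every row whose ScanID is in that set. The names ids_with_whale and kept are made up for illustration.

```r
whales <- data.frame(
  rubbing.beach = c("whale", "vessel", "vessel", "vessel", "whale",
                    "whale", "whale", "vessel", "vessel", "whale"),
  ScanID = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 6)
)

# Step 1: ScanIDs that contain at least one whale sighting
ids_with_whale <- unique(whales$ScanID[whales$rubbing.beach == "whale"])
ids_with_whale
# [1] 1 3 4 6

# Step 2: keep all rows (whale or vessel) that belong to those scans
kept <- whales[whales$ScanID %in% ids_with_whale, ]
kept
```

Since %in% gives the same result whether or not the lookup vector contains duplicates, the unique() call is optional here; it only keeps the set of IDs short and easy to inspect.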