Deleting a set of rows using group_by in R
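In the 10-row data frame below, I have sightings of either whales or vessels, and these sightings are grouped by ScanID.
Using the dplyr library, I am trying to find a way to delete the scans that do not contain any whales, which in this case would be scans 2 and 5.
I think group_by would be useful, but I am not sure how to proceed from there.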
whales <- data.frame(rubbing.beach = c('whale', 'vessel', 'vessel', 'vessel', 'whale', 'whale', 'whale', 'vessel', 'vessel', 'whale'),
ScanID = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 6))
X | Target | ScanID |
---|---|---|
1 | whale | 1 |
2 | vessel | 1 |
3 | vessel | 2 |
4 | vessel | 2 |
5 | whale | 3 |
6 | whale | 3 |
7 | whale | 4 |
8 | vessel | 4 |
9 | vessel | 5 |
10 | whale | 6 |
Leaving me with the following output:
X | Target | ScanID |
---|---|---|
1 | whale | 1 |
2 | vessel | 1 |
3 | whale | 3 |
4 | whale | 3 |
5 | whale | 4 |
6 | vessel | 4 |
7 | whale | 6 |
group_by is indeed what you need in order to consider each ScanID separately, and filter specifies which rows to keep:
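# Rebuild the example data with the column names shown in the question's table
whales <- read.table(text =
'X Target ScanID
1 whale 1
2 vessel 1
3 vessel 2
4 vessel 2
5 whale 3
6 whale 3
7 whale 4
8 vessel 4
9 vessel 5
10 whale 6', header = TRUE)
library(dplyr)
# Within each ScanID group, keep every row if any Target in that group is "whale"
whales %>%
  group_by(ScanID) %>%
  filter("whale" %in% Target)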
# # A tibble: 7 × 3
# # Groups: ScanID [4]
# X Target ScanID
# <int> <chr> <int>
# 1 1 whale 1
# 2 2 vessel 1
# 3 5 whale 3
# 4 6 whale 3
# 5 7 whale 4
# 6 8 vessel 4
# 7 10 whale 6
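The grouped filter above works because the condition is evaluated once per ScanID and yields a single TRUE or FALSE that applies to every row in that group. As a minimal sketch of the same idea, the condition can also be written with any(), and ungroup() can be added if the grouping is not wanted afterwards. The name kept is only an illustration and does not come from the answer.

```r
library(dplyr)

# Same data as in the answer above, built with data.frame() for self-containment
whales <- data.frame(
  X = 1:10,
  Target = c("whale", "vessel", "vessel", "vessel", "whale",
             "whale", "whale", "vessel", "vessel", "whale"),
  ScanID = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 6)
)

kept <- whales %>%
  group_by(ScanID) %>%
  # any() collapses the per-group comparison to one TRUE/FALSE,
  # which filter() then applies to every row of that group
  filter(any(Target == "whale")) %>%
  # drop the grouping so later steps operate on the whole table
  ungroup()

kept$X  # rows 1, 2, 5, 6, 7, 8 and 10 remain; scans 2 and 5 are gone
```

Whether to call ungroup() is a design choice: leaving the result grouped is convenient when more per-scan operations follow, while ungrouping avoids surprises in later steps that should treat the table as a whole.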
I think you can do this without group_by by extracting every ScanID where rubbing.beach == "whale" and using those values in subset.
subset(whales, ScanID %in% unique(ScanID[rubbing.beach == "whale"]))
# rubbing.beach ScanID
#1 whale 1
#2 vessel 1
#3 whale 3
#4 whale 3
#5 whale 4
#6 vessel 4
#7 whale 6
In dplyr, we can use filter in the same way:
library(dplyr)
whales %>% filter(ScanID %in% unique(ScanID[rubbing.beach == "whale"]))
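To see why the ungrouped version gives the same result, it may help to split the one-liner into its steps. The following is a base R sketch using the data frame from the question; the intermediate names whale_scans, keep and result are only illustrative.

```r
# Data frame as defined in the question
whales <- data.frame(
  rubbing.beach = c("whale", "vessel", "vessel", "vessel", "whale",
                    "whale", "whale", "vessel", "vessel", "whale"),
  ScanID = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 6)
)

# Step 1: the ScanIDs that contain at least one whale sighting
whale_scans <- unique(whales$ScanID[whales$rubbing.beach == "whale"])

# Step 2: flag every row whose ScanID belongs to that set
keep <- whales$ScanID %in% whale_scans

# Step 3: keep only the flagged rows
result <- whales[keep, ]

whale_scans  # 1 3 4 6
```

Note that unique() is optional here, since %in% only tests membership and gives the same answer when the right-hand side contains repeats.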