Summarise first row meeting a condition
Suppose I have this data frame:
df <- data.frame(
party = c("A", "A", "B", "A", "B"),
votes = c(100, 99, 98, 97, 96),
elected = c(1, 1, 1, 0, 0)
)
party votes elected
1 A 100 1
2 A 99 1
3 B 98 1
4 A 97 0
5 B 96 0
I want to compute a new variable holding the votes of the challenger candidate, that is, the votes of the first non-elected candidate from a different party. The result would be:
party votes elected votes_challenge
1 A 100 1 96
2 A 99 1 96
3 B 98 1 97
4 A 97 0 NA
5 B 96 0 NA
I have tried first() and lag() with which() conditions, but without success so far. Any help is much appreciated.
Here is one option using the fuzzyjoin package:
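library(fuzzyjoin)
library(tidyverse)
# Right-hand table: first row of each party after sorting
# non-elected first, then by votes in descending order
challengers <- df %>%
  arrange(party, elected, desc(votes)) %>%
  group_by(party) %>%
  slice(1)
# Keep matches where the party differs and elected.x > elected.y
fuzzy_left_join(df, challengers,
                by = c("party", "elected"), match_fun = list(`!=`, `>`)) %>%
  select(ends_with("x"), votes.y)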
party.x votes.x elected.x votes.y
1 A 100 1 96
2 A 99 1 96
3 B 98 1 97
4 A 97 0 NA
5 B 96 0 NA
Maybe this will work for you.
You can try a helper function:
library(dplyr)
# Votes of the first non-elected candidate whose party differs from `group`
get_opposite_votes <- function(df, group) {
  df %>% filter(party != group & elected == 0) %>% slice(1L) %>% pull(votes)
}
df %>%
  group_by(party) %>%
  # `.` is the whole data frame here, not only the current group
  mutate(new = get_opposite_votes(., first(party))) %>%
  ungroup() %>%
  # If needed, set NA where elected == 0
  mutate(new = replace(new, elected == 0, NA))
# party votes elected new
# <fct> <dbl> <dbl> <dbl>
#1 A 100 1 96
#2 A 99 1 96
#3 B 98 1 97
#4 A 97 0 NA
#5 B 96 0 NA
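For comparison, the same lookup can be sketched in base R without any packages. This is only a minimal sketch, not part of either answer above: the helper name challenger_votes is hypothetical, and it assumes the rows are already sorted by votes in descending order, so that "first" means the highest-voted non-elected candidate of another party.

```r
df <- data.frame(
  party   = c("A", "A", "B", "A", "B"),
  votes   = c(100, 99, 98, 97, 96),
  elected = c(1, 1, 1, 0, 0)
)

# Hypothetical helper: for row i, return the votes of the first non-elected
# row belonging to a different party. Returns NA when row i is itself
# non-elected or when no such challenger exists.
challenger_votes <- function(i, data) {
  if (data$elected[i] == 0) return(NA_real_)
  candidates <- data$votes[data$party != data$party[i] & data$elected == 0]
  if (length(candidates) == 0) NA_real_ else candidates[1]
}

# Apply the helper to every row index; vapply guarantees a numeric vector
df$votes_challenge <- vapply(seq_len(nrow(df)), challenger_votes,
                             numeric(1), data = df)
df
```

Unlike the grouped helper above, this version looks at each row separately, so it should also cope with more than two parties, at the cost of rescanning the data frame once per row.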