Creating a Rolling Wall Count Variable in R
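I have a dataset with roughly 21k observations, each of which has a categorical variable with options A, B, and C. I would like to create an experience variable for countries that adopted option C in previous observations (the t-1 case, to keep things simple). I was told this is called a rolling wall count. I have not been able to figure out how to do this or which package is best to use. Any advice would be very helpful!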
dispute=c("1","1","1","2","2","2","2","3","3","3")
partner=c("1","2","3","1","2","3","4","2","1","3")
position=c("A","C","C","B","C","A","C","B","C","C")
Currently my data looks like this:
Dispute Partner Position
1 1 A
1 2 C
1 3 C
2 1 B
2 2 C
2 3 A
2 4 C
3 1 B
3 2 C
3 3 C
Ideally, I would create a variable that counts cumulatively each time a partner takes the value C (generating an "experience" count for each unique "partner"):
Dispute Partner Position Experience
1 1 A NA
1 2 C 1
1 3 C 1
2 1 B NA
2 2 C 2
2 3 A NA
2 4 C 1
3 1 B NA
3 2 C 3
With data.table:
library(data.table)
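# Running count of "C" within each partner; rows where position is not "C" are multiplied by FALSE and become 0
setDT(df)[, experience := cumsum(position == "C") * (position == "C"), by = partner]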
dispute partner position experience
1: 1 1 A 0
2: 1 2 C 1
3: 1 3 C 1
4: 2 1 B 0
5: 2 2 C 2
6: 2 3 A 0
7: 2 4 C 1
8: 3 2 B 0
9: 3 1 C 1
10: 3 3 C 2
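The output above has 0 where the question's desired table shows NA. If NA is preferred for rows that are not "C", one possible approach is a minimal base R sketch like the following, which needs no extra package. It assumes the same `df` as in the Data section and uses `ave()` to apply `cumsum()` within each partner; the final NA assignment is an addition for illustration, not part of the answer above.

```r
# Same example data as in the Data section
df <- data.frame(dispute  = c("1","1","1","2","2","2","2","3","3","3"),
                 partner  = c("1","2","3","1","2","3","4","2","1","3"),
                 position = c("A","C","C","B","C","A","C","B","C","C"))

# 1 where the position is "C", 0 otherwise
is_c <- as.integer(df$position == "C")

# Running count of "C" within each partner, in row order
df$experience <- ave(is_c, df$partner, FUN = cumsum)

# Blank out rows that are not "C" so they show NA instead of a count
df$experience[df$position != "C"] <- NA

print(df)
```

This relies on the rows already being in chronological order within each partner; if they are not, the data would need to be sorted first.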
With dplyr:
library(dplyr)
df %>%
  group_by(partner) %>%
  mutate(experience = cumsum(position == "C") * (position == "C"))
dispute partner position experience
1 1 1 A 0
2 1 2 C 1
3 1 3 C 1
4 2 1 B 0
5 2 2 C 2
6 2 3 A 0
7 2 4 C 1
8 3 2 B 0
9 3 1 C 1
10 3 3 C 2
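The question also mentions a t-1 case. Assuming that means "how many times had this partner chosen C before the current observation", a hedged sketch is to lag the running count by one row within each partner. The column name `prior_experience` is hypothetical, and this reading of t-1 is an assumption rather than something the question confirms.

```r
library(dplyr)

# Same example data as in the Data section
df <- data.frame(dispute  = c("1","1","1","2","2","2","2","3","3","3"),
                 partner  = c("1","2","3","1","2","3","4","2","1","3"),
                 position = c("A","C","C","B","C","A","C","B","C","C"))

out <- df %>%
  group_by(partner) %>%
  # Running count of "C", shifted down one row so the current row is excluded;
  # the first observation of each partner has no history and gets 0
  mutate(prior_experience = lag(cumsum(position == "C"), default = 0L)) %>%
  ungroup()

print(out)
```

As with the cumulative count, this assumes rows are ordered in time within each partner.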
Data
df <- data.frame(dispute=c("1","1","1","2","2","2","2","3","3","3"),
partner=c("1","2","3","1","2","3","4","2","1","3"),
position=c("A","C","C","B","C","A","C","B","C","C"))