Stuck on making a dummy variable in R
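I'm trying to create a dummy variable that equals 1 when conditions on two columns are both met, but it isn't working. For example, in my data I want the dummy to be 1 if firm_state is CA, MA, MD, ME, etc. and State is CA, MA, MD, ME, etc. So in my picture, I want rows where State is MD or ME and firm_state is CA to get a dummy of 1, while rows where State is AZ or TX and firm_state is CA get 0. However, when I run my code, every row in same_party ends up as 1. Can someone tell me where I went wrong? Here is my current code.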
data <- data %>%
  mutate(same_party = ifelse(firm_state == "CA" | firm_state == "CO" |
                               firm_state == "NY" & State == "MD" |
                               State == "ME" | State == "WA", 1, 0))
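The problem seems to be that your conditions aren't grouped properly: you have a chain of ORs with a single AND in the middle. In R, & binds more tightly than |, so the AND only applies to the two comparisons next to it (firm_state == "NY" & State == "MD"), and any row with firm_state equal to "CA" or "CO" is TRUE regardless of State. Wrapping each group in parentheses should solve your problem: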
data <- data %>%
  mutate(same_party = ifelse((firm_state == "CA" | firm_state == "CO" |
                                firm_state == "NY") &
                               (State == "MD" | State == "ME" | State == "WA"),
                             1, 0))
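To see why the parentheses matter, here is a minimal base R sketch. The three-row vectors are made up for illustration and are not the asker's data; they only show how the ungrouped and grouped conditions evaluate differently.

```r
# Hypothetical toy vectors, one element per "row"
firm_state <- c("CA", "CA", "TX")
State      <- c("MD", "AZ", "ME")

# Ungrouped: & is evaluated before |, so this is parsed as
#   firm_state == "CA" | (firm_state == "NY" & State == "MD") | State == "ME"
ungrouped <- firm_state == "CA" | firm_state == "NY" & State == "MD" | State == "ME"

# Grouped: both the firm_state test and the State test must hold
grouped <- (firm_state == "CA" | firm_state == "NY") &
  (State == "MD" | State == "ME")

print(ungrouped)  # TRUE TRUE TRUE
print(grouped)    # TRUE FALSE FALSE
```

The second row (firm_state "CA", State "AZ") is the case from the question: it passes the ungrouped test only because firm_state == "CA" is enough on its own.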
Easier to read is a version that uses the %in% operator:
data <- data %>%
  mutate(same_party = ifelse(firm_state %in% c("CA", "CO", "NY") &
                               State %in% c("MD", "ME", "WA"), 1, 0))
You can use %in% instead of a long chain of == comparisons:
data <- data %>%
  mutate(same_party = ifelse(firm_state %in% c("CA", "CO", "NY") &
                               State %in% c("MD", "ME", "WA"), 1, 0))
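One side effect of switching from == to %in% is how missing values are treated. The sketch below uses a made-up vector containing an NA (the question does not say whether the real data has any): == propagates NA, while %in% returns FALSE for it, so the dummy becomes 0 rather than NA.

```r
# Hypothetical State vector with one missing value
State <- c("MD", NA, "AZ")

# == propagates NA: the middle element stays NA
eq_result <- State == "MD" | State == "ME"

# %in% never returns NA: a missing value simply does not match
in_result <- State %in% c("MD", "ME")

print(eq_result)              # TRUE NA FALSE
print(in_result)              # TRUE FALSE FALSE

# A logical vector can also be turned into a 0/1 dummy without ifelse()
print(as.integer(in_result))  # 1 0 0
```

Whether a missing State should count as 0 or stay NA depends on the analysis, so it is worth checking which behavior you want.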
Dummy data:
data <- data.frame(
  State = sample(c("AZ", "LA", "MD", "ME", "MD", "WA"), 10, TRUE),
  firm_state = sample(c("CA", "CO", "NY"), 10, TRUE))
Output:
State firm_state same_party
1 WA CA 1
2 MD CO 1
3 MD CA 1
4 LA CA 0
5 MD NY 1
6 ME NY 1
7 MD NY 1
8 MD NY 1
9 LA NY 0
10 ME NY 1
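Because the dummy data above comes from sample() without a seed, the exact rows will differ on every run. For a check that is reproducible, a small hand-built data frame can be used instead. This is only a sketch: the five rows are invented, and it uses base R assignment rather than mutate() so that it runs without dplyr.

```r
# Hypothetical fixed data, including a row whose firm_state is outside the list
data <- data.frame(
  State      = c("WA", "MD", "LA", "ME", "AZ"),
  firm_state = c("CA", "CO", "CA", "NY", "TX"),
  stringsAsFactors = FALSE
)

# 1 only when firm_state AND State are both in their respective lists
data$same_party <- as.integer(
  data$firm_state %in% c("CA", "CO", "NY") &
    data$State %in% c("MD", "ME", "WA")
)

print(data$same_party)  # 1 1 0 1 0
```

Rows 3 and 5 get 0: the first because State "LA" is not in the State list, the second because neither column matches.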