Stuck on making a dummy variable in R

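I'm trying to create a dummy variable that is 1 when conditions on two columns are both met, but it isn't working. For example, in my data I want the dummy variable to be 1 if firm_state is CA, MA, MD, ME, etc. and State is CA, MA, MD, ME, etc. So in my picture, I want the rows where State is MD or ME and firm_state is CA to have the dummy variable "1", while the other rows, where State is AZ or TX and firm_state is CA, have "0". However, when I run my code, every row in same_party ends up with the dummy variable "1". Can someone tell me where I went wrong? Here is my current code.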

data <- data %>%
  mutate(same_party = ifelse(firm_state == "CA" | firm_state == "CO" |
                             firm_state == "NY" & State == "MD" |
                             State == "ME" | State == "WA", 1, 0))

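The problem seems to be that your conditions aren't grouped properly: you have a chain of ORs with an AND in the middle. In R, & binds more tightly than |, so the AND applies only to firm_state == "NY" and State == "MD", and any row that matches one of the other terms on its own (for example firm_state == "CA") gets a 1. Wrapping each group in parentheses should solve your problem: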

data <- data %>%
  mutate(same_party = ifelse((firm_state == "CA" | firm_state == "CO" | 
                               firm_state == "NY") & (State == "MD" | 
                               State == "ME" | State == "WA"), 1, 0))
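To see the precedence effect in isolation, here is a minimal sketch on a single made-up row (the values "CA" and "AZ" are chosen only for illustration, and the condition is shortened to two states per group):

```r
# Toy row: firm_state matches but State does not, so the dummy should be 0.
firm_state <- "CA"
State <- "AZ"

# Without parentheses, & is evaluated before |, so this reads as
# firm_state == "CA" | (firm_state == "NY" & State == "MD") | State == "ME"
ungrouped <- firm_state == "CA" | firm_state == "NY" & State == "MD" | State == "ME"

# With parentheses, a match is required in both groups.
grouped <- (firm_state == "CA" | firm_state == "NY") & (State == "MD" | State == "ME")

print(ungrouped)  # TRUE
print(grouped)    # FALSE
```

The ungrouped version is TRUE as soon as firm_state is "CA", which is consistent with same_party being 1 for every row in the question.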

Even easier to read is the %in% operator:

data <- data %>%
  mutate(same_party = ifelse(firm_state %in% c("CA","CO","NY") & State %in% c("MD","ME","WA"), 1, 0))
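One behavioural difference worth knowing if the columns can contain missing values: == propagates NA, while %in% treats NA as "no match". A small sketch, using a hypothetical State vector that is not taken from the question's data:

```r
# Hypothetical vector with a missing value, for illustration only.
State <- c("MD", NA, "AZ")

# == returns NA for the missing row, so ifelse() returns NA there too.
with_eq <- ifelse(State == "MD" | State == "ME", 1, 0)

# %in% returns FALSE for the missing row, so that row gets 0.
with_in <- ifelse(State %in% c("MD", "ME"), 1, 0)

print(with_eq)  # 1 NA 0
print(with_in)  # 1 0 0
```

Whether NA or 0 is the right answer for a row with an unknown state depends on the analysis, so this is something to decide rather than a reason to prefer one form outright.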

You can use %in% instead of a long chain of ==:

data <- data %>%
  mutate(same_party = ifelse(firm_state %in% c("CA","CO","NY") &
                             State %in% c("MD","ME","WA"),1,0))

Dummy data:

data = data.frame(
  State = sample(c("AZ","LA","MD","ME","MD","WA"),10, TRUE),
  firm_state = sample(c("CA","CO","NY"), 10, TRUE))

Output:

   State firm_state same_party
1     WA         CA          1
2     MD         CO          1
3     MD         CA          1
4     LA         CA          0
5     MD         NY          1
6     ME         NY          1
7     MD         NY          1
8     MD         NY          1
9     LA         NY          0
10    ME         NY          1
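Because sample() is random, the rows above will differ from run to run. The following is a repeatable base R sketch of the same idea; the seed value is an arbitrary assumption, and as.integer() is swapped in for ifelse() since a logical vector converts directly to 0/1:

```r
set.seed(42)  # arbitrary seed (an assumption) so sample() is repeatable

data <- data.frame(
  State      = sample(c("AZ", "LA", "MD", "ME", "MD", "WA"), 10, TRUE),
  firm_state = sample(c("CA", "CO", "NY"), 10, TRUE)
)

# TRUE/FALSE becomes 1/0 under as.integer(), so ifelse() is optional here.
data$same_party <- as.integer(
  data$firm_state %in% c("CA", "CO", "NY") &
    data$State %in% c("MD", "ME", "WA")
)

# Cross-tabulate State against the dummy to check which states received a 1.
print(table(data$State, data$same_party))
```

Since every firm_state in this dummy data is in the allowed set, the cross-tabulation should show 1 only for MD, ME, and WA, and 0 for AZ and LA.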