R update single column with multiple join operations

I have two data frames:

df1 <- data.frame(A = c("Andy", "Tim","Joe","Mike"), B = c("Andrew", NA,NA,"Michael"))

df2 <- data.frame(A = c("Andy", "Tim","Michael"), status = c("sent", "sent","sent"))

I want to left join df1 to df2 so that a row matches when either A or B in df1 equals A in df2. The result would be:

result <- data.frame(A = c("Andy", "Tim","Joe","Mike"), B = c("Andrew", NA,NA,"Michael"), status = c("sent", "sent", NA, "sent"))

You can join twice, once on each column, and then coalesce the two resulting status columns:

library(dplyr)
left_join(df1, df2, by = c("A" = "A")) %>%
  left_join(df2, by = c("B" = "A")) %>%
  mutate(status = coalesce(status.x, status.y)) %>%
  select(-status.x, -status.y)
#      A       B status
# 1 Andy  Andrew   sent
# 2  Tim    <NA>   sent
# 3  Joe    <NA>   <NA>
# 4 Mike Michael   sent
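
As a variation on the double join above, the lookup can also be sketched with base R's match() inside coalesce(), which avoids creating the intermediate status.x and status.y columns. This is only an illustration, not part of the original answer; it assumes the names in df2$A are unique, since match() returns only the first hit.

```r
library(dplyr)

df1 <- data.frame(A = c("Andy", "Tim", "Joe", "Mike"),
                  B = c("Andrew", NA, NA, "Michael"))
df2 <- data.frame(A = c("Andy", "Tim", "Michael"),
                  status = c("sent", "sent", "sent"))

# match() gives the row of df2 for each key (NA when there is no match);
# coalesce() then keeps the first non-missing status, trying A before B.
out <- df1 %>%
  mutate(status = coalesce(df2$status[match(A, df2$A)],
                           df2$status[match(B, df2$A)]))
```

Adding another key column would just mean adding another match() lookup as a further argument to coalesce().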

If you have many columns in df1 to join on, you can reshape the data to long format first.

library(dplyr)
library(tidyr)

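df1 %>%
  # Tag each row so it can be reassembled after reshaping
  mutate(row = row_number()) %>%
  # One row per (row, column name, value); missing names are dropped
  pivot_longer(cols = -row,
               values_drop_na = TRUE) %>%
  # Look up every name, whichever column it came from
  left_join(df2, by = c('value' = 'A')) %>%
  group_by(row) %>%
  # Spread a matched status to all entries of the same original row
  fill(status, .direction = 'updown') %>%
  # Back to one row per original row, with columns A and B restored
  pivot_wider() %>%
  ungroup() %>%
  select(-row)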

#  status  A     B      
#  <chr>  <chr> <chr>  
#1 sent   Andy  Andrew 
#2 sent   Tim   NA     
#3 NA     Joe   NA     
#4 sent   Mike  Michael
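
Note that the long-format pipeline returns status as the first column, whereas the expected result has it last. If the column order matters, one option is to finish with relocate(). The sketch below is an addition to the original answer and assumes dplyr 1.0.0 or later, where relocate() is available.

```r
library(dplyr)
library(tidyr)

df1 <- data.frame(A = c("Andy", "Tim", "Joe", "Mike"),
                  B = c("Andrew", NA, NA, "Michael"))
df2 <- data.frame(A = c("Andy", "Tim", "Michael"),
                  status = c("sent", "sent", "sent"))

out <- df1 %>%
  mutate(row = row_number()) %>%
  pivot_longer(cols = -row, values_drop_na = TRUE) %>%
  left_join(df2, by = c("value" = "A")) %>%
  group_by(row) %>%
  fill(status, .direction = "updown") %>%
  pivot_wider() %>%
  ungroup() %>%
  select(-row) %>%
  # Move status behind the original columns to match the expected layout
  relocate(status, .after = last_col())
```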