tidyr: join an ID table with the main table across multiple columns

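This seems like a very basic operation, but my searches haven't turned up a simple solution. As an example of what I'm trying to do, consider the following two data frames from a database. First, an ID table that assigns an index to each color name: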

library(dplyr)

ColorID <- tibble(ID = 1:4, Name = c("Red", "Green", "Blue", "Black"))

ColorID
# A tibble: 4 x 2
     ID Name 
  <int> <chr>
1     1 Red  
2     2 Green
3     3 Blue 
4     4 Black

Next, some other table refers to these color indices (instead of storing the text strings):

Widgets <- tibble(Front = c(1,3,4,2,1,1), Back = c(4,4,3,3,1,2), 
                  Top = c(4,3,2,1,2,3), Bottom = c(1,2,3,4,3,2))
Widgets
# A tibble: 6 x 4
  Front  Back   Top Bottom
  <dbl> <dbl> <dbl>  <dbl>
1     1     4     4      1
2     3     4     3      2
3     4     3     2      3
4     2     3     1      4
5     1     1     2      3
6     1     2     3      2

Now I just want to join the two tables so that the index values are replaced with the actual color names. What I want is:

Joined <- tibble(Front = c("Red", "Blue", "Black", "Green", "Red","Red"),
                 Back = c("Black", "Black", "Blue","Blue", "Red", "Green"),
                 Top = c("Black","Blue", "Green", "Red", "Green", "Blue"),
                 Bottom = c("Red", "Green", "Blue", "Black", "Blue","Green"))
Joined
# A tibble: 6 x 4
  Front Back  Top   Bottom
  <chr> <chr> <chr> <chr> 
1 Red   Black Black Red   
2 Blue  Black Blue  Green 
3 Black Blue  Green Blue  
4 Green Blue  Red   Black 
5 Red   Red   Green Blue  
6 Red   Green Blue  Green 

I've tried many times without success. What I thought would work is:

J <- Widgets %>% inner_join(ColorID, by = c(. = "ID"))

I can do it column by column, one variable at a time, e.g.

J <- Widgets %>% inner_join(ColorID, by = c("Front" = "ID"))

but that doesn't replace "Front"; it creates a new "Name" column instead. It seems there must be a simple solution to this. Thanks.

Does this work:

library(dplyr)
library(tidyr)

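Widgets %>%
  pivot_longer(everything()) %>%                          # one row per cell: name, value
  inner_join(ColorID, by = c("value" = "ID")) %>%         # look up the color name for each index
  select(-value) %>%
  pivot_wider(names_from = name, values_from = Name) %>%  # no row key, so this gives list-columns (with a warning)
  unnest(everything())                                    # expand the list-columns back into 6 rows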
# A tibble: 6 x 4
  Front Back  Top   Bottom
  <chr> <chr> <chr> <chr> 
1 Red   Black Black Red   
2 Blue  Black Blue  Green 
3 Black Blue  Green Blue  
4 Green Blue  Red   Black 
5 Red   Red   Green Blue  
6 Red   Green Blue  Green 
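
The pipeline above has no row identifier, which is why `pivot_wider()` produces list-columns that then need `unnest()`. A possible variant, sketched here under the assumption that a temporary key column is acceptable, carries the row number through the reshape. The column name `row` and the result name `Joined_long` are made up for this illustration and are not part of the original answer.

```r
library(dplyr)
library(tidyr)

ColorID <- tibble(ID = 1:4, Name = c("Red", "Green", "Blue", "Black"))
Widgets <- tibble(Front = c(1,3,4,2,1,1), Back = c(4,4,3,3,1,2),
                  Top = c(4,3,2,1,2,3), Bottom = c(1,2,3,4,3,2))

Joined_long <- Widgets %>%
  mutate(row = row_number()) %>%                  # hypothetical helper key, one per widget
  pivot_longer(-row) %>%                          # long format: row, name, value
  inner_join(ColorID, by = c("value" = "ID")) %>% # attach the color name for each index
  select(-value) %>%
  pivot_wider(names_from = name, values_from = Name) %>% # row key makes each cell unique
  select(-row)                                    # drop the helper key again

Joined_long
```

Because each (row, name) pair is unique, `pivot_wider()` returns ordinary character columns here and no `unnest()` step is needed.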

No join function is needed:

library(dplyr)

ColorID <- tibble(ID = c(1:4), Name = c("Red", "Green", "Blue", "Black"))
# reorder so that row number and ID are different
ColorID <- ColorID[c(2, 1, 4, 3), ] 

Widgets <- tibble(Front = c(1,3,4,2,1,1), Back = c(4,4,3,3,1,2), 
                  Top = c(4,3,2,1,2,3), Bottom = c(1,2,3,4,3,2))

check_id <- function(col){
  ColorID$Name[match(col, ColorID$ID)]
}

Widgets %>% 
  mutate(across(everything(), check_id))

# A tibble: 6 x 4
  Front Back  Top   Bottom
  <chr> <chr> <chr> <chr> 
1 Red   Black Black Red   
2 Blue  Black Blue  Green 
3 Black Blue  Green Blue  
4 Green Blue  Red   Black 
5 Red   Red   Green Blue  
6 Red   Green Blue  Green 

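(Edited) What I'm doing with dplyr and mutate is matching the numbers in Widgets against the numbers in the ColorID$ID column. That gives me the row positions in the ColorID data frame from which to extract the names.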
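
To make the matching step concrete, here is a small sketch for a single column, using plain vectors in the same reordered layout as above. The names `ID`, `Name`, `Front`, and `pos` are standalone variables introduced only for this illustration.

```r
# Reordered lookup, mirroring ColorID[c(2, 1, 4, 3), ]
ID   <- c(2L, 1L, 4L, 3L)
Name <- c("Green", "Red", "Black", "Blue")

# The Front column of Widgets
Front <- c(1, 3, 4, 2, 1, 1)

# match() returns, for each Front value, its position within ID
pos <- match(Front, ID)
pos
# 2 4 3 1 2 2

# Indexing Name by those positions yields the color names
Name[pos]
# "Red" "Blue" "Black" "Green" "Red" "Red"
```

Since `match()` returns positions rather than the IDs themselves, the lookup still works when the row order of the ID table differs from the ID values. An index with no entry in `ID` would come back as `NA`.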