How to recode a dummy column with the column name?

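I have the input dataset and I am looking for a way to generate the output dataset by recoding 1 to the column name and 0 to NA. I managed to do it manually; see the non-optimal solution below. However, my real dataset has hundreds of columns, so I am looking for a way to automate the process.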

Packages

library(tibble)
library(dplyr)

Input

input <- tibble( a = c(1, 0, 0, 1, 0),
                 b = c(0, 0, 0, 1, 1),
                 c = c(1, 1, 1, 1, 1),
                 d = c(0, 0, 0, 0, 0))


# # A tibble: 5 × 4
#       a     b     c     d
#   <dbl> <dbl> <dbl> <dbl>
# 1     1     0     1     0
# 2     0     0     1     0
# 3     0     0     1     0
# 4     1     1     1     0
# 5     0     1     1     0

Output

output <- tibble( a = c("a", NA, NA, "a", NA),
                  b = c(NA, NA, NA, "b", "b"),
                  c = c("c", "c", "c", "c", "c"),
                  d = c(NA, NA, NA, NA, NA))

# # A tibble: 5 × 4
#       a     b     c     d    
#   <chr> <chr> <chr> <lgl>
# 1 a     NA    c     NA   
# 2 NA    NA    c     NA   
# 3 NA    NA    c     NA   
# 4 a     b     c     NA   
# 5 NA    b     c     NA   

Non-optimal solution

input %>% 
  mutate(a = case_when(a == 1 ~ "a",
                       TRUE ~ NA_character_),
         b = case_when(b == 1 ~ "b",
                       TRUE ~ NA_character_),
         c = case_when(c == 1 ~ "c",
                       TRUE ~ NA_character_),
         d = case_when(d == 1 ~ "d",
                       TRUE ~ NA_character_))

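We can use across() together with ifelse(); inside across(), cur_column() returns the name of the column currently being processed: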

library(dplyr)

input %>% 
  mutate(across(everything(), ~ifelse(. == 1, cur_column(), NA)))

#   a     b     c     d    
#   <chr> <chr> <chr> <lgl>
# 1 a     NA    c     NA   
# 2 NA    NA    c     NA   
# 3 NA    NA    c     NA   
# 4 a     b     c     NA   
# 5 NA    b     c     NA
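
Note that with ifelse() the all-zero column d comes back as logical (<lgl>), because no value in it is ever replaced by a string. If you would rather have every column come back as character, one possible variation is to swap in dplyr's if_else() with NA_character_. This is only a sketch built on the question's sample data; the name result is made up for illustration.

```r
library(tibble)
library(dplyr)

# Sample data copied from the question
input <- tibble(a = c(1, 0, 0, 1, 0),
                b = c(0, 0, 0, 1, 1),
                c = c(1, 1, 1, 1, 1),
                d = c(0, 0, 0, 0, 0))

# if_else() is strict about types: both branches are character here,
# so even a column with no 1s at all (d) is returned as character.
# `result` is a hypothetical name used only in this example.
result <- input %>%
  mutate(across(everything(),
                ~ if_else(.x == 1, cur_column(), NA_character_)))

result
```

If only some of the columns are dummies, everything() can be replaced by any tidyselect expression, for example across(c(a, b), ...), so the remaining columns are left untouched.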