How to paste all values (strings) in one data column with all other headers, conditioned on the combination of value and header being TRUE

I have a data frame that looks like this:

library(tibble)

df_of_measures <-
  tribble(~measure, ~meter, ~cubic_ft, ~milliliter, ~mile, ~kilogram, ~pound,
        "volume", FALSE, TRUE, TRUE, FALSE, FALSE, FALSE,
        "distance", TRUE, FALSE, FALSE, TRUE, FALSE, FALSE,
        "mass", FALSE, FALSE, FALSE, FALSE, TRUE, TRUE)

##   measure  meter cubic_ft milliliter mile  kilogram pound
##   <chr>    <lgl> <lgl>    <lgl>      <lgl> <lgl>    <lgl>
## 1 volume   FALSE TRUE     TRUE       FALSE FALSE    FALSE
## 2 distance TRUE  FALSE    FALSE      TRUE  FALSE    FALSE
## 3 mass     FALSE FALSE    FALSE      FALSE TRUE     TRUE 

I want to take the measure column and cross its values with the other headers, so that I get a vector of only the combinations that are TRUE:

[1] "volume_cubic_ft"   "volume_milliliter" "distance_meter"    "distance_mile"     "mass_kilogram"     "mass_pound"

If I weren't trying to condition on whether each combination is TRUE or FALSE, I would do:

as.vector(outer(df_of_measures$measure, names(df_of_measures)[-1], paste, sep="_"))

##  [1] "volume_meter"        "distance_meter"      "mass_meter"          "volume_cubic_ft"     "distance_cubic_ft"   "mass_cubic_ft"      
##  [7] "volume_milliliter"   "distance_milliliter" "mass_milliliter"     "volume_mile"         "distance_mile"       "mass_mile"          
## [13] "volume_kilogram"     "distance_kilogram"   "mass_kilogram"       "volume_pound"        "distance_pound"      "mass_pound" 

How can I get a vector with only the TRUE combinations?

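Here is an option with base R: loop over the rows with apply (MARGIN = 1), pick the column names whose value is TRUE, and paste them onto the first element of the row (the value from the measure column).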

c( apply(df_of_measures, 1, function(x) 
        paste(x[1], names(x)[-1][as.logical(x[-1])], sep="_")))

Output:

#[1] "volume_cubic_ft"   "volume_milliliter" "distance_meter"  
#[4] "distance_mile"     "mass_kilogram"     "mass_pound" 
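
The `outer()` attempt from the question can also be kept and simply masked with the logical columns. The following is a minimal sketch rather than a definitive solution: it assumes every column other than `measure` is logical, and it rebuilds the data with a plain `data.frame` (named `df` here purely for illustration) so that it runs without extra packages. Both matrices are transposed before subsetting so the result comes out row by row, in the order requested.

```r
# Same data as df_of_measures, as a plain data.frame (illustrative name: df)
df <- data.frame(
  measure    = c("volume", "distance", "mass"),
  meter      = c(FALSE, TRUE, FALSE),
  cubic_ft   = c(TRUE, FALSE, FALSE),
  milliliter = c(TRUE, FALSE, FALSE),
  mile       = c(FALSE, TRUE, FALSE),
  kilogram   = c(FALSE, FALSE, TRUE),
  pound      = c(FALSE, FALSE, TRUE)
)

# 3 x 6 logical matrix of the flag columns (everything except measure)
flags <- as.matrix(df[-1])

# 3 x 6 character matrix holding every measure_header combination
combos <- outer(df$measure, names(df)[-1], paste, sep = "_")

# Matrix subsetting is column-major, so transpose both to walk row by row
result <- t(combos)[t(flags)]
result
#[1] "volume_cubic_ft"   "volume_milliliter" "distance_meter"
#[4] "distance_mile"     "mass_kilogram"     "mass_pound"
```

Using `combos[flags]` without the transposes returns the same six strings in column order instead (starting with "distance_meter").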

Or with tidyverse: reshape to 'long' format with pivot_longer, filter on the TRUE values in 'value', and unite the 'measure' and 'name' columns

library(dplyr)
library(tidyr)
df_of_measures %>% 
    pivot_longer(cols = -measure) %>%
    filter(value) %>%
    unite(measure, measure, name, sep="_") %>%
    pull(measure)
#[1] "volume_cubic_ft"   "volume_milliliter" "distance_meter"  
#[4] "distance_mile"     "mass_kilogram"     "mass_pound"    

Using reshape2::melt to convert wide to long:
r <- reshape2::melt(df_of_measures, "measure", names(df_of_measures)[-1])
Reduce(paste0, c(r[r$value, 1:2], "_")[c(1, 3, 2)])
# [1] "distance_meter"    "volume_cubic_ft"  
# [3] "volume_milliliter" "distance_mile"    
# [5] "mass_kilogram"     "mass_pound"  

Or base reshape.

r <- reshape(as.data.frame(df_of_measures), idvar="measure", 
             times=names(df_of_measures)[-1], varying=2:7, v.names="x", direction="long")
Reduce(paste0, c(r[r$x, 1:2], "_")[c(1, 3, 2)])
# [1] "distance_meter"    "volume_cubic_ft"  
# [3] "volume_milliliter" "distance_mile"    
# [5] "mass_kilogram"     "mass_pound"
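
The `Reduce(paste0, c(...)[c(1, 3, 2)])` idiom used in the two reshape answers above is terse: it appends "_" to the two-column list, reorders the elements to (measure, "_", header), and then pastes them together from left to right. A minimal sketch comparing it with a direct `paste()` call, assuming the same data rebuilt as a plain `data.frame` (the names `df`, `kept`, `via_paste` and `via_reduce` are illustrative only):

```r
# Same data as df_of_measures, as a plain data.frame (illustrative name: df)
df <- data.frame(
  measure    = c("volume", "distance", "mass"),
  meter      = c(FALSE, TRUE, FALSE),
  cubic_ft   = c(TRUE, FALSE, FALSE),
  milliliter = c(TRUE, FALSE, FALSE),
  mile       = c(FALSE, TRUE, FALSE),
  kilogram   = c(FALSE, FALSE, TRUE),
  pound      = c(FALSE, FALSE, TRUE)
)

# Wide to long with base reshape(): columns are measure, time (header), x (flag)
r <- reshape(df, idvar = "measure", times = names(df)[-1],
             varying = 2:7, v.names = "x", direction = "long")

# Keep only the rows whose flag is TRUE
kept <- r[r$x, ]

# Direct version: paste the measure and the header name with "_"
via_paste <- paste(kept$measure, kept$time, sep = "_")

# The Reduce() idiom from the answer, applied to the same rows
via_reduce <- Reduce(paste0, c(kept[1:2], "_")[c(1, 3, 2)])

# Check that both approaches produce the same vector
identical(via_paste, via_reduce)
```

The direct `paste()` form may be easier to read, while the `Reduce()` form avoids naming the columns explicitly.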