Replacing values of a dataframe column using values of a list and list name in R

Question

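I want to replace the values in a column with the names of a list's elements, depending on which element's values contain the column value: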

df <- data.frame(Activity = c("Checking emails", "Playing games", "Reading", "Watching TV", 
                         "Watching YouTube", "Watching TV", "Relaxing", "Getting ready", 
                         "Working/ studying", "Relaxing"))

mylist <-list(Tech_activity = c("Browsing social media", "Checking emails", 
"Video calling", "On my computer/ PC", "Watching YouTube", "Browsing the internet", 
"On my phone", "Watching TV"), Socialising = c("Spending time with friends", 
"Chatting/ talking/ having a conversation", "Spending time with family"
), Work = "Working/ studying", Transport = c("Travelling", "Walking", 
"Driving"), Household = c("Housework", "Cooking"), Leisure = c("Exercising/ Working out", 
"Getting ready", "Exercising/ working out", "Hobbies eg knitting", 
"Playing games", "Shopping", "Eating", "Listening to music", 
"Reading", "Smoking", "Playing with pets", "Personal caring", 
"Personal care", "Nothing", "Relaxing", "Waiting"))

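So, if a data frame value appears among the values of one of the list's elements, replace the df value with that element's name; if it does not, skip that element and check the next element in the list, and so on. (Forgive the double for loop.)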

for (i in df$Activity){
  for (j in mylist){
    if (i %in% mylist[j]){
      i <- names(mylist[j])
    }
  }
}

Thanks in advance for your help.

Answer 1

You can turn mylist into a data frame with stack() and then merge it with df.

merge(df, stack(mylist), by.x = 'Activity', by.y = 'values')
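As far as I know, merge() with its defaults sorts the result by the key column and drops rows that have no match. If the original row order matters, one option is to use the stacked list as a lookup table with match(). The sketch below is only an illustration: it uses a shortened copy of the question's data, the value "Sleeping" and the column name Category are made up for the example, and it assumes each activity appears in at most one list element.

```r
# Shortened copy of the question's data; "Sleeping" is a made-up value
# added only to show what happens to an activity with no match.
df <- data.frame(
  Activity = c("Checking emails", "Playing games", "Reading",
               "Watching TV", "Working/ studying", "Sleeping"),
  stringsAsFactors = FALSE
)
mylist <- list(
  Tech_activity = c("Checking emails", "Watching YouTube", "Watching TV"),
  Work          = "Working/ studying",
  Leisure       = c("Playing games", "Reading", "Relaxing")
)

# Long-format lookup table: `values` holds the activity, `ind` the list name
lookup <- stack(mylist)

# match() gives, for each row of df, the position of its activity in the
# lookup table (NA when absent), so the row order of df is preserved
idx <- match(df$Activity, lookup$values)

# Category is a hypothetical column name; assign to df$Activity instead
# to overwrite the original column
df$Category <- as.character(lookup$ind[idx])

df
```

Unmatched activities end up as NA in the new column rather than being dropped.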

The tidyverse way would be:

library(tidyverse)

enframe(mylist) %>%
  unnest(value) %>%
  right_join(df, by = c('value' = 'Activity'))

#   name          value            
#   <chr>         <chr>            
# 1 Tech_activity Checking emails  
# 2 Tech_activity Watching YouTube 
# 3 Tech_activity Watching TV      
# 4 Tech_activity Watching TV      
# 5 Work          Working/ studying
# 6 Leisure       Getting ready    
# 7 Leisure       Playing games    
# 8 Leisure       Reading          
# 9 Leisure       Relaxing         
#10 Leisure       Relaxing

Answer 2

In base R:

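# For each list element, find the rows of df whose Activity belongs to it
matches <- unlist(lapply(mylist, function(x) which(df$Activity %in% x)))
# unlist() adds a numeric suffix to repeated names (e.g. "Tech_activity1"); strip it
df$Activity[matches] <- gsub("\\d+$", "", names(matches))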

df
#>         Activity
#> 1  Tech_activity
#> 2        Leisure
#> 3        Leisure
#> 4  Tech_activity
#> 5  Tech_activity
#> 6  Tech_activity
#> 7        Leisure
#> 8        Leisure
#> 9           Work
#> 10       Leisure
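For comparison, the loop from the question can also be made to work. In the original attempt, j holds each element's contents rather than its name or index, and assigning to the loop variable i never changes df. A minimal sketch of one way to fix it, using a shortened copy of the question's data, loops over the list names and overwrites the matching rows:

```r
# Shortened copy of the question's data, for illustration only
df <- data.frame(
  Activity = c("Checking emails", "Playing games",
               "Watching TV", "Working/ studying"),
  stringsAsFactors = FALSE
)
mylist <- list(
  Tech_activity = c("Checking emails", "Watching YouTube", "Watching TV"),
  Work          = "Working/ studying",
  Leisure       = c("Playing games", "Reading", "Relaxing")
)

# Iterate over the element names so both the name and its values are available
for (nm in names(mylist)) {
  # Logical vector marking the rows whose activity belongs to this element
  hit <- df$Activity %in% mylist[[nm]]
  # Write the element name back into the data frame itself
  df$Activity[hit] <- nm
}

df
```

A single loop is enough because %in% is vectorised over the whole column. This assumes no list name is itself listed as an activity in a later element; otherwise an already replaced value could be replaced again.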

Answer 3

We can use the tidyverse:

library(tibble)
library(tidyr)   # unnest() comes from tidyr, not purrr
library(dplyr)
enframe(mylist, value = 'Activity') %>%
      unnest(c(Activity)) %>%
      inner_join(df, by = 'Activity')

-output

# A tibble: 10 x 2
#   name          Activity         
#   <chr>         <chr>            
# 1 Tech_activity Checking emails  
# 2 Tech_activity Watching YouTube 
# 3 Tech_activity Watching TV      
# 4 Tech_activity Watching TV      
# 5 Work          Working/ studying
# 6 Leisure       Getting ready    
# 7 Leisure       Playing games    
# 8 Leisure       Reading          
# 9 Leisure       Relaxing         
#10 Leisure       Relaxing
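Note that the result above follows the order of the list, not the order of df, and inner_join() drops activities that are not in any list element. If the original order and unmatched rows should be kept, a possible variant is to start from df and left_join() the lookup table. This is only a sketch: it uses a shortened copy of the data, and "Sleeping" and the column name Category are made up for illustration.

```r
library(tibble)
library(tidyr)
library(dplyr)

# Shortened copy of the question's data; "Sleeping" is a made-up value
# with no match in the list
df <- data.frame(
  Activity = c("Checking emails", "Playing games", "Sleeping", "Watching TV"),
  stringsAsFactors = FALSE
)
mylist <- list(
  Tech_activity = c("Checking emails", "Watching TV"),
  Leisure       = c("Playing games", "Relaxing")
)

# One row per (Category, Activity) pair; Category is a hypothetical name
lookup <- enframe(mylist, name = "Category", value = "Activity") %>%
  unnest(cols = Activity)

# left_join() keeps every row of df in its original order;
# activities without a match get NA in Category
result <- left_join(df, lookup, by = "Activity")

result
```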
