Highlight words in R list of strings based on term document matrix
Below is the dataframe of campaign data:
Subject                     Response Rate(%)  Campaign Type  Channel
Buy Stunning Phone A        81.00             A              e-mail
Special Emi OFFER           81.00             B              e-mail
Buy Stunning Phone at EMI   73.00             C              SMS
The game changer is here.   85.00             A              SMS
Buy Stunnig Phone A         80.00             A              SMS
Special Emi OFFER           88.00             B              e-mail
Buy Stunning Phone at EMI   48.00             C              e-mail
The game changer is here.   48.00             A              e-mail
Buy Stunning Phone          89.00             A              e-mail
Special Emi OFFER           89.00             B              SMS
Buy Stunning Phone at EMI   69.00             C              SMS
I created a term document matrix as follows:
Word Frequency
big 10
upgrade 10
worth 10
latest 9
much 9
phone 8
exciting 8
back 7
colours 7
case 6
stylish 6
clear 6
experience 5
time 5
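I have subset the data by channel type using dplyr, ordered by decreasing response rate.
For each subject, I want to highlight/list the words from the term document matrix. If a word appears in the subject, it should be shown in a separate list next to the subject. I have not been able to find a way to do this.
Do you mean something like this?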
library(dplyr)
df <- read.table(header = TRUE, sep = "," ,text = "Subject,Response Rate(%),Campaign Type,Channel
Buy Stunning Phone A,81.00,A,e-mail
Special Emi OFFER,81.00,B,e-mail
Buy Stunning Phone at EMI,73.00,C,SMS
The game changer is here.,85.00,A,SMS
Buy Stunnig Phone A,80.00,A,SMS
Special Emi OFFER,88.00,B,e-mail
Buy Stunning Phone at EMI,48.00,C,e-mail
The game changer is here.,48.00,A,e-mail
Buy Stunning Phone,89.00,A,e-mail
Special Emi OFFER,89.00,B,SMS
Buy Stunning Phone at EMI,69.00,C,SMS")
df2 <- read.table(header = TRUE, sep = "," ,text = "Word,Frequency
big,10
upgrade,10
worth,10
latest,9
much,9
phone,8
exciting,8
back,7
colours,7
case,6
stylish,6
clear,6
experience,5
time,5")
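# Test every subject against every word (case-insensitive substring match).
# The result is a logical matrix: one row per subject, one column per word.
words <- trimws(as.character(df2$Word))
m <- sapply(words, grepl, x = as.character(df$Subject), ignore.case = TRUE)
# Collapse the words found in each subject into one comma-separated string.
df$keyWord <- apply(m, 1, function(hit) paste(words[hit], collapse = ","))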
df
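One caveat: the matching above is by substring, so a word such as "case" would also be found inside "showcase". Below is a minimal sketch of a whole-word variant, which also orders the rows by channel and decreasing response rate as the question describes. The helper name `match_keywords`, the column name `ResponseRate`, and the small sample data are hypothetical and only for illustration. The sketch assumes the words are plain alphanumeric strings with no regex metacharacters.

```r
# Hypothetical helper: for each subject, return the comma-separated words
# that occur in it as whole words (case-insensitive).
match_keywords <- function(subjects, words) {
  words <- trimws(as.character(words))
  vapply(as.character(subjects), function(s) {
    hit <- vapply(words, function(w) {
      # \\b anchors the pattern at word boundaries,
      # so "case" does not match inside "showcase"
      grepl(paste0("\\b", w, "\\b"), s, ignore.case = TRUE, perl = TRUE)
    }, logical(1))
    paste(words[hit], collapse = ",")
  }, character(1), USE.NAMES = FALSE)
}

# Small illustrative sample, not the full campaign data
campaigns <- data.frame(
  Subject = c("Buy Stunning Phone A", "Special Emi OFFER", "Latest phone showcase"),
  ResponseRate = c(81, 88, 73),
  Channel = c("e-mail", "e-mail", "SMS"),
  stringsAsFactors = FALSE
)
term_words <- c("phone", "latest", "case")

campaigns$keyWord <- match_keywords(campaigns$Subject, term_words)

# Group rows by channel, highest response rate first within each channel
result <- campaigns[order(campaigns$Channel, -campaigns$ResponseRate), ]
print(result)
```

The ordering step uses base R `order` to keep the sketch free of dependencies; `dplyr::arrange(Channel, desc(ResponseRate))` could be used instead if dplyr is already loaded.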