Word cloud with unique ID
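I have a dataset with 2 columns: a unique ID and a comment.
I can build a word cloud from the comments alone, but I would like to keep each text's unique ID so that I can join the results back when I visualize them in Tableau.
For example: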
ID | Text
a1 This is a test comment.
a2 Another test comment.
a3 This is very good
a4 I like this.
The output I want is:
ID | Words
--
a1 This
a1 is
a1 a
a1 test
a1 comment
a2 Another
a2 test
a2 comment
a3 This
a3 is
a3 very
a3 good.
I hope the example makes sense.
Thanks,
J
You can do it like this:
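library(tidyverse)  # provides tribble()

df <- tribble(
  ~ID,  ~Text,
  "a1", "This is a test comment.",
  "a2", "Another test comment.",
  "a3", "This is very good",
  "a4", "I like this."
)

# Split each comment on single spaces (punctuation stays attached to the word)
split_data <- strsplit(df$Text, " ")

# Pair every word with the ID of its row; the result is a character matrix.
# Iterating over rows rather than unique IDs keeps this correct if an ID repeats.
do.call(rbind,
  lapply(seq_len(nrow(df)), function(x) {
    cbind(ID = rep(df$ID[x], length(split_data[[x]])), Words = split_data[[x]])
  })
)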
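The same reshaping can also be sketched in base R without a per-row loop. This is only a minimal sketch, not the answerer's code: it assumes the sample data from the question, strips punctuation before splitting (the desired output lists `comment` without its trailing period), and uses `long` as a made-up name for the result.

```r
# Sample data from the question
df <- data.frame(
  ID = c("a1", "a2", "a3", "a4"),
  Text = c("This is a test comment.", "Another test comment.",
           "This is very good", "I like this."),
  stringsAsFactors = FALSE
)

# Drop punctuation, then split each comment on runs of whitespace
words <- strsplit(gsub("[[:punct:]]", "", df$Text), "[[:space:]]+")

# Repeat each ID once per word in its comment and flatten the word list
long <- data.frame(
  ID = rep(df$ID, lengths(words)),
  Words = unlist(words),
  stringsAsFactors = FALSE
)

print(long)
```

Because `rep()` and `lengths()` work on the whole list at once, the IDs and words stay aligned row by row. Comments that start with whitespace would yield an empty first word, so real data may need `trimws()` first.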
> df <- read.table(text='ID Text
+ a1 "This is a test comment"
+ a2 "Another test comment"
+ a3 "This is very good"
+ a4 "I like this"', header=TRUE, as.is=TRUE)
>
>
> library(data.table)
> dt = data.table(df)
> dt[,c(Words=strsplit(Text, " ", fixed = TRUE)), by = ID]
ID Words
1: a1 This
2: a1 is
3: a1 a
4: a1 test
5: a1 comment
6: a2 Another
7: a2 test
8: a2 comment
9: a3 This
10: a3 is
11: a3 very
12: a3 good
13: a4 I
14: a4 like
15: a4 this
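For the Tableau step mentioned in the question, one possible follow-up is to count each word per ID and export the result as a CSV that still carries the ID column. The sketch below uses base R and makes several assumptions that are not in the original thread: words are lower-cased so that `This` and `this` count together, and `counts`, `n`, and the file name `words_by_id.csv` are illustrative names only.

```r
# Sample data from the question (without trailing periods, as in the answer above)
df <- data.frame(
  ID = c("a1", "a2", "a3", "a4"),
  Text = c("This is a test comment", "Another test comment",
           "This is very good", "I like this"),
  stringsAsFactors = FALSE
)

# Lower-case, split on single spaces, and build the long ID/word table
words <- strsplit(tolower(df$Text), " ", fixed = TRUE)
long <- data.frame(
  ID = rep(df$ID, lengths(words)),
  Words = unlist(words),
  stringsAsFactors = FALSE
)

# Count occurrences of each word within each ID
long$n <- 1L
counts <- aggregate(n ~ ID + Words, data = long, FUN = sum)

# Write to a temporary location; the file name is a placeholder
out_file <- file.path(tempdir(), "words_by_id.csv")
write.csv(counts, out_file, row.names = FALSE)
```

Keeping `ID` in the exported file is what allows the word table to be joined back to the original comments on that column.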