Combine the result of the function on a row in one column

I have a large data.table in which one column contains text. Here is a simple example:

library(data.table)
x <- data.table(text = c("This is the first text", "Second text"))

I would like to get a data.table with a single column containing all the words from all the texts. Here is my attempt:

x
                     text
1: This is the first text
2:            Second text

x[, strsplit(text, " ")]

This results in:

      V1     V2
1:  This Second
2:    is   text
3:   the Second
4: first   text
5:  text Second

The result I would like to get is:

   text
1: This 
2: is
3: the
4: first
5: text
6: Second
7: text

You are looking for:

data.table(text=unlist(strsplit(x$text, " ")))

#     text
#1:   This
#2:     is
#3:    the
#4:  first
#5:   text
#6: Second
#7:   text
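
The original attempt produced two columns because strsplit() returns a list with one character vector per input row, and data.table turns each list element into its own column (recycling the shorter one). Flattening that list with unlist() is what yields a single column. A minimal sketch of the same idea written inside data.table's j expression follows; the names pieces and words are illustrative only, and fixed = TRUE is an added assumption that the separator is a literal space rather than a regular expression:

```r
library(data.table)

x <- data.table(text = c("This is the first text", "Second text"))

# strsplit() returns a list: one character vector per row of x
pieces <- strsplit(x$text, " ", fixed = TRUE)
lengths(pieces)  # number of words found in each row

# Flatten the list inside j so the result is a one-column data.table
words <- x[, .(text = unlist(strsplit(text, " ", fixed = TRUE)))]
words
```

This should give the same seven-row table as the data.table(text = unlist(...)) call above, without referring to x$text from outside the table.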

As @Henrik mentioned in the comments, you can use cSplit from the splitstackshape package for this task:

library(splitstackshape)
cSplit(x, "text", sep = " ", direction = "long")

which gives:

#     text
#1:   This
#2:     is
#3:    the
#4:  first
#5:   text
#6: Second
#7:   text

You can also create a column that identifies which original sentence each word came from:

library(dplyr)  # provides %>% and n()
x %>% dplyr::mutate(n = 1:n()) %>% cSplit(., "text", " ", "long")

which gives:

#     text n
#1:   This 1
#2:     is 1
#3:    the 1
#4:  first 1
#5:   text 1
#6: Second 2
#7:   text 2
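
If you would rather not pull in dplyr or splitstackshape, the sentence identifier can likely be built with data.table alone. The sketch below is one possible approach, not the method from the answers above: it numbers the rows with .I on a copy of x and then splits within each group. The names y and words_by_row are illustrative, and fixed = TRUE again assumes a literal space separator:

```r
library(data.table)

x <- data.table(text = c("This is the first text", "Second text"))

# Work on a copy so x itself is not modified by reference;
# .I holds the row numbers, so n identifies the original sentence
y <- copy(x)[, n := .I]

# Split each sentence within its own group; n is carried along by `by`
words_by_row <- y[, .(text = unlist(strsplit(text, " ", fixed = TRUE))), by = n]
words_by_row
```

Note that grouping columns come first in the result, so the column order here is n, text rather than text, n as in the cSplit output.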