Error tokenizing with nltk on array data from an Excel file: sequence item 0: expected str instance, list found

Question

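I'm having a problem with this code, maybe someone can help. The text data is the series ['hadis'] from an Excel file, and it loads and displays successfully.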

train['hadis'] = train['hadis'].apply(lambda x: " ".join([nltk.tokenize.word_tokenize(x) for x in x.split()]))
train['hadis'].head()

TypeError: sequence item 0: expected str instance, list found

The result I expect is each row of data tokenized.

Answer 1

Instead of

lambda x: " ".join([nltk.tokenize.word_tokenize(x) for x in x.split()])

use

lambda x: " ".join(nltk.tokenize.word_tokenize(x))
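
To see why the original line fails: word_tokenize returns a list of tokens, so calling it once per word inside the list comprehension builds a list of lists, and str.join only accepts strings. Here is a minimal sketch of both versions. Note that tokenize below is a hypothetical stand-in written with re so the snippet runs without NLTK or its data files; it is not NLTK's tokenizer and its output may differ from word_tokenize on real text.

```python
import re


def tokenize(text):
    """Hypothetical stand-in for nltk.tokenize.word_tokenize.

    Like word_tokenize, it returns a list of token strings.
    """
    return re.findall(r"\w+|[^\w\s]", text)


row = "Hello, world!"

# Pattern from the question: tokenize each whitespace-separated word.
# Each call returns a list, so the result is a list of lists.
nested = [tokenize(word) for word in row.split()]

try:
    " ".join(nested)  # join expects str items, but item 0 is a list
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Pattern from the answer: tokenize the whole string once,
# then join the flat list of token strings.
fixed = " ".join(tokenize(row))

print(nested)
print(error_message)
print(fixed)
```

With the real tokenizer the same shape applies: pass the whole cell to nltk.tokenize.word_tokenize(x) and join that single flat list. If you want to keep the tokens as a list per row rather than a space-joined string, you can drop the " ".join(...) entirely.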
