How do I convert the output of a nested for loop into a list in Python?

Question

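I'm new to Python, so apologies for the basic question. I'm trying to match columns of keywords against a list of texts. If a keyword can be found in a text, it should be appended to the spreadsheet, which currently ends with the 'Engagement' column.

I currently get the following error message on the second line of the for loop: TypeError: 'in <string>' requires string as left operand, not float

What is wrong with my code, and how should I correct it? Thanks.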

import pandas as pd

df_rawdata = pd.read_excel(r'test.xlsx', sheet_name='rawdata')
my_rawdatalist = df_rawdata['Text'].tolist()

df_all_words = pd.read_excel(r'test.xlsx', sheet_name='pet_dict')

keywords_list = set(df_all_words['Animals'].tolist() + df_all_words['Cities'].tolist())

matchlist = []

for rawdata in my_rawdatalist:
    # The TypeError is raised on the next line.
    matches = [keyword for keyword in keywords_list if keyword in rawdata]
    matchlist.append("|".join(matches))

print(matchlist)

Answer 1

I don't really understand why you want an empty string in there, but maybe this helps you:

Answer 2

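I think a list comprehension could simplify this considerably. Note that it also lets you handle phrases that contain more than one keyword: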

my_rawdatalist = [
    "The cat is out",
    "The zoo is fun",
    "The dog is tired",
    "The dog chases the cat"
]
keywords_list = ["cat", "dog", "NaN"]
matchlist = []

for rawdata in my_rawdatalist:
    matches = [keyword for keyword in keywords_list if keyword in rawdata]
    matchlist.append("|".join(matches))

print(matchlist)

This gives you:

['cat', '', 'dog', 'cat|dog']
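
One caveat about the "NaN" entry above: there it is the string "NaN", but when keyword columns are read from a spreadsheet with pandas, empty cells come back as the float nan, and testing a float with `in` against a string raises the TypeError from the question. A minimal sketch, assuming that is where the float comes from, which filters out non-string keywords before matching (the nan value is written by hand here to stand in for an empty cell):

```python
my_rawdatalist = [
    "The cat is out",
    "The zoo is fun",
    "The dog is tired",
    "The dog chases the cat",
]

# float("nan") stands in for an empty spreadsheet cell (an assumption
# about where the float in the error message comes from).
raw_keywords = ["cat", "dog", float("nan")]

# Keep only real strings so that `keyword in rawdata` never sees a float.
keywords_list = [keyword for keyword in raw_keywords if isinstance(keyword, str)]

matchlist = []
for rawdata in my_rawdatalist:
    matches = [keyword for keyword in keywords_list if keyword in rawdata]
    matchlist.append("|".join(matches))

print(matchlist)
```

With the float removed, the loop runs the same way as in the example above.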

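If you have "a lot" of keywords, you could convert keywords_list to a set(). This removes duplicate keywords, so each one is checked only once per text. Note that the loop still tests every keyword against every text, so the set does not speed up the substring checks themselves.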

keywords_list = set(["cat", "dog", "NaN"])
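
As a rough illustration of what the set conversion does: it drops duplicate keywords, but sets have no guaranteed order. The duplicated entries and the use of sorted() below are assumptions added for illustration, not part of the code above.

```python
# A keyword list with duplicates, e.g. the same word listed in two columns.
raw_keywords = ["cat", "dog", "cat", "TV", "dog"]

# set() drops the duplicates; sorted() gives a reproducible order
# (uppercase letters sort before lowercase ones).
keywords_list = sorted(set(raw_keywords))

print(keywords_list)
```

Sorting is optional; it only matters if you want the matched keywords to appear in the same order on every run.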

If you have multiple columns of keywords (if I understand what you are saying), then I would add each column to the set.

keywords_list = set(
    ["cat", "dog", "NaN"] ## keywords from column A
    + ["Person", "Woman", "Man", "Camera", "TV"] ## keywords from column B
)

The code should continue to work:

my_rawdatalist = [
    "The cat is out",
    "The zoo is fun",
    "The dog is tired",
    "The dog chases the cat on TV"
]

keywords_list = set(
    ["cat", "dog", "NaN"] ## keywords from column A
    + ["Person", "Woman", "Man", "Camera", "TV"] ## keywords from column B
)

matchlist = []

for rawdata in my_rawdatalist:
    matches = [keyword for keyword in keywords_list if keyword in rawdata]
    matchlist.append("|".join(matches))

print(matchlist)

This gives you (the order of the keywords within each entry may vary, because sets are unordered):

['cat', '', 'dog', 'dog|cat|TV']
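
To tie this back to the spreadsheet in the question, here is a hedged sketch using pandas. The two DataFrames are built in memory as stand-ins for the 'rawdata' and 'pet_dict' sheets, the sample rows are invented, and the 'Matches' column name is hypothetical. It assumes the float in the error message comes from empty cells in the keyword columns, which pandas represents as NaN, so dropna() is applied before building the set.

```python
import pandas as pd

# Stand-ins for the 'rawdata' and 'pet_dict' sheets (invented sample rows).
df_rawdata = pd.DataFrame({
    "Text": ["The cat is out", "The zoo is fun", "The dog chases the cat in Paris"],
    "Engagement": [10, 3, 7],
})
df_all_words = pd.DataFrame({
    "Animals": ["cat", "dog", "bird"],
    # A shorter column: the empty cells show up as NaN (a float).
    "Cities": ["Paris", float("nan"), float("nan")],
})

# dropna() removes the empty cells so only strings reach the `in` test.
keywords_list = set(
    df_all_words["Animals"].dropna().tolist()
    + df_all_words["Cities"].dropna().tolist()
)

matchlist = []
for rawdata in df_rawdata["Text"].astype(str):
    # sorted() keeps the output order stable, since sets are unordered.
    matches = sorted(keyword for keyword in keywords_list if keyword in rawdata)
    matchlist.append("|".join(matches))

# 'Matches' is a hypothetical name for the new column after 'Engagement'.
df_rawdata["Matches"] = matchlist
print(df_rawdata)
```

From there, something like DataFrame.to_excel could write the result back to a file, depending on how you want to store it.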

nested-loops

python-3.x