How to check if input is in the list

Question

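I'm trying to build a suggestion tool based on a movie dataset. More specifically, it should recommend a movie by title based on a genre keyword.

However, I can't get past the loop/check part of the script. This is what I've tried: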

import nltk
import pandas as pd
from nltk.tokenize import word_tokenize
import random

#CSV READ & GENRE-TITLE
data = pd.read_csv("data.csv")

df_title = data["title"]
df_genre = data["genre"]

#TOKENIZE
tokenized_genre = [word_tokenize(i) for i in df_genre]

choice = {}

while choice != "exit":
    choice = input("Please enter a word = ")
    for word in {choice}:
        if word in df_genre:
            """The random title of the random adventure movie will be implemented here"""  
        else:
            print("The movie of the genre doesn't exist")

The output of tokenized_genre looks like this:

[['Biography', ',', 'Crime', ',', 'Drama'],
 ['Drama'], ['Drama', ',', 'History'],
 ['Adventure', ',', 'Drama', ',', 'Fantasy'],
 ['Biography', ',', 'Drama'],
 ['Biography', ',', 'Drama', ',', 'Romance']

The output of the loop:

Please enter a word = adventure
The movie of the genre doesn't exist
Please enter a word = Adventure
The movie of the genre doesn't exist

I'm guessing the error is in the tokenized list, but I can't work it out.

Answer 1

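Maybe I'm wrong; I'm no Python expert.

The genre data seems to be a "list of lists" rather than a flat list. You should join the inner lists into a single list and then search in that.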

import itertools

df_genre = [['Biography', ',', 'Crime', ',', 'Drama'], ['Drama'], ['Drama', ',', 'History'], ['Adventure', ',', 'Drama', ',', 'Fantasy'], ['Biography', ',', 'Drama'], ['Biography', ',', 'Drama', ',', 'Romance']]

# Flatten the list of lists into a single list of tokens
joined_list = list(itertools.chain.from_iterable(df_genre))

choice = {}

while choice != "exit":
    choice = input("Please enter a word = ")
    for word in {choice}:
        if word in joined_list:
            """The random title of the random adventure movie will be implemented here"""  
            print("Works!")
        else:
            print("The movie of the genre doesn't exist")
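
To illustrate why flattening matters, here is a minimal sketch using plain Python lists. The sample data is hypothetical and only mirrors the tokenized output shown in the question. As far as I know, `in` on a pandas Series (such as df_genre in the question) tests the index labels rather than the values, which would also explain why the original check never matched.

```python
# Hypothetical sample data mirroring the tokenized output in the question.
tokenized_genre = [
    ["Biography", ",", "Crime", ",", "Drama"],
    ["Drama"],
    ["Adventure", ",", "Drama", ",", "Fantasy"],
]

# Membership on the outer list compares the word against whole inner lists,
# so a single genre word never matches.
print("Adventure" in tokenized_genre)  # False

# Flatten first; then the same membership test succeeds.
flat = [token for tokens in tokenized_genre for token in tokens]
print("Adventure" in flat)  # True
print("adventure" in flat)  # False: the comparison is case-sensitive
```

Note that the flattened list is still case-sensitive, so "adventure" would need to be normalised before it can match "Adventure".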


I don't know if this is what you're looking for. Hope it helps.

Answer 2

You can use:

# Build a set of every lower-cased token across all genre lists
search = {token.lower() for tokens in tokenized_genre for token in tokens}
choice = input("Please enter a word = ")
while choice != "exit":
    if choice.lower() in search:
        # TODO: The random title of the random adventure movie will be implemented here
        print("Works!")
    else:
        print("The movie of the genre doesn't exist")
    choice = input("Please enter a word = ")

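search is a set containing all the lower-cased tokens in tokenized_genre. The advantage is that a set lookup takes O(1) time on average. Since your choice variable is a single word, you can check directly whether the entered word is in the search set.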
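
The TODO above (picking a random title for the matched genre) could be sketched roughly as follows. This is only one possible approach, not part of the original answer: the rows, `build_index`, and `recommend` are hypothetical names, the data stands in for the "title" and "genre" columns of data.csv, and the genre string is split on commas with plain `str.split` instead of nltk's `word_tokenize`.

```python
import random

# Hypothetical rows standing in for the "title" and "genre" columns of data.csv.
movies = [
    ("Movie A", "Biography, Crime, Drama"),
    ("Movie B", "Drama"),
    ("Movie C", "Adventure, Drama, Fantasy"),
    ("Movie D", "Biography, Drama, Romance"),
]

def build_index(rows):
    """Map each lower-cased genre to the list of titles tagged with it."""
    index = {}
    for title, genres in rows:
        for genre in genres.split(","):
            index.setdefault(genre.strip().lower(), []).append(title)
    return index

def recommend(word, index):
    """Return a random title for the genre, or None if the genre is unknown."""
    titles = index.get(word.strip().lower())
    return random.choice(titles) if titles else None

index = build_index(movies)
print(recommend("adventure", index))  # Movie C (the only adventure title here)
print(recommend("Western", index))    # None
```

Keeping a dictionary from genre to titles means the lookup stays a constant-time hash lookup on average, like the set above, while also giving you the titles to choose from.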
