Search GET request parameter in multiple columns of a CSV file

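I have this Flask API where users can perform a GET request with a name they enter. The problem is that I want to search for that name in two different columns, but I'm not sure how to do it. The following does not work, and Flask says 'cannot index with multidimensional key':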

data = self.data.loc[self.data[['name-english','name_greek']] == name_cap].to_dict()

This is the part I'm talking about:

class Search(Resource):
   def __init__(self):
       self.data = pd.read_csv('datacsv')

   def get(self, name):
       name_cap = name.capitalize()
       data = self.data.loc[self.data['name-english'] == name_cap].to_dict()
       # return data found in csv
       return jsonify({'message': data})

So I want to search in both columns instead of only one.

It seems the problem is with your pandas DataFrame syntax rather than with Flask itself. You are probably getting this error from pandas:

ValueError: cannot index with multidimensional key

According to the pandas documentation:

.loc[] is primarily label based, but may also be used with a boolean array.

Allowed inputs are:

  • A single label, e.g. 5 or 'a', (note that 5 is interpreted as a label of the index, and never as an integer position along the index).

  • A list or array of labels, e.g. ['a', 'b', 'c'].

  • A slice object with labels, e.g. 'a':'f'.

  • A boolean array of the same length as the axis being sliced, e.g. [True, False, True].

  • An alignable boolean Series. The index of the key will be aligned before masking.

  • An alignable Index. The Index of the returned selection will be the input.

  • A callable function with one argument (the calling Series or DataFrame) and that returns valid output for indexing (one of the above)

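In your example, you pass self.data[['name-english','name_greek']] == name_cap as the argument to loc. That comparison returns another DataFrame (one boolean column per selected column), not a one-dimensional array of True and False values or a boolean Series.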
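To make this concrete, here is a minimal sketch that reproduces the situation. The DataFrame contents are made up for illustration; only the two column names come from the question.

```python
import pandas as pd

# Hypothetical stand-in for the CSV data; only the column names
# are taken from the question.
df = pd.DataFrame({
    "name-english": ["Athens", "Sparta", "Corinth"],
    "name_greek": ["Athina", "Sparti", "Korinthos"],
})

# Comparing a two-column selection with a scalar yields a 2-D
# boolean DataFrame, one column per selected column.
mask_2d = df[["name-english", "name_greek"]] == "Sparti"
print(type(mask_2d).__name__, mask_2d.shape)

# .loc rejects a 2-D key, which is where the ValueError comes from.
try:
    df.loc[mask_2d]
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Collapsing the mask to one boolean per row gives a key .loc accepts.
mask_1d = mask_2d.any(axis=1)
print(df.loc[mask_1d])
```

The key point is the shape of the mask: .loc needs one boolean per row, so the two-column result has to be reduced along the columns before it can be used.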

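To filter a DataFrame on multiple columns, you can combine per-column conditions with bitwise operators (e.g. & and |), wrapping each condition in parentheses: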

df.loc[(df["A"] == 1) | (df["B"] == 1)]
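Applied to the two columns from the question, this could look roughly like the sketch below. The sample data and the helper name search_name are hypothetical, and orient="records" is my own choice to get one dict per matching row; the original code calls to_dict() with its defaults.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "name-english": ["Athens", "Sparta", "Corinth"],
    "name_greek": ["Athina", "Sparti", "Korinthos"],
})


def search_name(frame, name):
    """Return the rows where either name column equals the capitalized name."""
    name_cap = name.capitalize()
    # Each comparison gives a boolean Series; | combines them row by row.
    matches = frame.loc[
        (frame["name-english"] == name_cap) | (frame["name_greek"] == name_cap)
    ]
    # One dict per matching row (an assumption, not the original output shape).
    return matches.to_dict(orient="records")


print(search_name(df, "sparti"))
print(search_name(df, "athens"))
```

The parentheses around each comparison matter, because | binds more tightly than == in Python.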

Or use the built-in isin() method:

Whether each element in the DataFrame is contained in values.

Returns: DataFrame DataFrame of booleans showing whether each element in the DataFrame is contained in values.

together with any():

Return whether any element is True, potentially over an axis.

Returns: Series or DataFrame If level is specified, then, DataFrame is returned; otherwise, Series is returned.

That way you can pass a boolean Series as the argument to .loc, for example:

df.loc[df.isin([1]).any(axis=1)]
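For the name search in the question, a sketch of the same idea restricted to the two name columns might look like this. The sample rows are invented, and the axis is passed by keyword because recent pandas versions no longer accept it positionally in any().

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "name-english": ["Athens", "Sparta", "Corinth"],
    "name_greek": ["Athina", "Sparti", "Korinthos"],
})

name_cap = "athina".capitalize()
columns = ["name-english", "name_greek"]

# isin() gives a boolean DataFrame for the selected columns;
# any(axis=1) reduces it to one boolean per row.
row_mask = df[columns].isin([name_cap]).any(axis=1)

result = df.loc[row_mask]
print(result.to_dict())
```

Selecting the columns first keeps the search from matching the name in unrelated columns of the CSV.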

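Also, something that always helps me when working with DataFrames is to test things in Jupyter first. I find it faster, and you can experiment more with the DataFrame to discover new ways of doing what you need.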