Search GET request parameter in multiple columns of a CSV file
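I have this Flask API where users can make a GET request with a name they enter. The problem is that I want to search for that name in two different columns, but I'm not sure how to do it. This attempt doesn't work, and Flask says 'cannot index with multidimensional key':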
data = self.data.loc[self.data[['name-english','name_greek']] == name_cap].to_dict()
This is the part in question:
class Search(Resource):
    def __init__(self):
        self.data = pd.read_csv('datacsv')

    def get(self, name):
        name_cap = name.capitalize()
        data = self.data.loc[self.data['name-english'] == name_cap].to_dict()
        # return data found in csv
        return jsonify({'message': data})
So I want to search in both columns instead of just one.
The problem seems to be with your pandas DataFrame syntax, not with Flask itself. You are probably getting this error from pandas:
ValueError: cannot index with multidimensional key
According to the pandas documentation:
.loc[] is primarily label based, but may also be used with a boolean
array.
Allowed inputs are:
A single label, e.g. 5 or 'a', (note that 5 is interpreted as a label
of the index, and never as an integer position along the index).
A list or array of labels, e.g. ['a', 'b', 'c'].
A slice object with labels, e.g. 'a':'f'.
A boolean array of the same length as the axis being sliced, e.g.
[True, False, True].
An alignable boolean Series. The index of the key will be aligned
before masking.
An alignable Index. The Index of the returned selection will be the
input.
A callable function with one argument (the calling Series or
DataFrame) and that returns valid output for indexing (one of the
above)
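In your example, you are passing self.data[['name-english','name_greek']] == name_cap as the argument to loc. That comparison returns another DataFrame (two-dimensional), not an array of True and False values or a boolean Series.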
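As a quick check of the shapes involved, here is a minimal sketch on made-up sample data. The rows are hypothetical; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample rows; only the column names come from the question.
df = pd.DataFrame({
    "name-english": ["Athens", "Sparta", "Corinth"],
    "name_greek": ["Athina", "Sparti", "Korinthos"],
})

# Comparing a two-column selection gives a boolean DataFrame (2-D),
# which is the "multidimensional key" that .loc rejects.
mask_2d = df[["name-english", "name_greek"]] == "Sparti"

# Comparing a single column gives a boolean Series (1-D), which .loc accepts.
mask_1d = df["name_greek"] == "Sparti"

print(type(mask_2d).__name__, mask_2d.shape)
print(type(mask_1d).__name__, mask_1d.shape)
```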
To filter a DataFrame on multiple columns, you can combine the conditions with the bitwise operators (such as & and |):
df.loc[(df["A"] == 1) | (df["B"] == 1)]
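Applied to the two columns from the question, that could look roughly like the sketch below. The sample rows and the search value are made up for illustration, and each comparison needs its own parentheses because | binds more tightly than ==.

```python
import pandas as pd

# Hypothetical sample rows; only the column names come from the question.
df = pd.DataFrame({
    "name-english": ["Athens", "Sparta", "Corinth"],
    "name_greek": ["Athina", "Sparti", "Korinthos"],
})

name_cap = "sparti".capitalize()  # stands in for the request parameter

# Each comparison is a boolean Series; | combines them element-wise.
matches = df.loc[(df["name-english"] == name_cap) | (df["name_greek"] == name_cap)]

# orient="records" gives one dict per matching row.
print(matches.to_dict(orient="records"))
```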
Or use the isin() method:
Whether each element in the DataFrame is contained in values.
Returns: DataFrame
DataFrame of booleans showing whether each element in the DataFrame is contained in values.
together with any():
Return whether any element is True, potentially over an axis.
Returns: Series or DataFrame
If level is specified, then, DataFrame is returned; otherwise, Series is returned.
That way you can pass a boolean Series as the argument to .loc, for example:
df.loc[df.isin([1]).any(axis=1)]
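For the case in the question, the same idea can be limited to the two name columns so that other columns are not searched. This is only a sketch: the sample rows, the extra id column, and the search value are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample rows; "id" is an extra made-up column that should not be searched.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "name-english": ["Athens", "Sparta", "Corinth"],
    "name_greek": ["Athina", "Sparti", "Korinthos"],
})

name_cap = "athens".capitalize()  # stands in for the request parameter
cols = ["name-english", "name_greek"]

# isin() returns a boolean DataFrame; any(axis=1) reduces it to one
# boolean per row, i.e. a Series that .loc accepts.
row_mask = df[cols].isin([name_cap]).any(axis=1)
matches = df.loc[row_mask]

print(matches)
```

Passing axis=1 by keyword keeps the call working on recent pandas versions, which no longer accept the axis as a positional argument to any().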
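Also, something that always helps me when working with DataFrames is to test things out in Jupyter first. I find it faster, and you can experiment more with the DataFrame to discover new ways of doing what you need.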