Passing operators as functions to use with Pandas data frames

Question

I am selecting data from a series based on a threshold.

>>> s = pd.Series(np.random.randn(5))
>>> s
0   -0.308855
1   -0.031073
2    0.872700
3   -0.547615
4    0.633501
dtype: float64
>>> cfg = {'threshold' : 0 , 'op' : 'less' }
>>> ops = {'less' : '<', 'more': '>' , 'equal': '==' , 'not equal' : '!='}
>>> ops[cfg['op']]
'<'
>>> s[s < cfg['threshold']]
0   -0.308855
1   -0.031073
3   -0.547615
dtype: float64

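I would like to use ops[cfg['op']] in the last line of code instead of '<'. I am willing to change the keys and values of the ops dict if needed (for example, -lt instead of <). How can this be done?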

Answer 1

Define a dictionary of functions that can stand in for your operators.

import operator

# Map each config keyword to the matching comparison function.
d = {
    'more': operator.gt,
    'less': operator.lt,
    'equal': operator.eq,
    'not equal': operator.ne,
}

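Now just index into your dictionary and apply the function to your arguments. (The output below comes from a different random draw of s than the one shown in the question, so the values differ.)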

m = d[cfg['op']](s, cfg['threshold'])
m

0    False
1     True
2     True
3    False
4    False
dtype: bool

s[m]

1   -0.262054
2   -1.300810
dtype: float64

Here,

d[cfg['op']](s, cfg['threshold'])

translates to

operator.lt(s, 0)
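
The lookup above can be wrapped in a small helper. This is a minimal sketch rather than part of the original answer: the names OPS and filter_series are hypothetical, and the sample values are copied from the question's series so that the result is reproducible.

```python
import operator

import pandas as pd

# Map the config keywords to functions from the standard operator module.
OPS = {
    'more': operator.gt,
    'less': operator.lt,
    'equal': operator.eq,
    'not equal': operator.ne,
}


def filter_series(s, cfg):
    """Return the elements of s that satisfy cfg['op'] against cfg['threshold'].

    filter_series is a hypothetical helper name used only for illustration.
    """
    try:
        compare = OPS[cfg['op']]
    except KeyError:
        raise ValueError('unknown op: {!r}'.format(cfg['op']))
    mask = compare(s, cfg['threshold'])  # boolean Series aligned with s
    return s[mask]


# Sample values copied from the question's series.
s = pd.Series([-0.308855, -0.031073, 0.872700, -0.547615, 0.633501])
result = filter_series(s, {'threshold': 0, 'op': 'less'})
print(result)
```

Raising a ValueError for an unknown keyword is a design choice made here so that a typo in the config fails loudly; a bare dictionary lookup would raise a KeyError instead.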

Answer 2

I'm all about @cᴏʟᴅsᴘᴇᴇᴅ's answer and @Zero's linked Q&A...
But here is an alternative with numexpr

import numexpr as ne

s[ne.evaluate('s {} {}'.format(ops[cfg['op']], cfg['threshold']))]

0   -0.308855
1   -0.031073
3   -0.547615
Name: A, dtype: float64
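
A variation on the numexpr call, offered as a sketch under stated assumptions rather than as the answerer's code: the data and the threshold are passed through the local_dict argument of ne.evaluate instead of being formatted into the expression string, so only the operator symbol (taken from the fixed ops dict) ends up in the string. The helper name select_with_numexpr and the variable names x and t are hypothetical.

```python
import numexpr as ne
import pandas as pd

ops = {'less': '<', 'more': '>', 'equal': '==', 'not equal': '!='}


def select_with_numexpr(s, cfg):
    """Filter s using a comparison evaluated by numexpr.

    select_with_numexpr is a hypothetical helper name used for illustration.
    """
    symbol = ops[cfg['op']]  # raises KeyError for an unknown keyword
    expr = 'x {} t'.format(symbol)
    # Pass the operands explicitly instead of relying on the caller's locals.
    mask = ne.evaluate(expr, local_dict={'x': s.to_numpy(), 't': cfg['threshold']})
    return s[mask]  # mask is a boolean NumPy array with the same length as s


# Sample values copied from the question's series.
s = pd.Series([-0.308855, -0.031073, 0.872700, -0.547615, 0.633501])
selected = select_with_numexpr(s, {'threshold': 0, 'op': 'less'})
print(selected)
```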

I reopened this question after it was closed as a duplicate of How to pass an operator to a python function?

That question and its answers are great, and I showed my appreciation with upvotes.

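Asking in the context of a pandas.Series opens the question up to answers that use numpy and numexpr, whereas trying to answer the duplicate target with this answer would be pure nonsense.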


python

conditional

series

dynamic-execution

pandas