How to Evaluate Inequality inside a String in Python 2
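I have a text file (the output of a different process that I cannot change) containing logical comparisons stored as strings (only these three: >, <=, in). Suppose this is one line of my file that should be evaluated: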
myStr = "x>2 and y<=30 and z in ('def', 'abc')"
Some of my variables are categorical, and I specify those; the rest are numeric:
categoricalVars = ('z',)  # trailing comma needed: ('z') is just the string 'z'
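My variable values are stored in a dictionary; let's assume these are their values. Note that they always arrive as strings, even for the numeric variables: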
x, y, z = '5', '6', 'abc'
So my question is how to safely evaluate (i.e. without using eval()) the truth of myStr given the values on the last line.
What I have done so far: first, rewrite myStr to reflect the data types:
import re
delim = r"(>|<=| in )"  # Put in a group to find later which delimiter is used
def pyRules(s):
    varName = re.split(delim, s)[0]
    rest = "".join(re.split(delim, s)[1:])
    if varName in categoricalVars:
        return varName + rest
    else:
        return "float(" + varName + ")" + rest
# Call:
[pyRules(e) for e in myStr.split(' and ')]
# Result:
['float(x)>2', 'float(y)<=30', "z in ('def', 'abc')"]
Now I can easily do:
[eval(pyRules(e)) for e in myStr.split(' and ')]
# Result:
[True, True, True]
But I want to avoid that. I tried ast.literal_eval(), but got the following error:
import ast
[ast.literal_eval(pyRules(e)) for e in myStr.split(' and ')]
# Result:
Traceback (most recent call last):
  File "<ipython-input-556-dae16951de03>", line 1, in <module>
    ast.literal_eval(ast.parse(conds[0]))
  File "C:\ProgramData\Anaconda2\lib\ast.py", line 80, in literal_eval
    return _convert(node_or_string)
  File "C:\ProgramData\Anaconda2\lib\ast.py", line 79, in _convert
    raise ValueError('malformed string')
ValueError: malformed string
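The error is expected: ast.literal_eval() accepts only Python literals (numbers, strings, tuples, lists, dicts and the like), not comparisons or function calls such as float(x)>2. A minimal sketch of the difference; the helper name is_literal is hypothetical and exists only for this illustration:

```python
import ast

def is_literal(text):
    """Return True if ast.literal_eval() accepts `text`, False otherwise."""
    try:
        ast.literal_eval(text)
        return True
    except (ValueError, SyntaxError):
        return False

literal_checks = [
    is_literal("2"),                # a number literal: accepted
    is_literal("('def', 'abc')"),   # a tuple of string literals: accepted
    is_literal("float(5)>2"),       # a comparison containing a call: rejected
    is_literal("x>2"),              # a comparison with a bare name: rejected
]
print(literal_checks)
```

So literal_eval() can vet the individual operands of a rule, but it cannot evaluate the rule as a whole.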
Next, I tried the following, which almost gave me the right answer:
def pyRules(s):
    varName = re.split(delim, s)[0]
    operation = "".join(re.split(delim, s)[1:])
    if varName in categoricalVars:
        return "'{" + varName + "}'" + operation
    else:
        return "float({" + varName + "})" + operation
rules = [pyRules(e).format(x='5',y='6',z='abc') for e in myStr.split(' and ')]
# rules is:
['float(5)>2', 'float(6)<=30', "'abc' in ('def', 'abc')"]
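Since the values actually live in a dictionary, the keyword arguments to str.format() do not have to be typed out by hand; they can be unpacked from it. A small sketch, where the dictionary name values is an assumption (the post only shows the three separate variables):

```python
# Hypothetical dictionary holding the variable values, all as strings.
values = {'x': '5', 'y': '6', 'z': 'abc'}

# Templates in the shape returned by the second pyRules() above.
templates = ["float({x})>2", "float({y})<=30", "'{z}' in ('def', 'abc')"]

# ** unpacks the dictionary into keyword arguments for str.format().
rules = [t.format(**values) for t in templates]
print(rules)
```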
I could use eval() again and get [True, True, True], but to avoid it I defined my own inequality-checker function:
def check(x):
    first, operation, second = re.split(delim, x)
    if operation == ">":
        return first > second
    elif operation == "<=":
        return first <= second
    elif operation == " in ":
        return first in second
# Call:
[check(pyRules(e).format(x='5',y='6',z='abc')) for e in myStr.split(' and ')]
# Result:
[True, False, True]
It fails on the second item, i.e. 'float(6)<=30'.
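The likely reason is that check() compares the two sides as strings rather than as numbers: first is the text 'float(6)' and second is the text '30', and strings are ordered character by character. A short illustration of that effect:

```python
first, second = 'float(6)', '30'

# String comparison is lexicographic: 'f' sorts after '3', so this is False.
as_strings = first <= second

# The first rule only looks right by accident: 'f' also sorts after '2'.
accidental = 'float(5)' > '2'

# The "in" rule is a substring test on text, not tuple membership.
substring = "'abc'" in "('def', 'abc')"

# Converting both sides to numbers gives the intended answer.
as_numbers = float('6') <= float('30')

print([as_strings, accidental, substring, as_numbers])
```

In other words, two of the three True values in the result come out right only by coincidence.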
I also recreated this function with the operator module, following this SO thread; it is essentially the same thing and gave the same result.
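For reference, an operator-based checker along those lines might look like the sketch below. It is not the code from the question: the names OPS and check_with_operator are hypothetical, and unlike the check() above it takes the already-split pieces and converts numeric operands to float, so the comparison is numeric rather than textual.

```python
import ast
import operator

# Map each supported delimiter to a two-argument function.
OPS = {
    ">": operator.gt,
    "<=": operator.le,
    # operator.contains(container, item) takes its arguments in the
    # opposite order from "item in container", hence the lambda.
    " in ": lambda item, container: operator.contains(container, item),
}

def check_with_operator(value, operation, rhs, numeric):
    """Apply `operation` to a string value and the text of the right-hand side."""
    right = ast.literal_eval(rhs)               # only literals are accepted here
    left = float(value) if numeric else value   # numeric values arrive as strings
    return OPS[operation](left, right)

results = [
    check_with_operator('5', ">", "2", numeric=True),
    check_with_operator('6', "<=", "30", numeric=True),
    check_with_operator('abc', " in ", "('def', 'abc')", numeric=False),
]
print(results)
```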
I looked at pyparsing but couldn't get it to work (it even looks scary, look at this!), and at SymPy, but unfortunately it also uses eval a lot, as stated in the hyperlink I provided.
Question 2: Is it OK to use eval, given that I am 100% sure I don't have anything that could mess with os and wipe my disk or do other crazy things like that?
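Note: this is part of a large codebase that I have built in Python 2, so answers based on Python 2 would be ideal; but I could move to Python 3 if someone thinks my answer lies in that realm.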
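After a couple of hours of work, I found a way to make ast.literal_eval() work! My logic is to look at both sides of x>2, i.e. x and 2, make sure they are safe by evaluating both with literal_eval, and then run them through my check() function to do the evaluation. The same goes for z in ('def', 'abc'): first make sure that z and ('def', 'abc') are both safe, then use the check() function to do the actual Boolean check.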
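Since I fully trust my input, I could have taken the simpler eval() route, but I wanted to be doubly cautious. I also hoped to build some code for everyone who has security concerns (user input, etc.) and needs to evaluate logic safely. Hope it helps someone!
Please see my full code below; any comments or recommendations are welcome.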
import re
import ast
myStr = "x>2 and y<=30 and z in ('def', 'abc')"
categoricalVars = ('z',)  # trailing comma needed: ('z') is just the string 'z'
x, y, z = '5', '6', 'abc'
delim = r"(>|<=| in )"  # Put in a group to find in the func check() which delimiter is used
def pyRules(s):
    """
    Place '{}' around variable names so that we can str.format() them before calling check()
    """
    varName = re.split(delim, s)[0]
    rest = "".join(re.split(delim, s)[1:])
    return "'{" + varName + "}'" + rest
def check(x):
    """
    If operation is > or <= then it is a numeric var, use double literal_eval to
    parse numbers, e.g. "'5'" (dual quotes) to 5. For the comparison this is
    equivalent to float(ast.literal_eval(first)). Else it is categorical, just
    literal_eval once
    """
    first, operation, second = re.split(delim, x)
    if operation == ">":
        return ast.literal_eval(ast.literal_eval(first)) > ast.literal_eval(second)
    elif operation == "<=":
        return ast.literal_eval(ast.literal_eval(first)) <= ast.literal_eval(second)
    elif operation == " in ":
        return ast.literal_eval(first) in ast.literal_eval(second)
# These are my raw rules:
print([pyRules(e) for e in myStr.split(' and ')])
# These are my processed rules:
print([pyRules(e).format(x='5',y='6',z='abc') for e in myStr.split(' and ')])
# And these are my final results of logical evaluation:
print([check(pyRules(e).format(x='5',y='6',z='abc')) for e in myStr.split(' and ')])
Output of the three print lines:
["'{x}'>2", "'{y}'<=30", "'{z}' in ('def', 'abc')"]
["'5'>2", "'6'<=30", "'abc' in ('def', 'abc')"]
[True, True, True]
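Another way to get the same result without eval() is to parse the whole line once with ast.parse() and accept only the node shapes the file is known to contain. The sketch below is an alternative, not the code above: the function name evaluate_rule and its parameters are hypothetical, and it assumes every clause has the form "name operator literal", with clauses joined by and.

```python
import ast

def evaluate_rule(rule, values, categorical):
    """Evaluate 'name OP literal' clauses joined by 'and' without eval()."""
    tree = ast.parse(rule, mode="eval").body
    # "a and b and c" parses to one BoolOp; a single clause is a bare Compare.
    if isinstance(tree, ast.BoolOp) and isinstance(tree.op, ast.And):
        clauses = tree.values
    else:
        clauses = [tree]

    results = []
    for clause in clauses:
        well_formed = (
            isinstance(clause, ast.Compare)
            and isinstance(clause.left, ast.Name)
            and len(clause.ops) == 1
        )
        if not well_formed:
            raise ValueError("unsupported clause in rule: %r" % rule)

        name = clause.left.id
        # literal_eval also accepts an AST node and rejects non-literals.
        right = ast.literal_eval(clause.comparators[0])
        left = values[name] if name in categorical else float(values[name])

        op = clause.ops[0]
        if isinstance(op, ast.Gt):
            results.append(left > right)
        elif isinstance(op, ast.LtE):
            results.append(left <= right)
        elif isinstance(op, ast.In):
            results.append(left in right)
        else:
            raise ValueError("unsupported operator in rule: %r" % rule)
    return results

values = {'x': '5', 'y': '6', 'z': 'abc'}
outcome = evaluate_rule("x>2 and y<=30 and z in ('def', 'abc')", values, ('z',))
print(outcome)
```

Anything that is not a plain comparison against a literal, such as a function call on the left-hand side, raises ValueError instead of being executed.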
Thanks!