Call Python or Lua from C++ to evaluate an expression, calculating unknown variables only if needed

Question

I have an expression/formula like this:

 std::string expr="((A>0) && (B>5 || C > 10))";

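I did some research, and it seems that if the values of A, B and C are known, then by embedding Lua or Python in a C++ program I can use an eval function that substitutes A, B and C and returns true or false.

But what happens when I don't know all the values? Suppose A is known and is -1. If A is -1, the formula evaluates to "false" regardless of the values of B or C.

Can I evaluate the formula without knowing all the variables in advance? For example, if A is 10, it makes sense to look up the value of B and evaluate again. How can we solve this? Ideas?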

Answer 1

So, from what I understand of your question, you want something like this:

if (A>0) {
  B = getB();
  C = getC();
  if (B>23 || C==11)
    explode();
}

In other words, your expression has to be split up so that each part only uses values that are already known.

Answer 2

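One approach is to parse the expression into a tree and evaluate the tree. Subexpressions for which all variables are known are fully evaluated. The effect is to simplify the tree.

In your example, the tree has && at the top with two subtrees, the left one being the tree for A>0. To evaluate the tree we evaluate the left subtree first; with A equal to -1 it yields false, so we don't need to evaluate the right subtree, because the operator is &&. The whole tree evaluates to false.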
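The tree idea can be prototyped in Python with the standard ast module before committing to a C++ parser. This is only a minimal sketch under stated assumptions: the names UNKNOWN, partial_eval and evaluate are made up for illustration, only single comparisons and and/or are supported, and && and || are rewritten textually before parsing.

```python
import ast
import operator

UNKNOWN = object()  # sentinel meaning "cannot be decided with the known values"

COMPARE = {ast.Gt: operator.gt, ast.GtE: operator.ge, ast.Lt: operator.lt,
           ast.LtE: operator.le, ast.Eq: operator.eq, ast.NotEq: operator.ne}

def partial_eval(node, known):
    """Evaluate an ast node; return a value, or UNKNOWN if variables are missing."""
    if isinstance(node, ast.Expression):
        return partial_eval(node.body, known)
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.Name):
        return known.get(node.id, UNKNOWN)
    if isinstance(node, ast.Compare) and len(node.ops) == 1:
        left = partial_eval(node.left, known)
        right = partial_eval(node.comparators[0], known)
        if left is UNKNOWN or right is UNKNOWN:
            return UNKNOWN
        return COMPARE[type(node.ops[0])](left, right)
    if isinstance(node, ast.BoolOp):
        is_and = isinstance(node.op, ast.And)
        saw_unknown = False
        for child in node.values:
            value = partial_eval(child, known)
            if value is UNKNOWN:
                saw_unknown = True
            elif bool(value) != is_and:
                # a false operand decides "and"; a true operand decides "or"
                return not is_and
        return UNKNOWN if saw_unknown else is_and
    raise ValueError("unsupported syntax: " + ast.dump(node))

def evaluate(expr, known):
    py_expr = expr.replace("&&", " and ").replace("||", " or ")
    return partial_eval(ast.parse(py_expr, mode="eval"), known)

EXPR = "((A>0) && (B>5 || C > 10))"
print(evaluate(EXPR, {"A": -1}))                  # False
print(evaluate(EXPR, {"A": 10}) is UNKNOWN)       # True
print(evaluate(EXPR, {"B": 1, "C": 1}))           # False, even though A is unknown
```

Unlike plain short-circuit evaluation, this sketch looks at every operand, so it can decide the result from B and C alone when A is missing.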

Answer 3

You can do it like this:

class LazyValues():

    def __init__(self):
        self._known_values = {}

    def __getitem__(self, var):
        try:
            return self._known_values[var]
        except KeyError:
            print("Evaluating %s..." % var)
            return self._known_values.setdefault(var, eval(var))


def lazy_eval(expr, lazy_vars):
    for var in lazy_vars:
        expr = expr.replace(var, "lazy_values['%s']" % var)
        # will look like ((lazy_values['A']>0) and (lazy_values['B']>5 or lazy_values['C'] > 10))

    lazy_values = LazyValues()
    return eval(expr)


lazy_eval("((A>0) and (B>5 or C > 10))", lazy_vars=['A', 'B', 'C'])

# Evaluating A...
# ....
# NameError: name 'A' is not defined

A = -1
lazy_eval("((A>0) and (B>5 or C > 10))", lazy_vars=['A', 'B', 'C'])

#Evaluating A...
#False

A = 5
B = 6
lazy_eval("((A>0) and (B>5 or C > 10))", lazy_vars=['A', 'B', 'C'])

# Evaluating A...
# Evaluating B...
# True

More details later...

Answer 4

Although this is a very crude implementation of a solution, it fits your case well, despite using a lot of if/else and exception handling.

def main_func():
    def checker(a, b=None, c=None):
        if a is None:
            del a
        if b is None:
            del b
        if c is None:
            del c
        d = eval('a>0 and (b>5 or c>10)')
        return d
    return checker

def doer(a=None, b=None, c=None):
    try:
        return main_func()(a,b,c)
    except NameError as e:
        if "'a' is not" in str(e):
            return 'a'
        elif "'b' is not" in str(e):
            return 'b'
        elif "'c' is not" in str(e):
            return 'c'

def check_ret(ret):
    return type(ret) == bool

def actual_evaluator():
    getter = {
        "a": get_a,
        "b": get_b,
        "c": get_c
    }
    args = []
    while True:
        ret = doer(*tuple(args))
        if not check_ret(ret):
            val = getter[ret]()
            args.append(val)
        else:
            return ret

if __name__ == '__main__':
    print(actual_evaluator())

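Now to explain my code: main_func returns another function that evaluates the expression given in the string. Although the string is hard-coded here, you can always pass it to the function as a parameter and replace the string inside eval with that parameter.

In doer, the function returned by main_func is called. If it raises a NameError, which happens when evaluation reaches a variable that has not been supplied yet, doer returns the name of the variable that needs to be computed. All of this is checked in actual_evaluator, where the variable's value is obtained through some function get_variable_name (get_a, get_b and get_c here), which you register in the getter dictionary. In my code I used random numbers to check validity, but as you said, you will have to evaluate the variables by other means, so you can call the corresponding functions there.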
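The same retry idea can be written more compactly. The following is a sketch, not the answer's code: evaluate_on_demand and make_getter are hypothetical names, and the variable name is recovered by matching the text of CPython's NameError message, which is an assumption about the message format.

```python
import re

def evaluate_on_demand(expr, getters):
    """Evaluate expr, calling getters[name]() only when eval reports name missing."""
    known = {}
    while True:
        try:
            # empty __builtins__ keeps the expression from calling built-in functions
            return eval(expr, {"__builtins__": {}}, known)
        except NameError as err:
            match = re.search(r"name '(\w+)' is not defined", str(err))
            if match is None or match.group(1) not in getters:
                raise  # a name we have no getter for
            name = match.group(1)
            known[name] = getters[name]()  # fetch once, then retry from the start

calls = []  # records the order in which values were requested

def make_getter(name, value):
    def getter():
        calls.append(name)
        return value
    return getter

getters = {"A": make_getter("A", 10),
           "B": make_getter("B", 6),
           "C": make_getter("C", 0)}
result = evaluate_on_demand("(A > 0) and (B > 5 or C > 10)", getters)
print(result, calls)  # True ['A', 'B']
```

Each retry re-evaluates the expression from the beginning, which is fine for cheap comparisons but worth knowing if the expression itself is expensive.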

Answer 5

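In my view the answer is yes: you can try to evaluate an expression with missing information. You will need to define what happens when a symbol lookup fails.

In your case you will need a Boolean expression evaluator and a symbol table, so that the evaluator can look up symbols in order to execute the expression.

If every lookup that is needed succeeds, the result is true or false. If a symbol lookup fails, handle that case, perhaps by returning None or nullptr, or by raising/throwing an exception.

I believe you can embed the Python interpreter in your C++ program and call a function to evaluate the expression; more importantly, you can give it a dictionary as the symbol table. If the call returns a result, it either found enough symbols or was able to short-circuit to a result; otherwise it raises an exception, which your C++ code can detect.

You can prototype the function in Python, check whether the approach works the way you want, and then embed it.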
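Such a prototype could be as small as the sketch below. The function name try_evaluate is made up for illustration, and returning None for a failed lookup is just one of the options mentioned above.

```python
def try_evaluate(expr, symbols):
    """Return the value of expr, or None if a needed symbol is missing."""
    try:
        # symbols acts as the symbol table; builtins are disabled on purpose
        return eval(expr, {"__builtins__": {}}, dict(symbols))
    except NameError:
        return None

EXPR = "(A > 0) and (B > 5 or C > 10)"
print(try_evaluate(EXPR, {"A": -1}))           # False
print(try_evaluate(EXPR, {"A": 10}))           # None
print(try_evaluate(EXPR, {"A": 10, "B": 6}))   # True
```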

Or you can do it all in C++ with a grammar, lexer, parser and evaluator.

Answer 6

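Thanks to short-circuit behavior, Python can evaluate an expression even when not all of the values it contains are defined (when that is possible). When it is not, it raises an exception: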

In [1]: a= False

In [3]: a and b
Out[3]: False

In [4]: a or b
NameError: name 'b' is not defined

But the expression is evaluated from left to right:

In [5]: b and a
NameError: name 'b' is not defined

Answer 7

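I'm not aware of any existing library that handles this.

The usual approach is to build an expression tree and evaluate whatever can be evaluated, similar to constant folding in compilers: https://en.wikipedia.org/wiki/Constant_folding

One important aspect is knowing the allowed values of the variables, and therefore which partial evaluations are allowed. For example, x*0 (and 0*x) is 0 if x is an integer or a finite floating-point number, but it cannot be folded if x is an IEEE floating-point number (since it could be NaN or infinity), or if x could be a matrix, since [1,1]*0 is [0,0] and not the scalar 0.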
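The floating-point caveat is easy to check in Python, whose float type follows IEEE 754 on common platforms. The helper name folds_to_zero is only for illustration.

```python
import math

def folds_to_zero(x):
    """True if x * 0 really equals 0 for this particular x."""
    return x * 0 == 0

for x in (7, -2.5, math.inf, math.nan):
    print(x, "->", folds_to_zero(x))
# 7 and -2.5 give True; inf and nan give False, because inf * 0 and nan * 0 are nan
```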

Answer 8

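I have taken a "roll-my-own" approach to this in the past. It isn't hard for simple things; you just create your own objects that implement the magic math methods and keep track of the other objects.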
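One possible reading of this approach, sketched under assumptions: the LazyVar class and the fetch callback below are hypothetical, and only the comparison methods are implemented. Each object fetches its value the first time a comparison touches it, so Python's own short-circuiting decides which variables are ever fetched.

```python
class LazyVar:
    """Stands in for a variable; fetches its value on first use and caches it."""

    def __init__(self, name, fetch):
        self.name = name
        self._fetch = fetch          # hypothetical callable: name -> value
        self._has_value = False
        self._value = None

    def value(self):
        if not self._has_value:
            self._value = self._fetch(self.name)
            self._has_value = True
        return self._value

    # Comparison "magic methods" force the value only when a comparison runs.
    def __gt__(self, other): return self.value() > other
    def __ge__(self, other): return self.value() >= other
    def __lt__(self, other): return self.value() < other
    def __le__(self, other): return self.value() <= other
    def __eq__(self, other): return self.value() == other
    def __ne__(self, other): return self.value() != other


fetched = []  # records which variables were actually fetched

def fetch(name):
    fetched.append(name)
    return {"A": -1, "B": 9, "C": -1}[name]

names = {n: LazyVar(n, fetch) for n in "ABC"}
result = eval("(A > 0) and (B > 5 or C > 10)", {"__builtins__": {}}, names)
print(result, fetched)  # False ['A']
```

Python does not allow and/or themselves to be overloaded, which is why this sketch hooks the comparisons instead.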

If you need something more full-featured, the sympy project is designed for symbolic math...

Answer 9

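I don't understand exactly what you want to do or to understand, but I agree with ivan_pozdeev about short-circuit evaluation and lazy evaluation.

A Boolean expression is evaluated from left to right, and evaluation stops as soon as the result is known, ignoring whatever lies to the right.

With Python: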

E = "(A > 0) and (B > 5 or C > 10)"
A = -1
print(eval(E))

gives

False

But

E = "(A > 0) and (B > 5 or C > 10)"
A = 1
print(eval(E))

gives the error "name 'B' is not defined".

Answer 10

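I would look at sympy or another computer algebra system. I believe that algebraic simplification of the expression, combined with short-circuit evaluation, will let you evaluate every case in which a result can be obtained. In some cases you do need to know the value of a variable. For example, with a simple expression like a == b, you cannot make any progress without knowing the values of a and b. However, for something like (a >= 0) || (a <= 0), algebraic simplification yields true, assuming a is not NaN or some other value that is not equal to itself.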
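The NaN caveat can be checked directly. This small sketch only demonstrates why the simplification needs that assumption; the function name is made up, and Python's or stands in for ||.

```python
import math

def always_true_candidate(a):
    # The expression from the text, written with Python's "or".
    return (a >= 0) or (a <= 0)

print(always_true_candidate(5))         # True
print(always_true_candidate(-3.5))      # True
print(always_true_candidate(math.nan))  # False: every ordered comparison with NaN is false
```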

Answer 11

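It sounds like you have two challenges:

1. It is expensive to compute some of the variable values, so you want to avoid computing values that are not needed to evaluate the expression; and
2. Your expression exists as a string, composed at run time, so you cannot use C++'s built-in short-circuit logic.

This means you need some way to evaluate an expression at run time, and you want to take advantage of short-circuit logic where possible. Python could be a good choice for this, as the example below shows.

There is a short Python script (evaluate.py), which defines an evaluate() function that can be called from a C or C++ program. The evaluate() function tries to evaluate the expression you give it (translating "&&" and "||" to "and" and "or" if needed). If it needs a variable that has not been defined yet, it retrieves that variable's value by calling a get_var_value() function defined in the C/C++ program (and then caches the value for later use).

This approach uses normal short-circuit behavior, so it only requests the variable values needed to finish evaluating the expression. Note that it does not rearrange the expression to pick the smallest set of variables needed to evaluate it; it just uses standard short-circuit behavior.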
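To try the lookup-on-demand behavior without building any C code, the idea can be mocked in pure Python. This is a sketch under assumptions: VarDict and lookup are stand-ins invented here (lookup plays the role of the C-side get_var_value), and the dictionary is passed as the locals mapping of eval so that missing names reach __missing__.

```python
class VarDict(dict):
    """Dictionary that asks a callback for any missing key and caches the answer."""

    def __init__(self, get_var_value):
        super().__init__()
        self._get_var_value = get_var_value

    def __missing__(self, name):
        self[name] = value = self._get_var_value(name)
        return value

def evaluate(expr, get_var_value):
    # convert C-style Boolean operators to Python-style
    py_expr = expr.replace("||", " or ").replace("&&", " and ")
    return eval(py_expr, {"__builtins__": {}}, VarDict(get_var_value))

values = {"A": 10, "B": 3, "C": -1}   # pretend these are expensive to obtain
requested = []                        # records the order of lookups

def lookup(name):
    requested.append(name)
    return values[name]

result = evaluate("((A>0) && (B>5 || C > 10))", lookup)
print(result, requested)  # False ['A', 'B', 'C']
```

With A set to -1 instead, only A would be requested, since the right-hand side of the and is never evaluated.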

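UPDATE: I added an example at the end that defines the Python script with a multi-line string literal in the .cpp file. This can be useful if you don't want to install a separate evaluate.py file alongside the executable. It also simplifies the Python initialization somewhat.

The C/Python interaction in the code below is based on the code at https://docs.python.org/2/extending/embedding.html and https://docs.python.org/2/c-api/arg.html. Note that it uses the Python 2 C API (for example Py_InitModule and PyInt_AsLong), so it will not build unchanged against Python 3.

Here are the files:

evaluate.py (Python script)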

# load embedded_methods module defined by the parent C program
from embedded_methods import get_var_value

# define a custom dictionary class that calls get_var_value(key) for any missing keys.
class var_dict(dict):
    def __missing__(self, var):
        self[var] = val = get_var_value(var)
        return val

# define a function which can be called by the parent C program
def evaluate(expr):
    # Create a dictionary to use as a namespace for the evaluation (this version 
    # will automatically request missing variables).
    # Move this line up to the module level to retain values between calls.
    namespace = var_dict()

    # convert C-style Boolean operators to Python-style
    py_expr = expr.replace("||", " or ").replace("&&", " and ").replace("  ", " ")

    print('evaluating expression "{}" as "{}"'.format(expr, py_expr))

    # evaluate the expression, retrieving variable values as needed
    return eval(py_expr, namespace)

evaluate.c (your main program; could also be evaluate.cpp, compiled with g++)

// on Mac, compile with gcc -o evaluate evaluate.c -framework Python
#include <Python/Python.h>  // Mac
// #include <Python.h> // non-Mac?

// retain values of argc and argv for equation evaluation
int argc;
char **argv;

/* 
   Calculate the value of a named variable; this is called from the Python 
   script to obtain any values needed to evaluate the expression. 
*/
static PyObject* c_get_var_value(PyObject *self, PyObject *args)
{
    int var_num;
    char *var_name;
    char err_string[100];
    long var_value;
    if(!PyArg_ParseTuple(args, "s:get_var_value", &var_name)) {
        PyErr_SetString(PyExc_ValueError, "Invalid arguments passed to get_var_value()");
        return NULL;
    }
    // change the code below to define your variable values
    // This version just assumes A, B, C are given by argv[2], argv[3], argv[4], etc.
    printf("looking up value of %s: ", var_name);
    var_num = var_name[0]-'A';
    if (strlen(var_name) != 1 || var_num < 0 || var_num >= argc-2) {
        printf("%s\n", "unknown");
        snprintf(
            err_string, sizeof(err_string), 
            "Value requested for unknown variable \"%s\"", var_name
        );
        PyErr_SetString(PyExc_ValueError, err_string);
        return NULL;  // will raise exception in Python
    } else {
        var_value = atoi(argv[2+var_num]);
        printf("%ld\n", var_value);
        return Py_BuildValue("l", var_value);
    }
}

// list of methods to be added to the "embedded_methods" module
static PyMethodDef c_methods[] = {
    {"get_var_value", c_get_var_value, METH_VARARGS, // could use METH_O
     "Retrieve the value for the specified variable."},
    {NULL, NULL, 0, NULL} // sentinel for end of list
};

int main(int ac, char *av[])
{
    PyObject *p_module, *p_evaluate, *p_args, *p_result;
    long result;
    const char* expr;

    // cache and evaluate arguments
    argc = ac;
    argv = av;
    if (argc < 2) {
        fprintf(
            stderr, 
            "Usage: %s \"expr\" A B C ...\n"
            "e.g.,  %s \"((A>0) && (B>5 || C > 10))\" 10 9 -1\n", 
            argv[0], argv[0]
        );
        return 1;
    }
    expr = argv[1];

    // initialize Python
    Py_SetProgramName(argv[0]);
    Py_Initialize();
    // Set system path to include the directory where this executable is stored
    // (to find evaluate.py later)
    PySys_SetArgv(argc, argv);

    // attach custom module with get_var_value() function
    Py_InitModule("embedded_methods", c_methods);

    // Load evaluate.py
    p_module = PyImport_ImportModule("evaluate");
    if (PyErr_Occurred()) { PyErr_Print(); }
    if (p_module == NULL) {
        fprintf(stderr, "unable to load evaluate.py\n");
        return 1;
    }

    // get a reference to the evaluate() function
    p_evaluate = PyObject_GetAttrString(p_module, "evaluate");
    if (!(p_evaluate && PyCallable_Check(p_evaluate))) {
        fprintf(stderr, "Cannot retrieve evaluate() function from evaluate.py module\n");
        return 1;
    }

     /*
        Call the Python evaluate() function with the expression to be evaluated.
        The evaluate() function will call c_get_var_value() to obtain any
        variable values needed to evaluate the expression. It will use 
        caching and normal logical short-circuiting to reduce the number 
        of requests.
     */
    p_args = Py_BuildValue("(s)", expr);
    p_result = PyObject_CallObject(p_evaluate, p_args);
    Py_DECREF(p_args);
    if (PyErr_Occurred()) {
        PyErr_Print();
        return 1;
    }
    result = PyInt_AsLong(p_result);
    Py_DECREF(p_result);

    printf("result was %ld\n", result);

    Py_DECREF(p_evaluate);
    Py_DECREF(p_module);
    return 0;
}

Results:

$ evaluate "((A>0) && (B>5 || C > 10))" -1 9 -1
evaluating expression "((A>0) && (B>5 || C > 10))" as "((A>0) and (B>5 or C > 10))"
looking up value of A: -1
result was 0

$ evaluate "((A>0) && (B>5 || C > 10))" 10 9 -1
evaluating expression "((A>0) && (B>5 || C > 10))" as "((A>0) and (B>5 or C > 10))"
looking up value of A: 10
looking up value of B: 9
result was 1

$ evaluate "((A>0) && (B>5 || C > 10))" 10 3 -1
evaluating expression "((A>0) && (B>5 || C > 10))" as "((A>0) and (B>5 or C > 10))"
looking up value of A: 10
looking up value of B: 3
looking up value of C: -1
result was 0

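As an alternative, you can combine all of this code into a single .cpp file, as shown below. This uses the multi-line (raw) string literal feature of C++11.

Standalone evaluate.cpp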

// on Mac, compile with g++ evaluate.cpp -o evaluate -std=c++11 -framework Python
#include <Python/Python.h>  // Mac
//#include <Python.h> // non-Mac?

/* 
   Python script to be run in embedded interpreter.
   This defines an evaluate(expr) function which will interpret an expression
   and return the result. If any variable values are needed, it will call the
   get_var_value(var) function defined in the parent C++ program
*/
const char* py_script = R"(
# load embedded_methods module defined by the parent C program
from embedded_methods import get_var_value

# define a custom dictionary class that calls get_var_value(key) for any missing keys.
class var_dict(dict):
    def __missing__(self, var):
        self[var] = val = get_var_value(var)
        return val

# define a function which can be called by the parent C program
def evaluate(expr):
    # Create a dictionary to use as a namespace for the evaluation (this version 
    # will automatically request missing variables).
    # Move this line up to the module level to retain values between calls.
    namespace = var_dict()

    # convert C-style Boolean operators to Python-style
    py_expr = expr.replace("||", " or ").replace("&&", " and ").replace("  ", " ")

    print('evaluating expression "{}" as "{}"'.format(expr, py_expr))

    # evaluate the expression, retrieving variable values as needed
    return eval(py_expr, namespace)
)";

// retain values of argc and argv for equation evaluation
int argc;
char **argv;

/* 
   Calculate the value of a named variable; this is called from the Python 
   script to obtain any values needed to evaluate the expression. 
*/
static PyObject* c_get_var_value(PyObject *self, PyObject *args)
{
    int var_num;
    char *var_name;
    char err_string[100];
    long var_value;
    if(!PyArg_ParseTuple(args, "s:get_var_value", &var_name)) {
        PyErr_SetString(PyExc_ValueError, "Invalid arguments passed to get_var_value()");
        return NULL;
    }
    // change the code below to define your variable values
    // This version just assumes A, B, C are given by argv[2], argv[3], argv[4], etc.
    printf("looking up value of %s: ", var_name);
    var_num = var_name[0]-'A';
    if (strlen(var_name) != 1 || var_num < 0 || var_num >= argc-2) {
        printf("%s\n", "unknown");
        snprintf(
            err_string, sizeof(err_string), 
            "Value requested for unknown variable \"%s\"", var_name
        );
        PyErr_SetString(PyExc_ValueError, err_string);
        return NULL;  // will raise exception in Python
    } else {
        var_value = atoi(argv[2+var_num]);
        printf("%ld\n", var_value);
        return Py_BuildValue("l", var_value);
    }
}

// list of methods to be added to the "embedded_methods" module
static PyMethodDef c_methods[] = {
    {"get_var_value", c_get_var_value, METH_VARARGS, // could use METH_O
     "Retrieve the value for the specified variable."},
    {NULL, NULL, 0, NULL} // sentinel for end of list
};

int main(int ac, char *av[])
{
    PyObject *p_module, *p_evaluate, *p_args, *p_result;
    long result;
    const char* expr;

    // cache and evaluate arguments
    argc = ac;
    argv = av;
    if (argc < 2) {
        fprintf(
            stderr, 
            "Usage: %s \"expr\" A B C ...\n"
            "e.g.,  %s \"((A>0) && (B>5 || C > 10))\" 10 9 -1\n", 
            argv[0], argv[0]
        );
        return 1;
    }
    expr = argv[1];

    // initialize Python
    Py_SetProgramName(argv[0]);
    Py_Initialize();

    // attach custom module with get_var_value() function
    Py_InitModule("embedded_methods", c_methods);

    // run script to define evaluate() function
    PyRun_SimpleString(py_script);
    if (PyErr_Occurred()) {
        PyErr_Print(); 
        fprintf(stderr, "%s\n", "unable to run Python script");
        return 1;
    }

    // get a reference to the Python evaluate() function (can be reused later)
    // (note: PyRun_SimpleString creates objects in the __main__ module)
    p_module = PyImport_AddModule("__main__");
    p_evaluate = PyObject_GetAttrString(p_module, "evaluate");
    if (!(p_evaluate && PyCallable_Check(p_evaluate))) {
        fprintf(stderr, "%s\n", "Cannot retrieve evaluate() function from __main__ module");
        return 1;
    }

    /*
       Call the Python evaluate() function with the expression to be evaluated.
       The evaluate() function will call c_get_var_value() to obtain any
       variable values needed to evaluate the expression. It will use 
       caching and normal logical short-circuiting to reduce the number 
       of requests.
    */
    p_args = Py_BuildValue("(s)", expr);
    p_result = PyObject_CallObject(p_evaluate, p_args);
    Py_DECREF(p_args);
    if (PyErr_Occurred()) {
        PyErr_Print();
        return 1;
    }
    result = PyInt_AsLong(p_result);
    Py_DECREF(p_result);

    printf("result was %ld\n", result);

    // p_module is a borrowed reference (PyImport_AddModule), so it must not be DECREF'd
    Py_DECREF(p_evaluate);
    return 0;
}

Answer 12

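Based on the question, I assume that

1. you have a logical expression that depends on the results of various functions;
2. you use the values of some of those functions more than once (possibly before evaluating this expression, or possibly within it), so you want to store their results to avoid calling them twice; and
3. you want to evaluate the logical expression, and along the way you want to retrieve and store the values of functions that have not been run before, but only as many as are needed to evaluate the expression (using normal short-circuit behavior).

I mentioned in another answer that you may be best off just using the built-in short-circuit behavior in C++. To do that and also achieve objective 2, you need to use functions instead of variables in the logical expression. That way you can trigger the computation of a missing value when the expression needs it.

Below are two ways to do this. The first wraps your slow functions with a generic caching wrapper. The second defines a custom caching helper for each slow function. After compiling, either one should be called with your A, B and C values to test it, e.g. evaluate_cached 10 9 -1. Both behave the way you want.

evaluate_cached.cpp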

# include <stdio.h>
# include <stdlib.h>
# include <unordered_map>

static char **args;

// define (slow) functions to calculate each of the needed values
int A() {
  printf("Calculating value for A\n");
  return atoi(args[1]);
}

int B() {
  printf("Calculating value for B\n");
  return atoi(args[2]);
}

int C() {
  printf("Calculating value for C\n");
  return atoi(args[3]);
}

typedef int (*int_func)(void);

// wrapper to cache results of other functions
int cached(int_func func) {
    // Create an unordered_map to hold function results
    static std::unordered_map<int_func, int> results;

    if (results.find(func) == results.end()) {
        // function hasn't been called before; call and cache results
        results[func] = func();
    }
    return results[func];
}

int main(int argc, char *argv[])
{
    if (argc!=4) {
        fprintf(stderr, "%s must be called with 3 values for A, B and C.\n", argv[0]);
        return 1;
    } else {
        args = argv;
    }
    // do the evaluation, with short-circuiting
    if (((cached(A)>0) && (cached(B)>5 || cached(C) > 10))) {
        printf("condition was true\n");
    } else {
        printf("condition was false\n");
    }
    return 0;
}

evaluate_helpers.c

# include <stdio.h>
# include <stdlib.h>

static char **args;

// define (slow) functions to calculate each of the needed values
int calculate_A() {
  printf("Calculating value for A\n");
  return atoi(args[1]);
}

int calculate_B() {
  printf("Calculating value for B\n");
  return atoi(args[2]);
}

int calculate_C() {
  printf("Calculating value for C\n");
  return atoi(args[3]);
}

// define functions to retrieve values as needed,
// with caching to avoid double-calculation
int A() {
  static int val, set=0;
  if (!set) { val=calculate_A(); set=1; }  // mark as cached after the first call
  return val;
}
int B() {
  static int val, set=0;
  if (!set) { val=calculate_B(); set=1; }
  return val;
}
int C() {
  static int val, set=0;
  if (!set) { val=calculate_C(); set=1; }
  return val;
}

int main(int argc, char *argv[])
{
    if (argc!=4) {
        fprintf(stderr, "%s must be called with 3 values for A, B and C.\n", argv[0]);
        return 1;
    } else {
        args = argv;
    }
    // do the evaluation, with short-circuiting
    if (((A()>0) && (B()>5 || C() > 10))) {
        printf("condition was true\n");
    } else {
        printf("condition was false\n");
    }
    return 0;
}

Tags: c++, python, lua, lazy-evaluation