Proper way to deal with fixture data in Python

Question

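My program generates natural-language sentences. I would like to test it properly by setting the random seed to a fixed value, and then:

produce the expected results;
compare the generated sentences to the expected results;
if they differ, ask the user whether the generated sentences are in fact the expected results and, if so, update the expected results.

I have already come across systems like this in JS, so I am surprised not to find one in Python. How do you deal with this situation?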

Answer 1

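Python has many testing frameworks; two of the most popular are pytest and Nose. pytest tends to cover all the bases, but Nose has a lot of nice extras as well.

With Nose, fixtures are covered early in the documentation. The example they give looks like this: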

from nose.tools import with_setup

def setup_func():
    "set up test fixtures"

def teardown_func():
    "tear down test fixtures"

@with_setup(setup_func, teardown_func)
def test():
    "test ..."

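In your case, with the manual review step, you would probably need to build that logic directly into the test itself.

Edit with a more specific example

Building on the Nose example, one way to approach this is to write the test as follows: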

from nose.tools import eq_, with_setup

def setup_func():
    "set your random seed"

def teardown_func():
    "whatever teardown you need"

@with_setup(setup_func, teardown_func)
def test():
    expected = "the correct answer"
    actual = "make a prediction ..."
    # eq_ asserts that both values are equal and reports the message otherwise.
    eq_(expected, actual, "prediction did not match!")

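When you run the test, if the model does not produce the correct output, the test fails with "prediction did not match!". In that case, go to your test file and update expected with the new value. This process is not as dynamic as typing the value in at run time, but it has the advantage of being easy to version and control.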
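For readers who do not have Nose installed, the same idea can be sketched with the standard library's unittest. This is only an illustration: generate_sentence and its word lists are hypothetical stand-ins for the real generator, and the seed 1234 is arbitrary. Rather than hard-coding an expected string, the sketch checks that reseeding reproduces the same sentence, which is the property a stored expected value relies on.

```python
import random
import unittest

# Hypothetical stand-in data for the real sentence generator.
SUBJECTS = ["The cat", "A robot", "My neighbour"]
VERBS = ["eats", "paints", "repairs"]
OBJECTS = ["a sandwich", "the fence", "an old radio"]


def generate_sentence(rng):
    """Build a sentence from words picked by the given random generator."""
    words = [rng.choice(SUBJECTS), rng.choice(VERBS), rng.choice(OBJECTS)]
    return " ".join(words) + "."


class GenerateSentenceTest(unittest.TestCase):
    def setUp(self):
        # A dedicated Random instance keeps the test independent of global state.
        self.rng = random.Random(1234)

    def test_same_seed_gives_same_sentence(self):
        first = generate_sentence(self.rng)
        second = generate_sentence(random.Random(1234))
        self.assertEqual(first, second)
```

Saved in a test module, this can be run with python -m unittest. Once the output is known to be stable, the assertion could compare against a stored expected sentence instead.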

Answer 2

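One drawback of asking the user to replace the expected answer is that the automated tests can no longer run unattended. For that reason, test frameworks generally do not allow reading from input.

I really wanted this feature, so my implementation looks like this: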

import filecmp
import logging
import os
import random
import sys
from pathlib import Path

# setup, EXPECTED_DIR and tmp come from parts of the module not shown here.


def compare_results(expected, results):
    # Accept immediately when the expected file exists and matches the results.
    if not os.path.isfile(expected):
        logging.warning("The expected file does not exist.")
    elif filecmp.cmp(expected, results):
        logging.debug("%s is accepted.", expected)
        return
    content = Path(results).read_text()
    print("The test %s failed." % expected)
    print("Should I accept the results?")
    print(content)
    while True:
        try:
            keep = input("[y/n]")
        except OSError:
            # Under pytest, input() raises OSError, which ends the loop here.
            assert False, "The test failed. Run this file directly to accept the result."
        if keep.lower() in ["y", "yes"]:
            Path(expected).write_text(content)
            break
        elif keep.lower() in ["n", "no"]:
            assert False, "The test failed and you did not accept the answer."
        else:
            print("Please answer yes or no.")


def test_example_iot_root(setup):
    ...
    compare_results(EXPECTED_DIR / "iot_root.test", tmp.name)


if __name__ == "__main__":
    from inspect import getmembers, isfunction

    def istest(o):
        return isfunction(o[1]) and o[0].startswith("test")

    # random.seed() returns None, so "random.seed(1) and test(...)" would never
    # call the test; seed and call in two separate statements instead.
    for name, func in getmembers(sys.modules[__name__]):
        if istest((name, func)):
            random.seed(1)
            func(setup)

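When I run this file directly, it asks me whether it should replace the expected results. When I run it from pytest, input raises an OSError, which lets the loop exit. Definitely not perfect.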
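One way to avoid prompting altogether is to make the update an explicit, non-interactive choice. The sketch below is an assumption rather than part of the answer above: the helper name check_against_expected and the environment variable UPDATE_EXPECTED are made up for illustration. The expected file is rewritten only when an update is requested, so an unattended run fails on a mismatch instead of waiting for input.

```python
import os
from pathlib import Path


def check_against_expected(expected_path, actual_text, update=None):
    """Compare actual_text with the stored expected text.

    When update is true, the expected file is (re)written instead of compared.
    UPDATE_EXPECTED is a hypothetical variable name chosen for this sketch.
    """
    expected_path = Path(expected_path)
    if update is None:
        # Fall back to the environment when the caller does not decide.
        update = os.environ.get("UPDATE_EXPECTED") == "1"
    if update:
        expected_path.write_text(actual_text)
        return
    assert expected_path.is_file(), (
        f"{expected_path} is missing; rerun with UPDATE_EXPECTED=1"
    )
    expected_text = expected_path.read_text()
    assert expected_text == actual_text, f"output differs from {expected_path}"
```

With this in place, running the tests with UPDATE_EXPECTED=1 set would accept the current output, and the resulting change to the expected files can be reviewed in version control before it is committed.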
