Disable built-in module import in embedded Python

Question

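I'm embedding Python 3.6 in my application, and I want to disable the import command in scripts to prevent users from importing any of Python's built-in libraries. I'd like to use only the language itself and my own C++-defined modules.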

Py_SetProgramName (L"Example");
Py_Initialize ();
PyObject* mainModule = PyImport_AddModule ("__main__");
PyObject* globals = PyModule_GetDict (mainModule);

// This should work
std::string script1 = "print ('example')";
PyRun_String (script1.c_str (), Py_file_input, globals, nullptr);

// This should not work
std::string script2 = "import random\n"
                      "print (random.randint (1, 10))\n";
PyRun_String (script2.c_str (), Py_file_input, globals, nullptr);

Py_Finalize ();

Do you know a way to achieve this?

Answer 1

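Python has a long history of being unable to provide a secure sandbox (see How can I sandbox Python in pure Python? as a starting point, then dive into an old python-dev discussion if you feel like it). Here are what I consider to be your two best options.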

Pre-scan the code

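Before executing anything, scan the code. You can do this in Python with the AST module and then walk the tree, or you can likely get far enough with simpler text searches. This probably works in your scenario because your use cases are restricted - it does not generalize to truly arbitrary code.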

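What you are looking for in your case is any import statement (easy) and any top-level variable (e.g., in a.b.c you care about a, and likely about a.b for a given a) that is not "approved". This lets you fail on any code that is not clean before running it.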
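The scan described above could be sketched roughly as follows, using the standard ast module. The allow-list APPROVED_NAMES, the module name my_module, and the helper names are assumptions made for illustration, not part of the original answer. Names that the script binds itself are accepted unless they shadow a builtin or a double-underscore name.

```python
import ast
import builtins

# Hypothetical allow-list: top-level names a script is permitted to read.
APPROVED_NAMES = {"print", "len", "range", "my_module"}


def is_reserved(name):
    """Builtins and double-underscore names can never be 'defined' by a script."""
    return name.startswith("__") or hasattr(builtins, name)


def find_violations(source):
    """Return sorted (line, message) pairs for imports and unapproved name reads."""
    tree = ast.parse(source)

    # Collect names the script binds itself: assignments, loop variables,
    # function/class definitions and function arguments.
    defined = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            name = node.id
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            name = node.name
        elif isinstance(node, ast.arg):
            name = node.arg
        else:
            continue
        if not is_reserved(name):
            defined.add(name)

    violations = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            violations.append((node.lineno, "import statement"))
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
            if node.id not in APPROVED_NAMES and node.id not in defined:
                violations.append((node.lineno, "unapproved name: " + node.id))
    return sorted(violations)


if __name__ == "__main__":
    script = "import random\nprint(random.randint(1, 10))\n"
    for line, message in find_violations(script):
        print("line", line, "-", message)
```

The host application would run a check like this on each script and refuse to execute it if the list is non-empty. This is only a first filter under the stated assumptions; it is not a security boundary on its own.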

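The challenge here is that even trivially obfuscated code will bypass your checks. For example, here are some ways to import modules, given other modules or globals, that a basic scan for import will not find. You will probably want to restrict direct access to __builtins__, globals, some/most/all names with __double_underscores__, and members of certain types. In an AST, these will unavoidably show up as top-level variable reads or attribute accesses.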

getattr(__builtins__, '__imp'+'ort__')('other_module')

getattr(globals()['__builtins__'], '__imp'+'ort__')('other_module')

module.__loader__.__class__(
    "other_module",
    module.__loader__.path + '/../other_module.py'
).load_module()

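(I hope it goes somewhat without saying that this is an impossible challenge, which is why this approach to sandboxing has never fully succeeded. But it may be good enough, depending on your specific threat model.)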
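As a rough illustration of the restrictions mentioned above, the following sketch flags reads of deny-listed builtins, double-underscore names, and double-underscore attribute accesses. The deny-list BLOCKED_NAMES and the function names are hypothetical choices for this example; a real deployment would need to tune them to its own threat model.

```python
import ast

# Hypothetical deny-list: builtins that expose the import machinery
# or allow code and attribute names to be built at runtime.
BLOCKED_NAMES = {"globals", "locals", "vars", "getattr", "setattr",
                 "eval", "exec", "compile", "open"}


def is_dunder(name):
    """True for names of the form __double_underscores__."""
    return name.startswith("__") and name.endswith("__")


def suspicious_nodes(source):
    """Return sorted descriptions of the names and attributes that should be rejected."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if node.id in BLOCKED_NAMES or is_dunder(node.id):
                found.add("name " + node.id)
        elif isinstance(node, ast.Attribute):
            if is_dunder(node.attr):
                found.add("attribute " + node.attr)
    return sorted(found)


if __name__ == "__main__":
    snippet = "getattr(__builtins__, '__imp'+'ort__')('other_module')"
    print(suspicious_nodes(snippet))
```

Because the attribute name in the first obfuscated snippet is assembled from strings, the scan cannot see __import__ itself; it catches the snippet only because getattr and __builtins__ are rejected outright.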

Runtime auditing

If you are in a position to compile your own Python runtime, you might consider using the (currently draft) PEP 551 hooks. (Disclaimer: I am the author of this PEP.) There are draft implementations against the latest 3.7 and 3.6 releases.

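In essence, this lets you add hooks for a range of events within Python and decide how to respond. For example, you can listen to all import events and decide at runtime whether to allow or fail each one based on exactly which module is being imported, or listen to compile events to manage all runtime compilation. You can do this from Python code (with sys.addaudithook) or from C code (with PySys_AddAuditHook).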
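A minimal sketch of the import case might look like this. It assumes an interpreter that provides sys.addaudithook (a build with the draft patches described above, or standard CPython 3.8 and later, where the hooks were added via PEP 578). The allow-list ALLOWED_MODULES, the module name my_cpp_module, and the function names are hypothetical.

```python
import sys

# Hypothetical allow-list: the only modules user scripts may import.
ALLOWED_MODULES = frozenset({"my_cpp_module"})


def import_guard(event, args):
    """Audit hook: reject 'import' events for modules outside the allow-list."""
    if event != "import":
        return  # ignore every other audit event
    module_name = args[0]  # first argument of the 'import' event is the module name
    if module_name not in ALLOWED_MODULES:
        raise ImportError("import of " + repr(module_name) + " is not allowed")


def install_import_guard():
    """Register the hook. Audit hooks cannot be removed once added."""
    sys.addaudithook(import_guard)
```

The host would call install_import_guard() once, after its own initialization has imported everything it needs and before any user script runs. The hook only sees imports that happen after it is installed, and it also sees imports triggered indirectly by allowed modules, so their dependencies would need to be on the allow-list or loaded beforehand.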

The Programs/spython.c file in the repo is a fairly thorough example of auditing from C, while doing it from Python looks more like this (taken from my talk about this PEP):

import sys

def prevent_bitly(event, args):
    if event == 'urllib.Request' and '://bit.ly/' in args[0]:
        print(f'WARNING: urlopen({args[0]}) blocked')
        raise RuntimeError('access to bit.ly is not allowed')

sys.addaudithook(prevent_bitly)

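The downside of this approach is that you need to build and distribute your own version of Python rather than relying on a system install. However, this is generally a good idea if your application depends on embedding, because it means you do not have to force users into a specific system configuration.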


