How to implement an import hook that can modify the source code on the fly using importlib?

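Using the deprecated module imp, I can write a custom import hook that modifies the source code of a module on the fly, before Python imports and executes it. Given the source code as a string named source, the essential code needed to create a module is the following: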

module = imp.new_module(name)
sys.modules[name] = module
exec(source, module.__dict__)

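Since imp is deprecated, I would like to do something similar with importlib. [EDIT: other imp methods also need to be replaced to build a custom import hook, so the answer I am looking for is not simply a replacement for the code above.]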

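However, I have not been able to figure out how to do this. The importlib documentation has a function to create modules from "specs" which, as far as I can tell, are objects that include their own loaders, with no obvious way to redefine them so as to be able to create a module from a string.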

I have created a minimal example to demonstrate this; see the readme file for details.

find_module and load_module are both deprecated. You will need to switch to find_spec and to the pair (create_module, exec_module) respectively. See the importlib documentation for details.
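
As a rough illustration of how these pieces fit together, here is a minimal sketch that builds a module from a source string, which is roughly what the imp.new_module snippet in the question does. The names StringLoader, module_from_string and demo_from_string are made up for this example; spec_from_loader and module_from_spec come from importlib.util.

```python
import sys
from importlib.abc import Loader
from importlib.util import module_from_spec, spec_from_loader


class StringLoader(Loader):
    """Hypothetical loader that executes source code held in memory."""

    def __init__(self, source):
        self.source = source

    def create_module(self, spec):
        return None  # fall back to the default module creation

    def exec_module(self, module):
        # The source could be rewritten here before it is executed.
        exec(self.source, module.__dict__)


def module_from_string(name, source):
    """Create, register and execute a module from a source string."""
    spec = spec_from_loader(name, StringLoader(source))
    module = module_from_spec(spec)
    sys.modules[name] = module
    spec.loader.exec_module(module)
    return module


demo = module_from_string("demo_from_string", "answer = 6 * 7")
print(demo.answer)  # prints 42
```

This only covers creating a single module by hand; hooking into the import statement itself still requires a finder, as described next.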

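You will also need to decide whether you want to use a MetaPathFinder or a PathEntryFinder, as the system that invokes them is different. That is, meta path finders are consulted first and can override builtin modules, whereas path entry finders work specifically for modules found on sys.path.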
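
To make the ordering concrete, the following sketch registers a finder at the front of sys.meta_path that only records the names it is asked about and then declines. RecordingFinder and the module name no_such_module_for_demo are made up for this example. The path-based search, which is what consults path entry finders, is itself performed by a meta path finder (importlib.machinery.PathFinder) that normally sits later in sys.meta_path.

```python
import importlib
import sys
from importlib.abc import MetaPathFinder


class RecordingFinder(MetaPathFinder):
    """Hypothetical finder that only records what it is asked for."""

    def __init__(self):
        self.seen = []

    def find_spec(self, fullname, path, target=None):
        self.seen.append(fullname)
        return None  # decline, so the next finder on sys.meta_path is tried


finder = RecordingFinder()
sys.meta_path.insert(0, finder)
try:
    try:
        importlib.import_module("no_such_module_for_demo")
    except ModuleNotFoundError:
        pass  # expected: no finder knows this name
finally:
    sys.meta_path.remove(finder)

# The finder was asked about the name before the import failed.
print("no_such_module_for_demo" in finder.seen)
```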

The following is a very basic importer that tries to replace the entire import machinery. It shows how to use the functions find_spec, create_module, and exec_module.

import sys
import os.path

from importlib.abc import Loader, MetaPathFinder
from importlib.util import spec_from_file_location

class MyMetaFinder(MetaPathFinder):
    def find_spec(self, fullname, path, target=None):
        if path is None or path == "":
            path = [os.getcwd()]  # top level import
        if "." in fullname:
            *parents, name = fullname.split(".")
        else:
            name = fullname
        for entry in path:
            if os.path.isdir(os.path.join(entry, name)):
                # this module has child modules
                filename = os.path.join(entry, name, "__init__.py")
                submodule_locations = [os.path.join(entry, name)]
            else:
                filename = os.path.join(entry, name + ".py")
                submodule_locations = None
            if not os.path.exists(filename):
                continue

            return spec_from_file_location(fullname, filename, loader=MyLoader(filename),
                submodule_search_locations=submodule_locations)

        return None # we don't know how to import this

class MyLoader(Loader):
    def __init__(self, filename):
        self.filename = filename

    def create_module(self, spec):
        return None # use default module creation semantics

    def exec_module(self, module):
        with open(self.filename) as f:
            data = f.read()

        # manipulate data some way...

        exec(data, vars(module))

def install():
    """Inserts the finder into the import machinery"""
    sys.meta_path.insert(0, MyMetaFinder())
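
The "manipulate data some way" step is left open above. As one possible illustration, the sketch below rewrites a made-up keyword before executing the source. The transform_source function and the `function` alias are assumptions invented for this example, not part of importlib; a real hook might use the ast or tokenize modules instead of a regular expression.

```python
import re
import types


def transform_source(source):
    """Hypothetical rewrite: accept `function` as an alias for `def`."""
    return re.sub(r"^(\s*)function\b", r"\1def", source, flags=re.MULTILINE)


original = "function double(x):\n    return 2 * x\n"

# Execute the rewritten source in a fresh module namespace,
# as exec_module does in the loader above.
module = types.ModuleType("demo_transformed")
exec(transform_source(original), vars(module))
print(module.double(21))  # prints 42
```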

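Next is a slightly more refined version that tries to reuse more of the import machinery. As such, you only need to define how to get the source code of the module.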

import sys
from importlib import invalidate_caches
from importlib.abc import SourceLoader
from importlib.machinery import FileFinder


class MyLoader(SourceLoader):
    def __init__(self, fullname, path):
        self.fullname = fullname
        self.path = path

    def get_filename(self, fullname):
        return self.path

    def get_data(self, filename):
        """exec_module is already defined for us, we just have to provide a way
        of getting the source code of the module"""
        with open(filename) as f:
            data = f.read()
        # do something with data ...
        # eg. ignore it... return "print('hello world')"
        return data


loader_details = MyLoader, [".py"]

def install():
    # insert the path hook ahead of other path hooks
    sys.path_hooks.insert(0, FileFinder.path_hook(loader_details))
    # clear any loaders that might already be in use by the FileFinder
    sys.path_importer_cache.clear()
    invalidate_caches()
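
For a quick end-to-end check of this approach, the sketch below writes a throwaway module to a temporary directory, imports it through a path hook, and then removes the hook again. PatchingLoader, demo_hooked_module and the byte replacement are made up for this example. It reads the file in binary mode because SourceLoader.get_data is documented to return bytes. Since only ".py" is registered, a finder created by this hook should only recognize source files, which is one reason the sketch undoes its changes afterwards.

```python
import importlib
import os
import sys
import tempfile
from importlib.abc import SourceLoader
from importlib.machinery import FileFinder


class PatchingLoader(SourceLoader):
    """Hypothetical loader that rewrites a module's source before compiling."""

    def __init__(self, fullname, path):
        self.fullname = fullname
        self.path = path

    def get_filename(self, fullname):
        return self.path

    def get_data(self, filename):
        with open(filename, "rb") as f:
            data = f.read()
        # Illustrative rewrite only: swap one literal for another.
        return data.replace(b"original", b"patched")


hook = FileFinder.path_hook((PatchingLoader, [".py"]))

with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "demo_hooked_module.py"), "w") as f:
        f.write("VALUE = 'original'\n")

    sys.path.insert(0, demo_dir)
    sys.path_hooks.insert(0, hook)
    sys.path_importer_cache.clear()
    importlib.invalidate_caches()
    try:
        demo_hooked_module = importlib.import_module("demo_hooked_module")
    finally:
        # Undo the changes so later imports use the normal finders again.
        sys.path_hooks.remove(hook)
        sys.path.remove(demo_dir)
        sys.path_importer_cache.clear()
        importlib.invalidate_caches()

print(demo_hooked_module.VALUE)  # prints patched
```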

See also this nice project: https://pypi.org/project/importhook/

pip install importhook

import importhook

# Setup hook to be called any time the `socket` module is imported and loaded into module cache
@importhook.on_import('socket')
def on_socket_import(socket):
    new_socket = importhook.copy_module(socket)
    setattr(new_socket, 'gethostname', lambda: 'patched-hostname')
    return new_socket

# Import module
import socket

# Prints: 'patched-hostname'
print(socket.gethostname())