Using tornado ioloop for loading big python pickle file into memory

Question

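I'm building a test server that loads a huge pickle file (it takes about 30 seconds) when an endpoint is hit. My goal is to change it so that the pickle is loaded into memory as a Python object in the background, on a separate thread, when the Tornado web server starts. That way, when the endpoint is hit, it either finds the object already in memory or waits for the thread to finish loading. Startup would be much faster this way.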

I'm looking for advice on the best way to add async to make this work.

my_server.py

    import tornado.ioloop
    import tornado.web

    from my_class import MyClass

    class MainHandler(tornado.web.RequestHandler):
        def get(self):
            m = MyClass.get_foobar_object_by_name('foobar')
            self.write("Hello, world")

    def make_app():
        return tornado.web.Application([
            (r"/", MainHandler),
        ])

    if __name__ == "__main__":
        app = make_app()
        app.listen(8888)
        MyClass.load()  # takes 30s to load
        tornado.ioloop.IOLoop.current().start()

my_class.py

    import pickle

    class MyClass(object):
        pickle_path = '/opt/some/path/big_file.pickle'
        foobar_map = None

        @staticmethod
        def load():
            # this step takes about 30s to load
            with open(MyClass.pickle_path, 'rb') as f:
                MyClass.foobar_map = pickle.load(f)

        @staticmethod
        def get_foobar_object_by_name(foobar_name):
            if MyClass.foobar_map is None:
                MyClass.load()
            return MyClass.foobar_map.get(foobar_name)

Answer 1

The pickle module has a synchronous interface, so the only way to run it asynchronously is to run it on another thread. Use the new IOLoop.run_in_executor interface in Tornado 5.0:

    import pickle

    from tornado.ioloop import IOLoop
    from tornado.web import RequestHandler
    from tornado.locks import Lock

    def _read_pickle(path):
        # Runs on a worker thread: open, unpickle, and close the file there.
        with open(path, 'rb') as f:
            return pickle.load(f)

    class MyClass:
        pickle_path = '/opt/some/path/big_file.pickle'
        foobar_map = None
        lock = Lock()

        @staticmethod
        async def load():
            # Note: the lock object itself is the context manager (no call).
            async with MyClass.lock:
                # Check again inside the lock to make sure we only do this once.
                if MyClass.foobar_map is None:
                    MyClass.foobar_map = await IOLoop.current().run_in_executor(
                        None, _read_pickle, MyClass.pickle_path)

        @staticmethod
        async def get_foobar_object_by_name(foobar_name):
            if MyClass.foobar_map is None:
                await MyClass.load()
            return MyClass.foobar_map.get(foobar_name)

    class MainHandler(RequestHandler):
        async def get(self):
            m = await MyClass.get_foobar_object_by_name('foobar')
            self.write("Hello, world")

Note that async is contagious: anything that calls an async function also needs to be async and use await.
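The question also asks for the load to begin at server startup instead of on the first request. Below is a minimal sketch of that "start early, then find it or wait for it" pattern. It uses only the standard-library asyncio event loop so it can run on its own; Tornado 5+ on Python 3 runs on asyncio, so the same idea should carry over. `PickleCache`, `start`, `get`, and the temporary file are hypothetical names made up for this illustration, not part of Tornado or of the original code.

```python
import asyncio
import os
import pickle
import tempfile


class PickleCache:
    """Hypothetical helper: loads one pickle file in the background, once."""

    def __init__(self, path):
        self.path = path
        self._future = None  # set the first time start() is called

    def _read(self):
        # Runs on a worker thread so the event loop stays responsive.
        with open(self.path, "rb") as f:
            return pickle.load(f)

    def start(self):
        # Kick off the load without waiting for it; repeated calls reuse
        # the same future, so the file is only read once.
        if self._future is None:
            loop = asyncio.get_running_loop()
            self._future = loop.run_in_executor(None, self._read)
        return self._future

    async def get(self, name):
        # Returns immediately if loading finished, otherwise waits for it.
        data = await self.start()
        return data.get(name)


async def demo():
    # A tiny stand-in for the real 30-second pickle file.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "small.pickle")
        with open(path, "wb") as f:
            pickle.dump({"foobar": 42}, f)

        cache = PickleCache(path)
        cache.start()  # "server startup": begin loading in the background
        # Two concurrent "requests" share the same in-flight load.
        return await asyncio.gather(cache.get("foobar"), cache.get("missing"))


result = asyncio.run(demo())
print(result)
```

In a Tornado app, `start` has to run while the loop is running, so one option (an assumption, not something the answer above prescribes) is to schedule it with `IOLoop.current().add_callback(cache.start)` just before calling `IOLoop.current().start()`. Because every caller awaits the same future, no separate lock is needed in this variant.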
