Caching and reusing a function result in Tornado

Question

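My Tornado application contains an expensive function. The function returns multiple outputs, but for legacy reasons these outputs are accessed separately through different handlers.

Is there a way to execute the function only once, reuse the result across the different handlers, and preserve Tornado's asynchronous behavior?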

import tornado.web
from tornado.web import RequestHandler
from tornado.ioloop import IOLoop

# the expensive function
def add(x, y):
    z = x + y
    return x, y, z

# the handlers that reuse the function
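class Get_X(RequestHandler):
    def get(self, x, y):
        # Path arguments arrive as strings, so convert before calling add().
        x, y, z = add(int(x), int(y))
        self.write(str(x))  # send the value in the response body

class Get_Y(RequestHandler):
    def get(self, x, y):
        x, y, z = add(int(x), int(y))
        self.write(str(y))

class Get_Z(RequestHandler):
    def get(self, x, y):
        x, y, z = add(int(x), int(y))
        self.write(str(z))

# the web service
application = tornado.web.Application([
    # The two captured path segments are passed to get() as x and y.
    (r'/Get_X/(\d+)/(\d+)', Get_X),
    (r'/Get_Y/(\d+)/(\d+)', Get_Y),
    (r'/Get_Z/(\d+)/(\d+)', Get_Z),
])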

application.listen(8888)
IOLoop.current().start()

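I have thought about caching the function's result in a dictionary, but I am not sure how to make the other two handlers wait while the first handler creates the dictionary entry.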

Answer 1

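You're concerned about one handler taking time to calculate the value to put in the cache while the other handlers wait for that value to appear in the cache.

Tornado 4.2 includes an Event class that you can use to coordinate the coroutines that need the cached value. When a handler wants a value from the cache, it checks whether the cached value is there: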

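from tornado import gen, locks
from tornado.web import RequestHandler

# Shared in-process cache: maps a key to a finished value or to a pending Event.
# calculate(x, y) below stands for the expensive function; it is not defined here.
cache = {}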

class Get_X(RequestHandler):
    @gen.coroutine
    def get(self, x, y):
        key = (x, y, 'Get_X')
        if key in cache:
            value = cache[key]
            if isinstance(value, locks.Event):
                # Another coroutine has begun calculating.
                yield value.wait()
                value = cache[key]

            self.write(value)
            return

        # Calculate the value.
        cache[key] = event = locks.Event()
        value = calculate(x, y)
        cache[key] = value
        event.set()
        self.write(value)

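This code is untested.

In real code, you should wrap calculate in a try/except that clears the Event from the cache if calculate fails. Otherwise, all the other coroutines will wait forever for the Event to be set.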
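
One way to sketch the try/except advice above, under some assumptions: get_or_compute and compute are hypothetical names, not Tornado APIs, and the sketch uses asyncio.Event from the standard library as a stand-in for tornado.locks.Event (both offer wait() and set()) so that it runs without Tornado installed. On failure the placeholder is removed and the Event is still set, so waiting coroutines wake up and retry instead of waiting forever.

```python
import asyncio

# key -> finished value, or an asyncio.Event while a computation is pending
cache = {}

async def get_or_compute(key, compute):
    """Return the cached value for key, running compute() at most once at a time.

    compute is a hypothetical zero-argument coroutine function supplied by the caller.
    """
    while True:
        entry = cache.get(key)
        if isinstance(entry, asyncio.Event):
            # Another coroutine is computing: wait, then re-check the cache.
            await entry.wait()
            continue
        if key in cache:
            return entry
        # Nobody has started yet: publish a placeholder Event, then compute.
        event = cache[key] = asyncio.Event()
        try:
            value = await compute()
            cache[key] = value
            return value
        except BaseException:
            # Includes cancellation: drop the placeholder so a later caller retries.
            del cache[key]
            raise
        finally:
            # Wake the waiters on success and on failure alike.
            event.set()
```

A handler could then call something like get_or_compute((x, y), ...) and pick out the component it needs. Keying on (x, y) alone, without the handler name, would let the three handlers from the question share one computed result.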

I'm assuming that calculate returns a string you can pass to self.write. In your application there may be further processing of the value before you call self.write or self.render.

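You should also consider how large your cache may grow: how large are the values, and how many distinct keys will there be? You may need a bounded cache that evicts the least-recently-used value; there are plenty of search results for "Python LRU cache", and you might try Raymond Hettinger's since he's widely respected.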
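
A bounded cache along these lines could look like the following minimal sketch. LRUCache is a hypothetical class built on collections.OrderedDict from the standard library; it is not Raymond Hettinger's recipe and not part of Tornado.

```python
from collections import OrderedDict

class LRUCache:
    """Dict-like cache that evicts the least-recently-used key when full."""

    def __init__(self, maxsize=128):
        self.maxsize = maxsize
        self._data = OrderedDict()

    def __contains__(self, key):
        return key in self._data

    def __len__(self):
        return len(self._data)

    def __getitem__(self, key):
        value = self._data[key]
        # Reading a key marks it as most recently used.
        self._data.move_to_end(key)
        return value

    def __setitem__(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        # Drop the oldest entries until the cache fits again.
        while len(self._data) > self.maxsize:
            self._data.popitem(last=False)

    def __delitem__(self, key):
        del self._data[key]
```

Because it supports `in`, indexing, assignment, and `del`, an instance could replace the plain dict used as the cache. One caveat of this sketch: a pending Event can be evicted like any other entry, so real code would probably want to handle that case.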

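For a more sophisticated example of RequestHandlers that use Events to synchronize around a cache, see my proxy example in the Toro documentation. It is far from a full-featured web proxy, but the example was written to demonstrate a solution to the exact problem you raise: how to avoid duplicate work when calculating a value to put in the cache.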

Answer 2

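Tornado Futures are reusable, so you can simply save the Future and yield it again. Many off-the-shelf caching decorators (such as Python 3.2's functools.lru_cache) will just work if you put them in front of @gen.coroutine: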

import functools
from tornado import gen
from tornado.ioloop import IOLoop

@functools.lru_cache(maxsize=100)
@gen.coroutine
def expensive_function():
    print('starting expensive_function')
    yield gen.sleep(5)
    print('finished expensive_function')
    return 1, 2, 3

@gen.coroutine
def get_x():
    print('starting get_x')
    x, y, z = yield expensive_function()
    return x

@gen.coroutine
def get_y():
    print('starting get_y')
    x, y, z = yield expensive_function()
    return y

@gen.coroutine
def get_z():
    print('starting get_z')
    x, y, z = yield expensive_function()
    return z

@gen.coroutine
def main():
    x, y, z = yield [get_x(), get_y(), get_z()]
    print(x, y, z)

if __name__ == '__main__':
    IOLoop.current().run_sync(main)

Prints:

starting get_x
starting expensive_function
starting get_y
starting get_z
finished expensive_function
1 2 3
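
The same idea can be sketched with native coroutines, with one caveat. This version assumes asyncio from the standard library rather than @gen.coroutine, and the function names are only illustrative. A plain async def function returns a fresh coroutine object on every call, and a coroutine object can be awaited only once, so the cached function below returns a Task instead; a Task, like a Tornado Future, can be awaited repeatedly.

```python
import asyncio
import functools

@functools.lru_cache(maxsize=100)
def add(x, y):
    """Start the expensive work once per (x, y) and return the shared Task."""
    async def compute():
        await asyncio.sleep(0.01)  # stands in for the expensive part
        return x, y, x + y
    # Must be called while an event loop is running.
    return asyncio.get_running_loop().create_task(compute())

async def get_x(x, y):
    result = await add(x, y)
    return result[0]

async def get_y(x, y):
    result = await add(x, y)
    return result[1]

async def get_z(x, y):
    result = await add(x, y)
    return result[2]

async def main():
    # All three callers await the same cached Task.
    return tuple(await asyncio.gather(get_x(1, 2), get_y(1, 2), get_z(1, 2)))
```

Running asyncio.run(main()) performs the computation once for the key (1, 2). Note that lru_cache also keeps a Task that failed, so real code might want to evict the entry on error.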
