In an App Engine NDB model, do references to related models need to be explicitly cached to minimize query costs and optimize performance?

Question

The App Engine documentation on NDB caching indicates that caching is enabled by default:

NDB automatically caches data that it writes or reads (unless an application configures it not to).

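I'm hoping this means I can rely on it to manage models related by key in a cost-efficient way. Here is a simple example involving two models with a one-to-many relationship.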

A user model (which has many comments):

from google.appengine.ext import ndb


class User(ndb.Model):
    name                    = ndb.StringProperty(required=True)
    email                   = ndb.StringProperty(required=True)

    def comments(self, limit=25):
        return UserComment.query(UserComment.user_key == self.key) \
                          .order(-UserComment.created_at) \
                          .fetch(limit)

A comment model (each comment belongs to one user):

class UserComment(ndb.Model):
    user_key                = ndb.KeyProperty(required=True)
    text                    = ndb.StringProperty(required=True)
    created_at              = ndb.DateTimeProperty(auto_now_add=True)

    @property
    def user(self):
        return self.user_key.get()

And a template that displays a comment, which contains two references to comment.user:

<div class="comment">
  <div class="body">
    {{ comment.text }}
  </div>
  <div class="footer">
    by {{ comment.user.name }} ({{ comment.user.email }})
  </div>
</div>

Is this a reasonable pattern? Does each reference to comment.user.name and comment.user.email incur a separate query cost, or can the automatic NDB caching be trusted to avoid or minimize this?

Similarly, with the User.comments method, can the automatic caching be trusted to minimize cost? Or is it advisable to add code that explicitly uses memcache?

Answer 1

NDB caching covers the entities themselves. From the documentation you mentioned:

NDB automatically caches data that it writes or reads (unless an application configures it not to). Reading from cache is faster than reading from the Datastore.

This means that, by default, you do not need to handle caching manually for direct key/property lookups such as comment.user.name and comment.user.email; ndb takes care of those.

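Queries are a different story, though: there is no way to know whether the data returned by a query is still a valid response when the same query is repeated later, because additional data may have been created in the meantime. Caching query results is the first memcache use mentioned in the documentation: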

One use of a memory cache is to speed up common datastore queries. If many requests make the same query with the same parameters, and changes to the results do not need to appear on the web site right away, the app can cache the results in the memcache. Subsequent requests can check the memcache, and only perform the datastore query if the results are absent or expired. Session data, user preferences, and any other queries performed on most pages of a site are good candidates for caching.

In other words, you should try to manually cache things like User.comments, whose return values depend on a query.
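The read-through pattern described in the quoted documentation could be sketched roughly as follows. This is only an illustration under stated assumptions: `cached_fetch`, `FakeCache`, the cache key, and the 60-second lifetime are hypothetical names and values, not part of NDB or of the question's code. The helper only assumes a cache object exposing `get(key)` and `set(key, value, time=...)`, which is the shape of the `google.appengine.api.memcache` module functions; `FakeCache` is an in-process stand-in so the sketch runs outside App Engine.

```python
from time import time as now


class FakeCache(object):
    """In-process stand-in for memcache (hypothetical, for illustration only)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and now() >= expires_at:
            # Entry has expired: drop it and report a miss.
            del self._data[key]
            return None
        return value

    def set(self, key, value, time=0):
        # As with memcache, time=0 is treated as "no explicit expiry".
        expires_at = now() + time if time else None
        self._data[key] = (value, expires_at)
        return True


def cached_fetch(cache, cache_key, run_query, ttl_seconds=60):
    """Return cached results for cache_key, running the query only on a miss."""
    results = cache.get(cache_key)
    if results is None:
        results = run_query()
        cache.set(cache_key, results, time=ttl_seconds)
    return results


if __name__ == "__main__":
    cache = FakeCache()
    # Stand-in for a datastore query such as user.comments().
    fetch_comments = lambda: ["first comment", "second comment"]
    print(cached_fetch(cache, "user_comments:example", fetch_comments))
    # The second call is served from the cache; the query is not run again.
    print(cached_fetch(cache, "user_comments:example", fetch_comments))
```

On App Engine, the `memcache` module itself could presumably be passed as `cache`, with `run_query` wrapping `user.comments()`. The trade-off is the one the documentation names: new comments may not show up until the cached entry expires or is deleted.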

Answer 2

Each call to UserComment.user() performs a get against the datastore. However, a second call to self.user_key.get() for the same key will return the cached entity.

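In any case, this is a poor design, because you are making a separate RPC call to the datastore for each user instead of using get_multi() to retrieve all the user entities at once.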
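A minimal sketch of the batched approach, assuming the comments have already been fetched. `load_users_for_comments` is a hypothetical helper name; the `get_multi` argument is expected to behave like `ndb.get_multi`, which takes a list of keys and returns the entities in the same order (with None for keys that do not exist). It is passed in as a parameter so the sketch can run without the App Engine SDK.

```python
def load_users_for_comments(comments, get_multi):
    """Return a {user_key: user_entity} dict using one batched lookup.

    `comments` is any iterable of objects with a `user_key` attribute.
    `get_multi` is a callable shaped like ndb.get_multi.
    """
    # Deduplicate keys while preserving order, so each user is requested once.
    unique_keys = []
    seen = set()
    for comment in comments:
        if comment.user_key not in seen:
            seen.add(comment.user_key)
            unique_keys.append(comment.user_key)

    # One batched call instead of one get() per comment.
    users = get_multi(unique_keys)
    return dict(zip(unique_keys, users))
```

In a request handler this might be called as `users = load_users_for_comments(comments, ndb.get_multi)`, and the template could then look up `users[comment.user_key]` instead of triggering a get for each comment.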


python

google-app-engine

app-engine-ndb

google-cloud-datastore