Hitting PHP memory limits while iterating over Drupal 8 EntityQuery results. How do I keep it down?

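I have a D8 API endpoint that queries a specific content type, applies any optional conditions, converts the results to JSON, and returns them to the client. I raised the PHP memory limit to 512M, but I'm still running out. There are only 1500 records in Drupal, so there really shouldn't be any reason for it to be this bad (341KB per record?!). If I just keep increasing memory until it runs, the rendered JSON is less than 2 MB.

I know PHP garbage collection is automatic, so I'm guessing some references are being retained.

I've made several attempts to bring it down, such as batching the query, refactoring into functions, and explicitly calling gc_collect_cycles, but none of them made any difference.

How can I keep memory consumption down while iterating over the results of a Drupal EntityQuery?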

  protected function get() {
    echo "memory (start): " . memory_get_usage() . "\n<br>";

    //some setup and validation

    $query = $this->build_query($params);
    echo "memory (build_query): " . memory_get_usage() . "\n<br>";

    $results = $query->execute();
    echo "memory (execute): " . memory_get_usage() . "\n<br>";

    $items = [];

    $chunk_size = 50;
    $chunks = array_chunk(array_values($results), $chunk_size);
    echo "memory (chunk): " . memory_get_usage() . "\n<br>";

    foreach ($chunks as $chunk) {
      $items = array_merge($items, $this->load_nodes($chunk));
      echo "memory (chunk loaded): " . memory_get_usage() . "\n<br>";
    }
    echo "memory (all loaded): " . memory_get_usage() . "\n<br>";

    $response = [ 'results' => $items ];
    return new ResourceResponse($response);
  }

  protected function load_nodes($ids) {
    $items = [];
    $nodes = node_load_multiple($ids);
    foreach ($nodes as $node) {
      $items[] = $this->transform($node); 
    }
    return $items;
  }

  protected function transform($array) {
    $new = [
      "field1" => $array['field1'],
      "field2" => $array['field2'],
      //... for about 30 more fields, with some processing/manipulation ...
    ];
    return $new;
  }

The output of the memory echoes is:

memory (start): 28297032
memory (build_query): 29984168
memory (execute): 31004048
memory (chunk): 31083864
memory (chunk loaded): 42175976
memory (chunk loaded): 50447792
memory (chunk loaded): 57609344
memory (chunk loaded): 66762688
memory (chunk loaded): 74555712
memory (chunk loaded): 86663016
memory (chunk loaded): 98514192
memory (chunk loaded): 110908336
memory (chunk loaded): 122792592
memory (chunk loaded): 134651328
memory (chunk loaded): 145622512
memory (chunk loaded): 156546072
memory (chunk loaded): 167805352
memory (chunk loaded): 178617040
memory (chunk loaded): 190400936
memory (chunk loaded): 201246256
memory (chunk loaded): 212387384
memory (chunk loaded): 223756088
memory (chunk loaded): 234898632
memory (chunk loaded): 246125624
memory (chunk loaded): 257136304
memory (chunk loaded): 268205304
memory (chunk loaded): 278744896
memory (chunk loaded): 289693184
memory (chunk loaded): 300491840
memory (chunk loaded): 310564624
memory (chunk loaded): 321204064
memory (chunk loaded): 333842760
memory (chunk loaded): 343723672
memory (chunk loaded): 344960728
memory (all loaded): 344960728

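Shouldn't memory consumption stay flat across the iterations of load_nodes, as the GC cleans up old references?

You'll notice my endpoint only reaches 344 MB at the end. The actual error is thrown somewhere in Drupal core. Since I want to keep the PHP memory limit at 128M, I still need to bring my part of the memory use down.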

Actually, I think your assumption about garbage collection is incorrect in this particular case.

From the Drupal 8 documentation:

function node_load_multiple

Loads node entities from the database.

This function should be used whenever you need to load more than one node from the database. Nodes are loaded into memory and will not require database access if loaded again during the same page request. [source]

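The loaded nodes appear to be meant to persist for the entire page request, so memory consumption accumulates even while you iterate.

I've actually seen quite a few posts on the Drupal forums from other developers who ran out of memory using this function. Memory consumption gets especially high when many nodes are loaded.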
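To illustrate why gc_collect_cycles() makes no difference here, the following is a minimal sketch of a per-request cache. FakeStorage is a hypothetical stand-in, not Drupal's storage class; it only mimics the "stays in memory for the rest of the request" behaviour quoted above.

```php
<?php
// Hypothetical stand-in for an entity storage with a per-request cache.
// This is NOT Drupal code; it only imitates the caching behaviour.
final class FakeStorage {
    /** @var array objects kept alive until the cache is reset */
    private $cache = [];

    public function loadMultiple(array $ids): array {
        $loaded = [];
        foreach ($ids as $id) {
            if (!isset($this->cache[$id])) {
                $entity = new stdClass();
                $entity->id = $id;
                $entity->payload = str_repeat('x', 10000); // simulate a heavy entity
                $this->cache[$id] = $entity;
            }
            $loaded[$id] = $this->cache[$id];
        }
        return $loaded;
    }

    public function resetCache(): void {
        $this->cache = [];
    }

    public function cachedCount(): int {
        return count($this->cache);
    }
}

$storage = new FakeStorage();
foreach (array_chunk(range(1, 200), 50) as $chunk) {
    $entities = $storage->loadMultiple($chunk);
    unset($entities);     // drop our own references
    gc_collect_cycles();  // nothing to free: the cache still references every object
}

$countBeforeReset = $storage->cachedCount(); // all 200 objects are still held
$storage->resetCache();
$countAfterReset = $storage->cachedCount();  // now nothing is held
```

Chunking alone does not help in this sketch, because the cache, not the calling code, owns the last reference to each object.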


To bring memory consumption down, I'd suggest disabling the cache for node loading by setting the cache reset parameter to true. Example:

$nodes = node_load_multiple($ids, NULL, TRUE);

Hope that helps :)


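Edit:

Hmm, it looks like we were on the right track trying to reset the cache, but we'll have to reset it a different way. This approach is taken from the deprecated node_load() function.

The alternative way to reset the cache, via the storage class in Drupal, looks like this: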

// entityManager() is deprecated in Drupal 8; entityTypeManager() is its replacement.
\Drupal::entityTypeManager()->getStorage('node')->resetCache(array('NID'));

The fixed script would look something like:

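$query = \Drupal::entityQuery('node')
     ->condition($params); // placeholder for your own ->condition($field, $value) calls

$results = $query->execute();

// For revisionable entities such as nodes, the keys are revision IDs and
// the values are the node IDs, so take the values.
$nids = array_values($results);

$storage = \Drupal::entityTypeManager()->getStorage('node');

foreach ($nids as $nid) {
    $node = \Drupal\node\Entity\Node::load($nid);

    // Do stuff with loaded node, ex:
    // print $node->title->value;

    // Drop this node from the storage's static cache so it can be freed.
    $storage->resetCache(array($nid));
}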
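The same idea can be combined with the chunked loading from the question. This is only a sketch under assumptions: load_in_chunks and DemoStorage are hypothetical names, and DemoStorage exists solely so the snippet runs outside Drupal. Inside Drupal 8 you would presumably pass \Drupal::entityTypeManager()->getStorage('node') as $storage, which exposes loadMultiple() and resetCache().

```php
<?php
/**
 * Loads entities chunk by chunk, keeps only the transformed arrays,
 * and clears the storage's cache for each chunk once it is processed.
 *
 * $storage is any object with loadMultiple(array $ids) and
 * resetCache(array $ids) methods.
 */
function load_in_chunks($storage, array $ids, callable $transform, int $chunkSize = 50): array {
    $items = [];
    foreach (array_chunk($ids, $chunkSize) as $chunk) {
        foreach ($storage->loadMultiple($chunk) as $entity) {
            $items[] = $transform($entity); // keep the small array, not the entity
        }
        $storage->resetCache($chunk);       // let the loaded entities be freed
    }
    return $items;
}

// Demo stand-in so the sketch runs outside Drupal (hypothetical class).
class DemoStorage {
    public $cache = [];

    public function loadMultiple(array $ids) {
        $out = [];
        foreach ($ids as $id) {
            if (!isset($this->cache[$id])) {
                $this->cache[$id] = (object) ['id' => $id, 'title' => "Node $id"];
            }
            $out[$id] = $this->cache[$id];
        }
        return $out;
    }

    public function resetCache(array $ids = null) {
        if ($ids === null) {
            $this->cache = [];
            return;
        }
        foreach ($ids as $id) {
            unset($this->cache[$id]);
        }
    }
}

$storage = new DemoStorage();
$items = load_in_chunks($storage, range(1, 120), function ($entity) {
    return ['id' => $entity->id, 'title' => $entity->title];
}, 50);
```

Resetting per chunk rather than per node keeps the number of loadMultiple() calls low while still bounding how many entities are cached at once.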

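I think you have a misunderstanding about the garbage collector in PHP.

The garbage collector can only free a block of memory once no variable references it any longer. Meanwhile, you always return values from your functions, so they end up referenced by other variables in the calling functions.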
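A small sketch of that rule in plain PHP, with no Drupal involved; make_payload is a hypothetical helper and the sizes are arbitrary:

```php
<?php
function make_payload(): array {
    // Roughly 1 MB of data, handed back to the caller.
    return ['data' => str_repeat('x', 1024 * 1024)];
}

$baseline = memory_get_usage();

$kept = [];
for ($i = 0; $i < 5; $i++) {
    $kept[] = make_payload(); // the caller holds a reference, so nothing can be freed
}
gc_collect_cycles();          // no effect: every payload is still reachable
$whileReferenced = memory_get_usage() - $baseline;

$kept = [];                   // drop the last references; PHP frees the payloads right away
$afterRelease = memory_get_usage() - $baseline;
```

Memory only goes back down once the last variable pointing at the data is gone, whether or not gc_collect_cycles() is called.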

You could also look into how to disable certain caches in Drupal, which may help depending on the caching strategy Drupal is using.