PostgreSQL slow count/group/date_trunc mix

Question

I have the following query:

select count(*), date_trunc('day', updated_at) from test group by date_trunc('day', updated_at);

Running EXPLAIN on it gives me the following:

GroupAggregate  (cost=213481.83..223749.85 rows=245009 width=8)
  ->  Sort  (cost=213481.83..215883.63 rows=960720 width=8)
        Sort Key: (date_trunc('day'::text, updated_at))
        ->  Index Only Scan using updatedat on test  (cost=0.00..91745.26 rows=960720 width=8)

As you can see, the cost is high, and the query takes 6231.58 ms.

Is there any way to improve this? What would be the best index to create for this count/group/date_trunc combination?

Answer 1

If your table really contains about 250,000 distinct days, you probably cannot do much better than this. Increasing work_mem would speed up the sort, though.
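A quick way to check both points, as a minimal sketch against the test table and updated_at column from the question (the alias distinct_days is only illustrative). Note that the count itself has to read every row, so it is a one-off diagnostic rather than something to run routinely:

```sql
-- How many distinct days does the table actually contain?
-- Compare the result with the planner's estimate (rows=245009 in the plan above).
SELECT count(DISTINCT date_trunc('day', updated_at)) AS distinct_days
FROM test;

-- Show the memory budget each sort or hash operation may use in this session.
SHOW work_mem;
```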

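However, if the number of distinct days is considerably lower, the problem is that PostgreSQL cannot estimate the distribution of the date_trunc results unless you create an index on that expression: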

CREATE INDEX ON test (date_trunc('day', updated_at));

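This works as written if updated_at is timestamp without time zone. For timestamp with time zone you must specify a time zone explicitly; otherwise the result would depend on the session time zone, so the expression is not immutable and cannot be used in an index: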

CREATE INDEX ON test (date_trunc('day', updated_at AT TIME ZONE 'UTC'));

Then ANALYZE the table, increase work_mem, and see whether you get a hash aggregate instead of a sort.
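Those steps might look like the following sketch. The 64MB value is an assumption chosen purely for illustration, not a tuned recommendation, and SET only affects the current session:

```sql
-- Refresh statistics, including those gathered for the expression index.
ANALYZE test;

-- Raise the per-operation memory budget for this session only.
-- '64MB' is an illustrative value; pick one that fits your server.
SET work_mem = '64MB';

-- Re-run the query and look for HashAggregate instead of Sort + GroupAggregate.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*), date_trunc('day', updated_at)
FROM test
GROUP BY date_trunc('day', updated_at);
```

If the plan still shows a Sort node, the row estimate for the grouping expression or the available work_mem is the first thing to look at.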

Of course, if you have to use AT TIME ZONE in the index definition, you have to use it in the query as well...
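For the timestamp with time zone case, the query would then be written with the same expression as the index. This is a sketch that assumes 'UTC' is the zone used in the index definition; the alias day is only illustrative:

```sql
-- The grouping expression must match the indexed expression exactly,
-- including the AT TIME ZONE clause, for its statistics to be used.
SELECT count(*),
       date_trunc('day', updated_at AT TIME ZONE 'UTC') AS day
FROM test
GROUP BY date_trunc('day', updated_at AT TIME ZONE 'UTC');
```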

postgresql

database-performance