GIN index on PostgreSQL jsonb column not being used in queries

Question

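I am using PostgreSQL 9.6. (Please don't tell me to upgrade; I have to use 9.6.)

I have a table with a jsonb column, and I created a GIN index on that column. The table has about 320,000 records. EXPLAIN ANALYSE shows that the index is not being used, and a simple query takes about 3 seconds.

We have a debug logger that can log anything, but stores it as JSON in the form { "key1": "value1", "key2": "value2", ... }

We gather statistics by extracting the values of keys.

The table and index are created like this: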

CREATE TABLE log ( 
  id SERIAL PRIMARY KEY,
  logEntry jsonb
);

CREATE INDEX log_idx_logentry on log using gin (logentry);

I run a query that I know returns no results:

SELECT id FROM log WHERE logentry->>'modality' = 'XT'

It takes 3 seconds to run.

EXPLAIN ANALYSE SELECT id FROM log WHERE logentry->>'modality' = 'XT' produces:

 Seq Scan on log  (cost=0.00..32458.90 rows=1618 width=4) (actual time=1328.654..1328.660 rows=0 loops=1)
   Filter: ((logentry ->> 'modality'::text) = 'XT'::text)
   Rows Removed by Filter: 323527
 Planning time: 0.450 ms
 Execution time: 1328.724 ms
(5 rows)

If I rewrite the query like this, I get a similar result:

EXPLAIN ANALYSE SELECT id FROM log WHERE logentry->'modality' @> '"XT"'::jsonb

 Seq Scan on log  (cost=0.00..32458.90 rows=324 width=4) (actual time=1421.262..1421.266 rows=0 loops=1)
   Filter: ((logentry -> 'modality'::text) @> '"XT"'::jsonb)
   Rows Removed by Filter: 323527
 Planning time: 0.080 ms
 Execution time: 1421.309 ms
(5 rows)

And, just to prove that there is something in the table:

SELECT COUNT(id) FROM log WHERE logentry->'modality' @> '"CT"'::jsonb

returns 42528

So why isn't the index being used? In production we expect the log table to contain millions of records.

Answer 1

klin has the correct answer. The performance difference becomes more pronounced as the database grows.
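klin's answer is not reproduced here, so what follows is only a sketch of the usual fix, not a quote of that answer. A GIN index created with the default jsonb_ops operator class supports the operators @>, ?, ?| and ?& applied to the indexed column itself. Neither query in the question matches that shape: the first compares the text result of ->> with =, and the second applies @> to the expression logentry->'modality' rather than to logentry. Assuming the table and index from the question, either of the options below should give the planner an index it can consider. The index name log_idx_modality is hypothetical.

```sql
-- Option 1: keep the existing GIN index and test containment on the
-- indexed column itself, so the predicate matches what jsonb_ops supports.
EXPLAIN ANALYSE
SELECT id FROM log WHERE logentry @> '{"modality": "XT"}'::jsonb;

-- Option 2 (assumption: only a few known keys are ever filtered on):
-- a B-tree expression index that matches the original ->> predicate.
-- log_idx_modality is a hypothetical name.
CREATE INDEX log_idx_modality ON log ((logentry->>'modality'));

EXPLAIN ANALYSE
SELECT id FROM log WHERE logentry->>'modality' = 'XT';
```

Option 1 works for any key without further indexes, which suits a logger that stores arbitrary keys. Option 2 needs one index per key, but keeps the original query text. In both cases the planner can still choose a sequential scan when it estimates that many rows match, so it is worth checking the plan with EXPLAIN ANALYSE on realistic data.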


postgresql

indexing

jsonb