Bitmap Index Scan always Followed by Bitmap Heap Scan for JSON field query

Question

我有以下索引：

CREATE INDEX index_c_profiles_on_city_state_name_domain ON 
c_profiles ((data->>'state'), (data->>'city'), name, domain);

我正在使用以下查询：

SELECT mm.name, mm.domain, mm.data ->> 'city' as city, mm.data ->> 
'state' as state 
FROM c_profiles as mm
WHERE ((mm.data ->> 'state') = 'AZ')

但是当我使用 EXPLAIN ANALYZE 对此进行测试时，它总是先进行位图索引扫描（良好且快速），然后进行非常非常慢的位图堆扫描（通常比单独的索引扫描慢 100 倍）。

我也试过只索引WHERE条件，结果是一样的，它在使用索引后仍然在做非常慢的位图堆扫描。

为什么 Postgres 这样做？我怎样才能让它只进行索引扫描以加快此查询？

这是 EXPLAIN ANALYZE 结果示例：

[
  {
    "Execution Time": 53.655,
    "Planning Time": 0.081,
    "Plan": {
      "Exact Heap Blocks": 1338,
      "Node Type": "Bitmap Heap Scan",
      "Actual Total Time": 53.031,
      "Shared Hit Blocks": 727,
      "Schema": "public",
      "Plans": [
        {
          "Node Type": "Bitmap Index Scan",
          "Actual Total Time": 0.455,
          "Shared Hit Blocks": 2,
          "Shared Read Blocks": 13,
          "Temp Written Blocks": 0,
          "Local Dirtied Blocks": 0,
          "Local Hit Blocks": 0,
          "Plan Width": 0,
          "Actual Loops": 1,
          "Actual Startup Time": 0.455,
          "Temp Read Blocks": 0,
          "Local Read Blocks": 0,
          "Index Name": "index_mattermark_profiles_on_city_state_name_domain",
          "Startup Cost": 0,
          "Shared Dirtied Blocks": 0,
          "Shared Written Blocks": 0,
          "Local Written Blocks": 0,
          "Plan Rows": 788,
          "Index Cond": "((mm.data ->> 'state'::text) = 'AZ'::text)",
          "Actual Rows": 1417,
          "Parent Relationship": "Outer",
          "Total Cost": 34.33
        }
      ],
      "Shared Read Blocks": 650,
      "Relation Name": "mattermark_profiles",
      "Local Hit Blocks": 0,
      "Local Dirtied Blocks": 0,
      "Temp Written Blocks": 0,
      "Plan Width": 1010,
      "Actual Loops": 1,
      "Rows Removed by Index Recheck": 0,
      "Lossy Heap Blocks": 0,
      "Alias": "mm",
      "Recheck Cond": "((mm.data ->> 'state'::text) = 'AZ'::text)",
      "Temp Read Blocks": 0,
      "Output": [
        "name",
        "domain",
        "(data ->> 'city'::text)",
        "(data ->> 'state'::text)"
      ],
      "Actual Startup Time": 0.703,
      "Local Read Blocks": 0,
      "Startup Cost": 34.53,
      "Shared Dirtied Blocks": 0,
      "Shared Written Blocks": 0,
      "Local Written Blocks": 0,
      "Plan Rows": 788,
      "Actual Rows": 1417,
      "Total Cost": 2894.17
    },
    "Triggers": []
  }
]

Answer 1

PostgreSQL 选择位图索引扫描而不是普通的索引扫描，因为它认为它会更快。

当估计的结果行数很高时通常会出现这种情况。

正常的索引扫描必须为找到的每个索引条目访问 table，这会导致 table 上出现大量随机 I/O，并且可能需要相同的块来多次处理。

位图索引扫描首先找到所有索引条目，按照它们在 table 中的物理位置对它们进行排序 ，然后扫描所需的块table。这样效率更高，因为它将按顺序扫描 table 个块。

第二步，位图堆扫描，在EXPLAIN输出中显示为它自己的节点，通常是更昂贵的步骤。

所以一切看起来都井井有条。

您可以尝试将 enable_bitmapscan 设置为 off 并查看 PostgreSQL 是否正确，并且最终的计划会更昂贵。

Bitmap Index Scan always Followed by Bitmap Heap Scan for JSON field query

Bitmap Index Scan always Followed by Bitmap Heap Scan for JSON field query

postgresql

json

jsonb

postgresql-9.5