Spatial query on large table with multiple self joins performing slow
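I am working on queries against a large table in Postgres 9.3.9. It is a spatial dataset and it is spatially indexed. Say I need to find 3 types of objects: A, B and C. The criterion is that B and C are both within a certain distance of A, say 500 meters.
My query is like this: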
select
school.osm_id as school_osm_id,
school.name as school_name,
school.way as school_way,
restaurant.osm_id as restaurant_osm_id,
restaurant.name as restaurant_name,
restaurant.way as restaurant_way,
bar.osm_id as bar_osm_id,
bar.name as bar_name,
bar.way as bar_way
from (
select osm_id, name, amenity, way, way_geo
from planet_osm_point
where amenity = 'school') as school,
(select osm_id, name, amenity, way, way_geo
from planet_osm_point
where amenity = 'restaurant') as restaurant,
(select osm_id, name, amenity, way, way_geo
from planet_osm_point
where amenity = 'bar') as bar
where ST_DWithin(school.way_geo, restaurant.way_geo, 500, false)
and ST_DWithin(school.way_geo, bar.way_geo, 500, false);
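This query gives me what I want, but it takes a long time to execute, around 13 seconds. I am wondering whether there is another way to write the query that is more efficient.
Query plan: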
Nested Loop (cost=74.43..28618.65 rows=1 width=177) (actual time=33.513..11235.212 rows=10591 loops=1)
Buffers: shared hit=530967 read=8733
-> Nested Loop (cost=46.52..28586.46 rows=1 width=174) (actual time=31.998..9595.212 rows=4235 loops=1)
Buffers: shared hit=389863 read=8707
-> Bitmap Heap Scan on planet_osm_point (cost=18.61..2897.83 rows=798 width=115) (actual time=7.862..150.607 rows=8811 loops=1)
Recheck Cond: (amenity = 'school'::text)
Buffers: shared hit=859 read=5204
-> Bitmap Index Scan on idx_planet_osm_point_amenity (cost=0.00..18.41 rows=798 width=0) (actual time=5.416..5.416 rows=8811 loops=1)
Index Cond: (amenity = 'school'::text)
Buffers: shared hit=3 read=24
-> Bitmap Heap Scan on planet_osm_point planet_osm_point_1 (cost=27.91..32.18 rows=1 width=115) (actual time=1.064..1.069 rows=0 loops=8811)
Recheck Cond: ((way_geo && _st_expand(planet_osm_point.way_geo, 500::double precision)) AND (amenity = 'restaurant'::text))
Filter: ((planet_osm_point.way_geo && _st_expand(way_geo, 500::double precision)) AND _st_dwithin(planet_osm_point.way_geo, way_geo, 500::double precision, false))
Rows Removed by Filter: 0
Buffers: shared hit=389004 read=3503
-> BitmapAnd (cost=27.91..27.91 rows=1 width=0) (actual time=1.058..1.058 rows=0 loops=8811)
Buffers: shared hit=384528 read=2841
-> Bitmap Index Scan on idx_planet_osm_point_waygeo (cost=0.00..9.05 rows=137 width=0) (actual time=0.193..0.193 rows=64 loops=8811)
Index Cond: (way_geo && _st_expand(planet_osm_point.way_geo, 500::double precision))
Buffers: shared hit=146631 read=2841
-> Bitmap Index Scan on idx_planet_osm_point_amenity (cost=0.00..18.41 rows=798 width=0) (actual time=0.843..0.843 rows=6291 loops=8811)
Index Cond: (amenity = 'restaurant'::text)
Buffers: shared hit=237897
-> Bitmap Heap Scan on planet_osm_point planet_osm_point_2 (cost=27.91..32.18 rows=1 width=115) (actual time=0.375..0.383 rows=3 loops=4235)
Recheck Cond: ((way_geo && _st_expand(planet_osm_point.way_geo, 500::double precision)) AND (amenity = 'bar'::text))
Filter: ((planet_osm_point.way_geo && _st_expand(way_geo, 500::double precision)) AND _st_dwithin(planet_osm_point.way_geo, way_geo, 500::double precision, false))
Rows Removed by Filter: 1
Buffers: shared hit=141104 read=26
-> BitmapAnd (cost=27.91..27.91 rows=1 width=0) (actual time=0.368..0.368 rows=0 loops=4235)
Buffers: shared hit=127019
-> Bitmap Index Scan on idx_planet_osm_point_waygeo (cost=0.00..9.05 rows=137 width=0) (actual time=0.252..0.252 rows=363 loops=4235)
Index Cond: (way_geo && _st_expand(planet_osm_point.way_geo, 500::double precision))
Buffers: shared hit=101609
-> Bitmap Index Scan on idx_planet_osm_point_amenity (cost=0.00..18.41 rows=798 width=0) (actual time=0.104..0.104 rows=779 loops=4235)
Index Cond: (amenity = 'bar'::text)
Buffers: shared hit=25410
Total runtime: 11238.605 ms
I am currently using only one table, with 1,372,711 rows. It has 73 columns:
Column | Type | Modifiers
--------------------+----------------------+---------------------------
osm_id | bigint |
access | text |
addr:housename | text |
addr:housenumber | text |
addr:interpolation | text |
admin_level | text |
aerialway | text |
aeroway | text |
amenity | text |
area | text |
barrier | text |
bicycle | text |
brand | text |
bridge | text |
boundary | text |
building | text |
capital | text |
construction | text |
covered | text |
culvert | text |
cutting | text |
denomination | text |
disused | text |
ele | text |
embankment | text |
foot | text |
generator:source | text |
harbour | text |
highway | text |
historic | text |
horse | text |
intermittent | text |
junction | text |
landuse | text |
layer | text |
leisure | text |
lock | text |
man_made | text |
military | text |
motorcar | text |
name | text |
natural | text |
office | text |
oneway | text |
operator | text |
place | text |
poi | text |
population | text |
power | text |
power_source | text |
public_transport | text |
railway | text |
ref | text |
religion | text |
route | text |
service | text |
shop | text |
sport | text |
surface | text |
toll | text |
tourism | text |
tower:type | text |
tunnel | text |
water | text |
waterway | text |
wetland | text |
width | text |
wood | text |
z_order | integer |
tags | hstore |
way | geometry(Point,4326) |
way_geo | geography |
gid | integer | not null default nextval('...
Indexes:
"planet_osm_point_pkey1" PRIMARY KEY, btree (gid)
"idx_planet_osm_point_amenity" btree (amenity)
"idx_planet_osm_point_waygeo" gist (way_geo)
"planet_osm_point_index" gist (way)
"planet_osm_point_pkey" btree (osm_id)
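There are 8811, 6291 and 779 rows for the amenities school, restaurant and bar, respectively.
Does it make any difference if you use explicit joins?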
SELECT a.id as a_id, a.name as a_name, a.geog as a_geog,
b.id as b_id, b.name as b_name, b.geog as b_geog,
c.id as c_id, c.name as c_name, c.geog as c_geog
FROM table1 a
JOIN table1 b ON b.type = 'B' AND ST_DWithin(a.geog, b.geog, 100)
JOIN table1 c ON c.type = 'C' AND ST_DWithin(a.geog, c.geog, 100)
WHERE a.type = 'A';
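Try this with inner join syntax and compare the results; there should be no duplicates. My guess is that it should take 1/3 of the time of the original query, or better: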
select a.id as a_id, a.name as a_name, a.geog as a_geo,
b.id as b_id, b.name as b_name, b.geog as b_geo,
c.id as c_id, c.name as c_name, c.geog as c_geo
from table1 as a
INNER JOIN table1 as b on b.type='B'
INNER JOIN table1 as c on c.type='C'
WHERE a.type='A' and
(ST_DWithin(a.geog, b.geog, 100) and ST_DWithin(a.geog, c.geog, 100));
The 3 sub-selects you use are very inefficient. Write them as LEFT JOIN clauses and the query should be much more efficient:
SELECT
school.osm_id AS school_osm_id,
school.name AS school_name,
school.way AS school_way,
restaurant.osm_id AS restaurant_osm_id,
restaurant.name AS restaurant_name,
restaurant.way AS restaurant_way,
bar.osm_id AS bar_osm_id,
bar.name AS bar_name,
bar.way AS bar_way
FROM planet_osm_point school
LEFT JOIN planet_osm_point restaurant ON restaurant.amenity = 'restaurant' AND
ST_DWithin(school.way_geo, restaurant.way_geo, 500, false)
LEFT JOIN planet_osm_point bar ON bar.amenity = 'bar' AND
ST_DWithin(school.way_geo, bar.way_geo, 500, false)
WHERE school.amenity = 'school'
AND (restaurant.osm_id IS NOT NULL OR bar.osm_id IS NOT NULL);
However, this returns too many rows if there are multiple restaurants and bars per school. You can simplify the query like this:
SELECT
school.osm_id AS school_osm_id,
school.name AS school_name,
school.way AS school_way,
a.osm_id AS amenity_osm_id,
a.amenity AS amenity_type,
a.name AS amenity_name,
a.way AS amenity_way
FROM planet_osm_point school
JOIN planet_osm_point a ON ST_DWithin(school.way_geo, a.way_geo, 500, false)
WHERE school.amenity = 'school'
AND a.amenity IN ('bar', 'restaurant');
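This returns every bar and restaurant for each school. Schools that have neither a restaurant nor a bar within 500 m are not listed.
This query should go a long way (be much faster):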
WITH school AS (
SELECT s.osm_id AS school_id, text 'school' AS type, s.osm_id, s.name, s.way_geo
FROM planet_osm_point s
, LATERAL (
SELECT 1 FROM planet_osm_point
WHERE ST_DWithin(way_geo, s.way_geo, 500, false)
AND amenity = 'bar'
LIMIT 1 -- bar exists -- most selective first if possible
) b
, LATERAL (
SELECT 1 FROM planet_osm_point
WHERE ST_DWithin(way_geo, s.way_geo, 500, false)
AND amenity = 'restaurant'
LIMIT 1 -- restaurant exists
) r
WHERE s.amenity = 'school'
)
SELECT * FROM (
TABLE school -- schools
UNION ALL -- bars
SELECT s.school_id, 'bar', x.*
FROM school s
, LATERAL (
SELECT osm_id, name, way_geo
FROM planet_osm_point
WHERE ST_DWithin(way_geo, s.way_geo, 500, false)
AND amenity = 'bar'
) x
UNION ALL -- restaurants
SELECT s.school_id, 'rest.', x.*
FROM school s
, LATERAL (
SELECT osm_id, name, way_geo
FROM planet_osm_point
WHERE ST_DWithin(way_geo, s.way_geo, 500, false)
AND amenity = 'restaurant'
) x
) sub
ORDER BY school_id, (type <> 'school'), type, osm_id;
This is not the same as your original query, but rather what you actually want:
I want a list of schools that have restaurants and bars within 500
meters and I need the coordinates of each school and its corresponding
restaurants and bars.
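So this query returns a list of those schools, followed by the bars and restaurants nearby. Each set of rows is held together by the osm_id of the school in the column school_id.
It now uses LATERAL joins to make use of the spatial GiST index.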
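As a point of comparison, the two LATERAL ... LIMIT 1 subqueries in the CTE only test for existence, so the same filter can presumably also be written with EXISTS. This is a minimal sketch, assuming the table and column names from the question. It returns only the qualifying schools, and whether it plans better or worse than the LATERAL form is something to check with EXPLAIN ANALYZE on the real data:

```sql
-- Sketch: schools that have at least one bar AND at least one restaurant
-- within 500 m. Table and column names are taken from the question.
SELECT s.osm_id AS school_id, s.name, s.way_geo
FROM   planet_osm_point s
WHERE  s.amenity = 'school'
AND    EXISTS (              -- a bar exists nearby
   SELECT 1
   FROM   planet_osm_point b
   WHERE  b.amenity = 'bar'
   AND    ST_DWithin(b.way_geo, s.way_geo, 500, false)
   )
AND    EXISTS (              -- a restaurant exists nearby
   SELECT 1
   FROM   planet_osm_point r
   WHERE  r.amenity = 'restaurant'
   AND    ST_DWithin(r.way_geo, s.way_geo, 500, false)
   );
```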
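TABLE school is just shorthand for SELECT * FROM school.
The expression (type <> 'school') orders the school in each set first, because the boolean value false sorts before true. Related:
- SQL select query order by day and month
The subquery sub in the final SELECT is only needed to order by this expression. A UNION query limits an attached ORDER BY list to columns only, no expressions.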
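A tiny self-contained sketch of that ordering trick, using made-up rows instead of the real table (the school_id value and the three rows are invented purely for illustration):

```sql
-- The UNION ALL is wrapped in a subquery so that ORDER BY may use an expression.
SELECT *
FROM  (
   SELECT 1 AS school_id, text 'bar' AS type
   UNION ALL
   SELECT 1, 'school'
   UNION ALL
   SELECT 1, 'rest.'
   ) sub
ORDER  BY school_id
        , (type <> 'school')   -- false for the school row, true for the others
        , type;
```

The row with type 'school' comes out first, followed by 'bar' and 'rest.', since false sorts before true in the boolean expression.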
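For the purpose of this answer I am focusing on the query you presented, ignoring the extended requirement to filter on any of the other 70 text columns. That is really a design flaw. Search criteria should be concentrated in a few columns. Otherwise you would have to index all 70 columns, and multicolumn indexes like the one I am about to propose are hardly an option. Still possible, though ...
Indexes
In addition to the existing: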
"idx_planet_osm_point_waygeo" gist (way_geo)
If you always filter on the same columns, you could create a multicolumn index covering the few columns you are interested in, so that index-only scans become possible:
CREATE INDEX planet_osm_point_bar_idx ON planet_osm_point (amenity, name, osm_id)
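A further option that might be worth testing is a partial GiST index per amenity. In the posted plan, each inner lookup combines the spatial index with a full scan of idx_planet_osm_point_amenity (BitmapAnd), and that amenity scan is repeated in every loop. The sketch below first does the rough arithmetic from the plan and then creates the indexes. The index names are made up, and whether the planner actually picks these indexes is an assumption to verify with EXPLAIN ANALYZE:

```sql
-- EXPLAIN ANALYZE reports "actual time" per loop, so multiply by loops.
SELECT 8811 * 0.843 AS restaurant_amenity_scan_ms   -- about 7428 ms of 11238 ms total
     , 4235 * 0.104 AS bar_amenity_scan_ms;         -- about 440 ms

-- Hypothetical partial indexes: each one contains only the rows of one amenity,
-- so a single index scan can satisfy both the spatial and the amenity condition.
CREATE INDEX planet_osm_point_restaurant_geo_idx ON planet_osm_point
USING  gist (way_geo)
WHERE  amenity = 'restaurant';

CREATE INDEX planet_osm_point_bar_geo_idx ON planet_osm_point
USING  gist (way_geo)
WHERE  amenity = 'bar';

ANALYZE planet_osm_point;   -- refresh statistics after creating the indexes
```

A partial index is only considered when the query repeats the index predicate, for example amenity = 'bar' written as a literal, which is the case for the queries shown here.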
Postgres 9.5
The upcoming Postgres 9.5 introduces major improvements that happen to address your case exactly:
Allow queries to perform accurate distance filtering of bounding-box-indexed objects (polygons, circles) using GiST indexes
(Alexander Korotkov, Heikki Linnakangas)
Previously, a common table expression was required to return a large
number of rows ordered by bounding-box distance, and then filtered
further with a more accurate non-bounding-box distance calculation.
Allow GiST indexes to perform index-only scans (Anastasia Lubennikova, Heikki Linnakangas, Andreas Karlsson)
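That is of special interest to you. Now you can have a single multicolumn (covering) GiST index:
-- GiST operator classes for text and bigint come from the btree_gist extension
CREATE INDEX planet_osm_point_gist_idx ON planet_osm_point
USING gist(amenity, way_geo, name, osm_id)
And: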
- Improve bitmap index scan performance (Teodor Sigaev, Tom Lane)
And:
- Add GROUP BY analysis functions GROUPING SETS, CUBE and ROLLUP (Andrew Gierth, Atri Sharma)
Why? Because ROLLUP would simplify the query I suggested.
The first alpha version was released on July 2, 2015. The expected timeline for the release:
This is the alpha release of version 9.5, indicating that some changes
to features are still possible before release. The PostgreSQL Project
will release 9.5 beta 1 in August, and then periodically release
additional betas as required for testing until the final release in
late 2015.
Basics
And of course, be sure not to neglect the basics.