SELECT FOR UPDATE becomes slow with time
We have a table with 1B entries, and 4 processes work on it simultaneously. Each claims rows with its session ID, 1000 rows at a time, and then updates the table after 10,000 rows. The query used for claiming is:
EXPLAIN (ANALYZE, BUFFERS)
WITH b AS (
    SELECT userid, address
    FROM UserEvents
    WHERE deliveryId = 2108625
      AND (tsExpire > GetDate() OR tsExpire IS NULL)
      AND sendTime <= GetDate()
      AND session_id = 0
      AND level IN ('default')
    ORDER BY sendTime
    FOR UPDATE SKIP LOCKED
    LIMIT 1000
)
UPDATE UserEvents e
SET session_id = 1
FROM b
WHERE e.userid = b.userid
RETURNING b.userid, b.address
This query usually runs within 500 ms when all 4 processes are running simultaneously. Suddenly, over the last few runs, it has slowed down considerably over time. Here is the explain plan:
"Update on UserEvents e (cost=5753.03..8567.46 rows=1000 width=1962) (actual time=1373.284..1422.244 rows=1000 loops=1)"
" Buffers: shared hit=1146920 read=59 dirtied=194"
" I/O Timings: read=13.916"
" CTE b"
" -> Limit (cost=0.56..5752.46 rows=1000 width=82) (actual time=1373.094..1380.853 rows=1000 loops=1)"
" Buffers: shared hit=1121721 read=27 dirtied=23"
" I/O Timings: read=3.440"
" -> LockRows (cost=0.56..179683.91 rows=31239 width=82) (actual time=1373.093..1380.775 rows=1000 loops=1)"
" Buffers: shared hit=1121721 read=27 dirtied=23"
" I/O Timings: read=3.440"
" -> Index Scan using UserEvents_nextpass2 on UserEvents (cost=0.56..179371.52 rows=31239 width=82) (actual time=1366.046..1373.339 rows=4186 loops=1)"
" Index Cond: ((deliveryId = 2108625) AND (sendTime <= '2020-04-15 08:33:57.372282+00'::timestamp with time zone))"
" Filter: (((tsexpire > '2020-04-15 08:33:57.372282+00'::timestamp with time zone) OR (tsexpire IS NULL)) AND (session_id = 0) AND ((level)::text = 'default'::text))"
" Rows Removed by Filter: 29614"
" Buffers: shared hit=1113493 read=27"
" I/O Timings: read=3.440"
" -> Nested Loop (cost=0.58..2815.00 rows=1000 width=1962) (actual time=1373.218..1389.995 rows=1000 loops=1)"
" Buffers: shared hit=1126728 read=27 dirtied=23"
" I/O Timings: read=3.440"
" -> CTE Scan on b (cost=0.00..20.00 rows=1000 width=1692) (actual time=1373.106..1382.263 rows=1000 loops=1)"
" Buffers: shared hit=1121721 read=27 dirtied=23"
" I/O Timings: read=3.440"
" -> Index Scan using UserEvents_id on UserEvents e (cost=0.58..2.79 rows=1 width=268) (actual time=0.007..0.007 rows=1 loops=1000)"
" Index Cond: (userid = b.userid)"
" Buffers: shared hit=5007"
"Planning Time: 0.331 ms"
"Execution Time: 1422.457 ms"
Surprisingly, the index scan on UserEvents_nextpass2 slows down considerably after a few thousand calls to this query. This is a partial index on non-null sendTime values. sendTime is updated after each process updates the rows and removes their session IDs. But this has been the case for the last 1B events, so what could be the cause of this slowness? Any help would be appreciated.
Explain plan for a relatively faster run at around 700 ms:
"Update on UserEvents e (cost=5707.45..8521.87 rows=1000 width=1962) (actual time=695.897..751.557 rows=1000 loops=1)"
" Buffers: shared hit=605921 read=68 dirtied=64"
" I/O Timings: read=27.139"
" CTE b"
" -> Limit (cost=0.56..5706.87 rows=1000 width=82) (actual time=695.616..707.835 rows=1000 loops=1)"
" Buffers: shared hit=580158 read=33 dirtied=29"
" I/O Timings: read=10.491"
" -> LockRows (cost=0.56..179686.41 rows=31489 width=82) (actual time=695.615..707.770 rows=1000 loops=1)"
" Buffers: shared hit=580158 read=33 dirtied=29"
" I/O Timings: read=10.491"
" -> Index Scan using UserEvents_nextpass2 on UserEvents (cost=0.56..179371.52 rows=31489 width=82) (actual time=691.529..704.076 rows=3000 loops=1)"
" Index Cond: ((deliveryId = 2108625) AND (sendTime <= '2020-04-15 07:42:42.856859+00'::timestamp with time zone))"
" Filter: (((tsexpire > '2020-04-15 07:42:42.856859+00'::timestamp with time zone) OR (tsexpire IS NULL)) AND (session_id = 0) AND ((level)::text = 'default'::text))"
" Rows Removed by Filter: 29722"
" Buffers: shared hit=573158 read=33"
" I/O Timings: read=10.491"
" -> Nested Loop (cost=0.58..2815.00 rows=1000 width=1962) (actual time=695.658..716.356 rows=1000 loops=1)"
" Buffers: shared hit=585165 read=33 dirtied=29"
" I/O Timings: read=10.491"
" -> CTE Scan on b (cost=0.00..20.00 rows=1000 width=1692) (actual time=695.628..709.116 rows=1000 loops=1)"
" Buffers: shared hit=580158 read=33 dirtied=29"
" I/O Timings: read=10.491"
" -> Index Scan using UserEvents_id on UserEvents e (cost=0.58..2.79 rows=1 width=268) (actual time=0.007..0.007 rows=1 loops=1000)"
" Index Cond: (userid = b.userid)"
" Buffers: shared hit=5007"
"Planning Time: 0.584 ms"
"Execution Time: 751.713 ms"
My index on this table is:
CREATE INDEX UserEvents_nextpass2 ON public.UserEvents USING btree (deliveryid ASC NULLS LAST, sendTime ASC NULLS LAST) WHERE sendTime IS NOT NULL;
Index Scan using UserEvents_nextpass2 on UserEvents (cost=0.56..179371.52 rows=31239 width=82) (actual time=1366.046..1373.339 rows=4186 loops=1)
  Buffers: shared hit=1113493 read=27
It looks like there is a lot of obsolete data in the "UserEvents_nextpass2" index. Visiting 266 pages for every row returned is a bit absurd. Do you have any long-open transactions that are preventing VACUUM, and the btree-specific microvacuum, from doing their jobs effectively?
Look at pg_stat_activity. Also, is hot_standby_feedback on? Is vacuum_defer_cleanup_age nonzero?
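The checks suggested above can be sketched as follows (standard PostgreSQL views and settings):

```sql
-- Long-open transactions hold back the xmin horizon, preventing both
-- VACUUM and the btree microvacuum from removing dead index entries.
SELECT pid, state, now() - xact_start AS xact_age, query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
ORDER BY xact_start
LIMIT 10;

-- The two settings mentioned above:
SHOW hot_standby_feedback;
SHOW vacuum_defer_cleanup_age;
```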
There was no easy way to reduce the pages visited per row, since all of my index columns are updated concurrently. Because my filter discarded about 80% of the rows, I decided to add the filter columns to the multi-column index. So my index went from:
CREATE INDEX UserEvents_nextpass2
ON public.UserEvents USING btree (deliveryid ASC NULLS LAST, sendTime ASC NULLS LAST)
WHERE sendTime IS NOT NULL;
to:
CREATE INDEX UserEvents_nextpass2
ON public.UserEvents USING btree (deliveryid ASC NULLS LAST, sendTime ASC NULLS LAST, tsexpire, session_id, level)
WHERE sendTime IS NOT NULL;
This reduced the rows removed by my filter to 0, and I now visit only the rows I actually need. My shared buffer hits dropped from 1,121,721 to under 100,000, and query time dropped from 1.5 seconds to 200 ms.
Lesson learned:
Always prefer a multi-column index over filtering
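As a toy illustration of the lesson (hypothetical table and column names, not the original schema):

```sql
CREATE TABLE jobs (id bigint, queue int, run_at timestamptz, claimed int);

-- Narrow index: queue/run_at position the scan, but a claimed = 0 predicate
-- must be rechecked row by row, showing up as "Rows Removed by Filter".
CREATE INDEX jobs_narrow ON jobs (queue, run_at);

-- Wider index: claimed is stored in the index, so non-matching entries are
-- rejected during the index scan itself instead of after a heap fetch.
CREATE INDEX jobs_wide ON jobs (queue, run_at, claimed);
```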