How to optimize SELECT query with multiple CASE statements?
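In a two-player, PostgreSQL-based game, the most frequently called statement is a SELECT query returning the list of games a user is currently playing:
(please excuse the non-Latin letters in the screenshot)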
CREATE OR REPLACE FUNCTION words_get_games(in_uid integer)
RETURNS TABLE (
out_gid integer,
out_created integer,
out_finished integer,
out_letters varchar[15][15],
out_values integer[15][15],
out_bid integer,
out_last_tiles jsonb,
out_last_score integer,
out_player1 integer,
out_player2 integer,
-- output columns are matched to the SELECT list by position, so they are declared in the same order
out_score1 integer,
out_score2 integer,
out_female1 integer,
out_female2 integer,
out_given1 varchar,
out_given2 varchar,
out_photo1 varchar,
out_photo2 varchar,
out_place1 varchar,
out_place2 varchar,
out_played1 integer,
out_played2 integer,
out_hand1 text,
out_hand2 text
) AS
$func$
SELECT
g.gid,
EXTRACT(EPOCH FROM g.created)::int,
EXTRACT(EPOCH FROM g.finished)::int,
g.letters,
g.values,
g.bid,
m.tiles,
m.score,
/* HOW TO OPTIMIZE THE FOLLOWING CASE STATEMENTS? */
CASE WHEN g.player1 = in_uid THEN g.player1 ELSE g.player2 END,
CASE WHEN g.player1 = in_uid THEN g.player2 ELSE g.player1 END,
CASE WHEN g.player1 = in_uid THEN g.score1 ELSE g.score2 END,
CASE WHEN g.player1 = in_uid THEN g.score2 ELSE g.score1 END,
CASE WHEN g.player1 = in_uid THEN s1.female ELSE s2.female END,
CASE WHEN g.player1 = in_uid THEN s2.female ELSE s1.female END,
CASE WHEN g.player1 = in_uid THEN s1.given ELSE s2.given END,
CASE WHEN g.player1 = in_uid THEN s2.given ELSE s1.given END,
CASE WHEN g.player1 = in_uid THEN s1.photo ELSE s2.photo END,
CASE WHEN g.player1 = in_uid THEN s2.photo ELSE s1.photo END,
CASE WHEN g.player1 = in_uid THEN s1.place ELSE s2.place END,
CASE WHEN g.player1 = in_uid THEN s2.place ELSE s1.place END,
EXTRACT(EPOCH FROM CASE WHEN g.player1 = in_uid THEN g.played1 ELSE g.played2 END)::int,
EXTRACT(EPOCH FROM CASE WHEN g.player1 = in_uid THEN g.played2 ELSE g.played1 END)::int,
ARRAY_TO_STRING(CASE WHEN g.player1 = in_uid THEN g.hand1 ELSE g.hand2 END, ''),
REGEXP_REPLACE(ARRAY_TO_STRING(CASE WHEN g.player1 = in_uid THEN g.hand2 ELSE g.hand1 END, ''), '.', '?', 'g')
FROM words_games g
LEFT JOIN words_moves m ON m.gid = g.gid
-- find move record with the most recent timestamp
AND NOT EXISTS (SELECT 1
FROM words_moves m2
WHERE m2.gid = m.gid
AND m2.played > m.played)
LEFT JOIN words_social s1 ON s1.uid = g.player1
-- find social record with the most recent timestamp
AND NOT EXISTS (SELECT 1
FROM words_social s
WHERE s1.uid = s.uid
AND s.stamp > s1.stamp)
LEFT JOIN words_social s2 ON s2.uid = g.player2
-- find social record with the most recent timestamp
AND NOT EXISTS (SELECT 1
FROM words_social s
WHERE s2.uid = s.uid
AND s.stamp > s2.stamp)
WHERE in_uid IN (g.player1, g.player2)
AND (g.finished IS NULL OR g.finished > CURRENT_TIMESTAMP - INTERVAL '1 day');
$func$ LANGUAGE sql;
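As you can see in the custom SQL function above, I use a lot of CASE expressions so that the calling user's data is always returned as player1, given1, score1 (the fetched columns are swapped when needed):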
CASE WHEN g.player1 = in_uid THEN g.score1 ELSE g.score2 END,
My question is: can the above SELECT query be optimized (without switching to the slower PL/pgSQL)?
UPDATE:
Geoff from the mailing list has made a good suggestion: use CASE already when joining:
SELECT
g.gid,
EXTRACT(EPOCH FROM g.created)::int,
EXTRACT(EPOCH FROM g.finished)::int,
g.letters,
g.values,
g.bid,
m.tiles,
m.score,
CASE WHEN g.player1 = in_uid THEN g.player1 ELSE g.player2 END,
CASE WHEN g.player1 = in_uid THEN g.player2 ELSE g.player1 END,
CASE WHEN g.player1 = in_uid THEN g.score1 ELSE g.score2 END,
CASE WHEN g.player1 = in_uid THEN g.score2 ELSE g.score1 END,
s1.female,
s2.female,
s1.given,
s2.given,
s1.photo,
s2.photo,
s1.place,
s2.place,
EXTRACT(EPOCH FROM CASE WHEN g.player1 = in_uid THEN g.played1 ELSE g.played2 END)::int,
EXTRACT(EPOCH FROM CASE WHEN g.player1 = in_uid THEN g.played2 ELSE g.played1 END)::int,
ARRAY_TO_STRING(CASE WHEN g.player1 = in_uid THEN g.hand1 ELSE g.hand2 END, ''),
REGEXP_REPLACE(ARRAY_TO_STRING(CASE WHEN g.player1 = in_uid THEN g.hand2 ELSE g.hand1 END, ''), '.', '?', 'g')
FROM words_games g
LEFT JOIN words_moves m ON m.gid = g.gid
-- find move record with the most recent timestamp
AND NOT EXISTS (SELECT 1
FROM words_moves m2
WHERE m2.gid = m.gid
AND m2.played > m.played)
LEFT JOIN words_social s1 ON s1.uid = in_uid
-- find social record with the most recent timestamp
AND NOT EXISTS (SELECT 1
FROM words_social s
WHERE s1.uid = s.uid
AND s.stamp > s1.stamp)
LEFT JOIN words_social s2 ON s2.uid = (CASE WHEN g.player1 = in_uid THEN g.player2 ELSE g.player1 END)
-- find social record with the most recent timestamp
AND NOT EXISTS (SELECT 1
FROM words_social s
WHERE s2.uid = s.uid
AND s.stamp > s2.stamp)
WHERE in_uid IN (g.player1, g.player2)
AND (g.finished IS NULL OR g.finished > CURRENT_TIMESTAMP - INTERVAL '1 day');
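Since you are concerned about the many CASE expressions, and the condition is always the same, you could pull the condition out and have two selects, e.g.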
select ...
g.player1, g.player2,
extract(epoch from g.played1)::int, extract(epoch from g.played2)::int,
...
g.score1, g.score2,
...
and another (otherwise identical) select with the columns swapped
select ...
g.player2, g.player1,
extract(epoch from g.played2)::int, extract(epoch from g.played1)::int,
...
g.score2, g.score1,
...
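One way to combine the two selects is UNION ALL, with each branch filtering on one of the player columns. The following is only a minimal sketch under stated assumptions: `demo_games` is a hypothetical cut-down table (not the real `words_games`), the `my_*` and `opp_*` aliases are made up for illustration, and the literal 10 stands in for `in_uid`.

```sql
-- Hypothetical cut-down table: only the columns needed for the illustration.
CREATE TEMP TABLE demo_games (
    gid     integer PRIMARY KEY,
    player1 integer NOT NULL,
    player2 integer NOT NULL,
    score1  integer NOT NULL,
    score2  integer NOT NULL
);

INSERT INTO demo_games VALUES
    (1, 10, 20, 5, 7),   -- user 10 is stored as player1
    (2, 30, 10, 8, 3);   -- user 10 is stored as player2

-- Branch 1: the user is player1, columns are returned as stored.
SELECT gid,
       player1 AS my_uid,   player2 AS opp_uid,
       score1  AS my_score, score2  AS opp_score
FROM demo_games
WHERE player1 = 10
UNION ALL
-- Branch 2: the user is player2, columns are swapped.
SELECT gid, player2, player1, score2, score1
FROM demo_games
WHERE player2 = 10
ORDER BY gid;
```

In both result rows the user 10 ends up in `my_uid`, regardless of which column stored it. If a game could have the same user in both `player1` and `player2`, the second branch would also need `AND player1 <> 10` to avoid returning that game twice.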
However, as @joop and @jarlh have already asked, first test whether this really is a performance problem.
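A possible way to run that test is to time the query body directly. Running EXPLAIN on a call to `words_get_games` usually shows little more than a Function Scan node, so the sketch below instead wraps a reduced version of the body in a prepared statement. The statement name `games_probe`, the alias `my_score`, and the user id 1 are placeholders, and only one of the CASE expressions is kept for brevity.

```sql
-- Reduced probe of the query body; $1 plays the role of in_uid.
PREPARE games_probe(integer) AS
SELECT g.gid,
       CASE WHEN g.player1 = $1 THEN g.score1 ELSE g.score2 END AS my_score
FROM words_games g
WHERE $1 IN (g.player1, g.player2)
  AND (g.finished IS NULL OR g.finished > CURRENT_TIMESTAMP - INTERVAL '1 day');

-- ANALYZE actually executes the statement and reports real timings per plan node.
EXPLAIN (ANALYZE, BUFFERS) EXECUTE games_probe(1);

DEALLOCATE games_probe;
```

Comparing the timings with and without the CASE expressions in the select list should show whether they matter at all next to the joins and the NOT EXISTS subqueries.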
lateral and distinct on help (IMO) with readability. distinct on will also have an effect on performance, but I cannot guess whether it will be positive or negative.
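To make the distinct on part concrete, here is a small self-contained sketch of picking the most recent row per group. `demo_moves` and its sample rows are made up for illustration and are not the real `words_moves` table.

```sql
-- Hypothetical cut-down moves table.
CREATE TEMP TABLE demo_moves (
    gid    integer     NOT NULL,
    played timestamptz NOT NULL,
    score  integer
);

INSERT INTO demo_moves VALUES
    (1, '2017-01-01 10:00+00', 12),
    (1, '2017-01-01 11:00+00', 30),
    (2, '2017-01-02 09:00+00', 8);

-- DISTINCT ON (gid) keeps the first row of each gid group in ORDER BY order,
-- so sorting by played DESC keeps the most recent move per game.
SELECT DISTINCT ON (gid) gid, played, score
FROM demo_moves
ORDER BY gid, played DESC;
```

Only the grouping key belongs in the DISTINCT ON list. Listing `played` there as well would keep every move with a distinct timestamp, and the join would then return one row per move instead of one per game.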
select
g.gid,
extract(epoch from g.created)::int created,
extract(epoch from g.finished)::int finished,
g.letters,
g.values,
g.bid,
m.tiles,
m.score,
r.*
from
words_games g
left join (
-- most recent move per game
select distinct on (gid) *
from words_moves
order by gid, played desc
) m on m.gid = g.gid
left join (
-- most recent social record per user
select distinct on (uid) *
from words_social
order by uid, stamp desc
) s1 on s1.uid = g.player1
left join (
select distinct on (uid) *
from words_social
order by uid, stamp desc
) s2 on s2.uid = g.player2
cross join lateral (
select
g.player1, g.player2,
extract(epoch from g.played1)::int, extract(epoch from g.played2)::int,
array_to_string(g.hand1, ''),
regexp_replace(array_to_string(g.hand2, ''), '.', '?', 'g'),
g.score1, g.score2,
s1.female, s2.female,
s1.given, s2.given,
s1.photo, s2.photo,
s1.place, s2.place
where g.player1 = in_uid
union all
select
g.player2, g.player1,
extract(epoch from g.played2)::int, extract(epoch from g.played1)::int,
array_to_string(g.hand2, ''),
regexp_replace(array_to_string(g.hand1, ''), '.', '?', 'g'),
g.score2, g.score1,
s2.female, s1.female,
s2.given, s1.given,
s2.photo, s1.photo,
s2.place, s1.place
where g.player1 != in_uid
) r
where
in_uid in (g.player1, g.player2)
and (g.finished is null or g.finished > current_timestamp - interval '1 day')