Better way than multiple SELECT statements?
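I'm building a web app that displays a pie chart. To fetch all the data for the chart from a PostgreSQL 9.3 database in a single HTTP request, I'm combining multiple SELECT statements with UNION ALL. Here is a part of it: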
SELECT 'spf' as type, COUNT(*)
FROM (SELECT cai.id
FROM common_activityinstance cai
JOIN common_activityinstance_settings cais ON cai.id = cais.activityinstance_id
JOIN common_activitysetting cas ON cas.id = cais.id
JOIN quizzes_quiz q ON q.id = cai.activity_id
WHERE cai.end_time::date = '2015-09-12'
AND q.name != 'Exit Ticket Quiz'
AND cai.activity_type = 'QZ'
AND (cas.key = 'disable_student_nav' AND cas.value = 'True'
OR cas.key = 'pacing' AND cas.value = 'student')
GROUP BY cai.id
HAVING COUNT(cai.id) = 2) sub
UNION ALL
SELECT 'spn' as type, COUNT(*)
FROM common_activityinstance cai
JOIN common_activityinstance_settings cais ON cai.id = cais.activityinstance_id
JOIN common_activitysetting cas ON cas.id = cais.id
WHERE cai.end_time::date = '2015-09-12'
AND cai.activity_type = 'QZ'
AND cas.key = 'disable_student_nav'
AND cas.value = 'False'
UNION ALL
SELECT 'tp' as type, COUNT(*)
FROM (SELECT cai.id
FROM common_activityinstance cai
JOIN common_activityinstance_settings cais ON cai.id = cais.activityinstance_id
JOIN common_activitysetting cas ON cas.id = cais.id
WHERE cai.end_time::date = '2015-09-12'
AND cai.activity_type = 'QZ'
AND cas.key = 'pacing' AND cas.value = 'teacher') sub;
This produces a nice, small response to send back to the client:
type | count
------+---------
spf | 100153
spn | 96402
tp | 84211
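I'm wondering whether I can make my queries more efficient. Each SELECT statement uses mostly the same JOIN operations. Is there a way to avoid repeating the JOINs for each new SELECT?
I would actually prefer a single row with 3 columns.
Or, more generally, is there an entirely different but better approach than what I'm doing?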
Here is a partial answer. The latter two can be combined into a single query:
SELECT (case when key = 'disable_student_nav' then 'spn'
when key = 'pacing' then 'tp'
end) as type, COUNT(*)
FROM common_activityinstance cai JOIN
common_activityinstance_settings cais
ON cai.id = cais.activityinstance_id JOIN
common_activitysetting cas
ON cas.id = cais.id
WHERE cai.end_time::date = '2015-09-12' AND cai.activity_type = 'QZ' AND
(key, value) in (('disable_student_nav', 'False'), ('pacing', 'teacher'))
GROUP BY type
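I don't know if there is a way to fit the first group into similar logic. For example, if the QZ condition could be applied to all three groups, then adding the first group would be easy.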
You can use a case expression with the conditions from each type's where clause. However, the having condition of the first query cannot be satisfied this way.
select type, count(*) as count
from
(
SELECT cai.id,
case when q.name!= 'Exit Ticket Quiz' and key = 'disable_student_nav'
AND value = 'True' OR key = 'pacing' AND value = 'student' then 'spf'
when key = 'disable_student_nav' AND value = 'False' then 'spn'
when key = 'pacing' AND value = 'teacher' then 'tp'
end as type
FROM common_activityinstance cai
JOIN common_activityinstance_settings cais ON cai.id = cais.activityinstance_id
JOIN common_activitysetting cas ON cas.id = cais.id
JOIN quizzes_quiz q ON q.id = cai.activity_id
WHERE cai.end_time::date = '2015-09-12'
AND q.name != 'Exit Ticket Quiz'
AND cai.activity_type = 'QZ'
) t
group by type
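No, there is no way to make that query more efficient. You can set up views or whatever, but it will always have to run through the data three times. You can, however, work around the problem by doing some post-processing in PHP or PL/SQL or wherever. Start with a simpler query, like this: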
SELECT COUNT(*), cai.id, q.name, key, value
FROM common_activityinstance cai
JOIN common_activityinstance_settings cais ON cai.id = cais.activityinstance_id
JOIN common_activitysetting cas ON cas.id = cais.id
JOIN quizzes_quiz q ON q.id = cai.activity_id -- needed because q.name is selected
WHERE cai.end_time::date = '2015-09-12'
GROUP BY cai.id, q.name, key, value
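...it's not clear to me from your explanation whether this would produce a reasonable number of output rows. But assuming it does, write some code to massage them into the shape you want.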
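You can bundle most of the cost into a single main query in a CTE and reuse the result several times.
This returns a single row with three columns, each named after its type: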
WITH cte AS (
SELECT cai.id, cai.activity_id, cas.key, cas.value
FROM common_activityinstance cai
JOIN common_activityinstance_settings s ON s.activityinstance_id = cai.id
JOIN common_activitysetting cas ON cas.id = s.id
WHERE cai.end_time::date = '2015-09-12' -- problem?
AND cai.activity_type = 'QZ'
AND (cas.key = 'disable_student_nav' AND cas.value IN ('True', 'False') OR
cas.key = 'pacing' AND cas.value IN ('student', 'teacher'))
)
SELECT *
FROM (
SELECT count(*) AS spf
FROM (
SELECT c.id
FROM cte c
JOIN quizzes_quiz q ON q.id = c.activity_id
WHERE q.name <> 'Exit Ticket Quiz'
AND (c.key, c.value) IN (('disable_student_nav', 'True')
, ('pacing', 'student'))
GROUP BY 1
HAVING count(*) = 2
) sub
) spf
, (
SELECT count(key = 'disable_student_nav' AND value = 'False' OR NULL) AS spn
, count(key = 'pacing' AND value = 'teacher' OR NULL) AS tp
FROM cte
) spn_tp;
This should work in Postgres 9.3. In Postgres 9.4 or later you can use the new aggregate FILTER clause instead:
count(*) FILTER (WHERE key = 'disable_student_nav' AND value = 'False') AS spn
, count(*) FILTER (WHERE key = 'pacing' AND value = 'teacher') AS tp
Details for both syntax variants:
- How can I simplify this game statistics query?
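To see that the two conditional-count variants agree, here is a minimal sketch on made-up rows. The inline `cte` below is a hypothetical stand-in for the real CTE, and the five sample rows are invented for illustration. `count(expr OR NULL)` works because `false OR NULL` evaluates to NULL, which `count` skips, while `true OR NULL` is true.

```sql
-- Hypothetical sample rows standing in for the result of the real CTE.
WITH cte(id, key, value) AS (
   VALUES (1, 'disable_student_nav', 'False')
        , (1, 'pacing',              'teacher')
        , (2, 'disable_student_nav', 'True')
        , (2, 'pacing',              'student')
        , (3, 'disable_student_nav', 'False')
)
SELECT -- Variant for Postgres 9.3: non-matching rows become NULL and are not counted.
       count(key = 'disable_student_nav' AND value = 'False' OR NULL) AS spn
     , count(key = 'pacing' AND value = 'teacher' OR NULL)            AS tp
       -- Variant for Postgres 9.4 or later: the same counts with FILTER.
     , count(*) FILTER (WHERE key = 'disable_student_nav' AND value = 'False') AS spn_filter
     , count(*) FILTER (WHERE key = 'pacing' AND value = 'teacher')            AS tp_filter
FROM   cte;
-- spn = 2, tp = 1, spn_filter = 2, tp_filter = 1
```

On 9.3, drop the two FILTER columns; the rest runs unchanged.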
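The condition marked problem? may be a big performance problem, depending on the data type of cai.end_time. For one, it's not sargable. And if the column is of type timestamptz, the expression is hard to index because the result depends on the current time zone setting of the session, which also means that running the query in different time zones yields different results.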
Compare:
- Ignoring timezones altogether in Rails and PostgreSQL
You just need to name the time zone that is supposed to define your date. Taking my time zone in Vienna as an example:
WHERE cai.end_time >= '2015-09-12 0:0'::timestamp AT TIME ZONE 'Europe/Vienna'
AND cai.end_time < '2015-09-13 0:0'::timestamp AT TIME ZONE 'Europe/Vienna'
You can also provide simple timestamptz values. You could even just write:
WHERE cai.end_time >= '2015-09-12'::date
AND cai.end_time < '2015-09-12'::date + 1
But the first variant doesn't depend on the current time zone setting.
See the links above for a detailed explanation.
Now the query can use your index, and it should be much faster if there are many different days in your table.
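A minimal sketch of the half-open range filter, assuming end_time is timestamptz and has a plain B-tree index. The table `activity_demo` and its four rows are hypothetical stand-ins for common_activityinstance, invented so the example is self-contained.

```sql
-- Hypothetical stand-in for common_activityinstance, reduced to two columns.
CREATE TEMP TABLE activity_demo (id int, end_time timestamptz);

-- Offsets are explicit (+02 = Vienna summer time), so each row denotes the
-- same instant regardless of the session's time zone setting.
INSERT INTO activity_demo VALUES
   (1, '2015-09-11 23:30:00+02')  -- just before the day starts
 , (2, '2015-09-12 00:00:00+02')  -- first instant of the day (included)
 , (3, '2015-09-12 23:59:00+02')  -- late on the same day
 , (4, '2015-09-13 00:00:00+02'); -- first instant of the next day (excluded)

-- Plain B-tree index on the bare column; no expression index is needed.
CREATE INDEX ON activity_demo (end_time);

-- The column is compared directly to two constants, so the predicate is sargable.
SELECT count(*) AS on_that_day
FROM   activity_demo
WHERE  end_time >= '2015-09-12 00:00'::timestamp AT TIME ZONE 'Europe/Vienna'
AND    end_time <  '2015-09-13 00:00'::timestamp AT TIME ZONE 'Europe/Vienna';
-- on_that_day = 2 (rows 2 and 3)
```

On a table this small the planner will likely choose a sequential scan anyway; running EXPLAIN against the real table shows whether the index is actually used.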
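This is just a sketch of a completely different approach: construct a boolean "hypercube" for all the conditions you need in your "crosstabulation". The logic for selecting or aggregating subsets can be applied later (for example, suppressing exit_tickets; the business logic is unclear to me).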
SELECT DISTINCT not_exit, disabled, pacing
, COUNT(*) AS the_count
FROM (SELECT DISTINCT cai.id
, EXISTS (SELECT *
FROM quizzes_quiz q
WHERE q.id = cai.activity_id AND q.name != 'Exit Ticket Quiz'
) AS not_exit
, EXISTS ( SELECT *
FROM common_activityinstance_settings cais
JOIN common_activitysetting cas ON cas.id = cais.id
WHERE cai.id = cais.activityinstance_id
AND cas.key = 'disable_student_nav' AND cas.value = 'True'
) AS disabled
, EXISTS ( SELECT *
FROM common_activityinstance_settings cais
JOIN common_activitysetting cas ON cas.id = cais.id
WHERE cai.id = cais.activityinstance_id
AND cas.key = 'pacing' AND cas.value = 'student'
) AS pacing
FROM common_activityinstance cai
WHERE cai.end_time::date = '2015-09-12' AND cai.activity_type = 'QZ'
) my_cube
GROUP BY 1,2,3
ORDER BY 1,2,3
;
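Final note: this approach is based on my assumption that the underlying data model is actually an EAV model, and that each attribute can occur at most once per student.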