PostgreSQL: SELECT count of rows that are not DISTINCT

I'm using PostgreSQL 9.3 and I have this big, ugly query...

SELECT cai.id
FROM common_activityinstance cai
JOIN common_activityinstance_settings cais ON cai.id = cais.activityinstance_id
JOIN common_activitysetting cas ON cas.id = cais.id
WHERE cai.end_time::date = '2015-09-11'
    AND (   key = 'disable_student_nav' AND value = 'True'
         OR key = 'pacing' AND value = 'student');

...which gives me this result...

    id  
  ------
   1352
   1352
   1353
   1353
   1354
   1355
 (6 rows)

How can I improve my query to get the count of duplicated rows (2 in this example)?

Using a subquery

select count(*) total_dups from(
    SELECT count(cai.id)
    FROM common_activityinstance cai
    JOIN common_activityinstance_settings cais ON cai.id = cais.activityinstance_id
    JOIN common_activitysetting cas ON cas.id = cais.id
    WHERE cai.end_time::date = '2015-09-11'
        AND (key = 'disable_student_nav'
                AND value = 'True'
                OR key = 'pacing'
                AND value = 'student')
    group by cai.id having count(cai.id) >1
    ) t

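The inner query uses group by cai.id having count(cai.id) > 1 to return one row for each cai.id that occurs more than once. The outer select count(*) total_dups from (select ...) t then counts those rows, which gives the number of duplicated ids.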
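To see the two steps in isolation, here is a minimal sketch that runs against the six ids from the question's output. The ids are supplied inline with a VALUES list instead of the real tables; the name sample and its column id are made up for illustration.

```sql
-- Hypothetical stand-in for the joined result: the six ids from the question.
WITH sample(id) AS (
    VALUES (1352), (1352), (1353), (1353), (1354), (1355)
)
SELECT count(*) AS total_dups
FROM (
    -- One row per id that occurs more than once (here: 1352 and 1353).
    SELECT id
    FROM sample
    GROUP BY id
    HAVING count(*) > 1
) t;
-- total_dups = 2
```

Running the inner SELECT on its own is a quick way to see which ids are duplicated before counting them.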

Using a CTE

with cte as (
SELECT count(cai.id)
    FROM common_activityinstance cai
    JOIN common_activityinstance_settings cais ON cai.id = cais.activityinstance_id
    JOIN common_activitysetting cas ON cas.id = cais.id
    WHERE cai.end_time::date = '2015-09-11'
        AND (key = 'disable_student_nav'
                AND value = 'True'
                OR key = 'pacing'
                AND value = 'student')
    group by cai.id having count(cai.id) >1
    )

    select count(*) from  cte

See also: Difference between CTE and SubQuery?

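Because of the structure of the query, I suspect the duplicates can only come from the OR part of the condition. If each id is duplicated at most twice, you can do the calculation without a subquery: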

SELECT count(cai.id) - count(distinct cai.id)
FROM common_activityinstance cai JOIN
     common_activityinstance_settings cais
     ON cai.id = cais.activityinstance_id JOIN
     common_activitysetting cas
     ON cas.id = cais.id
WHERE cai.end_time::date = '2015-09-11' AND
      (key, value) IN (('disable_student_nav', 'True'), ('pacing', 'student'));

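Note: this only works in the special case where each id appears either once or twice. If an id appears three or more times, the subtraction counts the surplus rows rather than the number of duplicated ids.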
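A quick way to check that caveat, again using a made-up inline VALUES list (the name sample is hypothetical) rather than the real tables: when one id appears three times, the subtraction and the GROUP BY / HAVING approach give different answers.

```sql
-- Hypothetical data: id 1 appears three times, id 2 once.
WITH sample(id) AS (
    VALUES (1), (1), (1), (2)
)
SELECT count(id) - count(DISTINCT id) AS surplus_rows,   -- 4 - 2 = 2
       (SELECT count(*)
        FROM (SELECT id FROM sample GROUP BY id HAVING count(*) > 1) t
       ) AS duplicated_ids                                -- only id 1, so 1
FROM sample;
```

So the subtraction is a reasonable shortcut only if the data guarantees at most two rows per id; otherwise the grouped count is the safer choice.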