Using Postgres tablefunc crosstab() to count incorrect answers

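I have a view (let's call it 'fruit') with a column of correct answers and the wrong responses chosen for the associated multiple-choice questions. I want to count which wrong responses are picked most often, i.e. which ones are easily confused. The view looks like this: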

correct_answer | wrong_response
-------------------------------
apple          |   pear
apple          |   pear
apple          |   banana
banana         |   apple
banana         |   pear
banana         |   pear
banana         |   pear
pear           |   apple
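
To try the queries on this page against the sample rows above, something like the following setup should work. The question only shows the view, not what it is built on, so the base table name fruit_responses and the text column types are assumptions made for illustration.

```sql
-- Hypothetical base table: the question does not show the view's real source.
CREATE TABLE fruit_responses (
    correct_answer text NOT NULL,
    wrong_response text NOT NULL
);

-- The eight sample rows from the question.
INSERT INTO fruit_responses (correct_answer, wrong_response) VALUES
    ('apple',  'pear'),
    ('apple',  'pear'),
    ('apple',  'banana'),
    ('banana', 'apple'),
    ('banana', 'pear'),
    ('banana', 'pear'),
    ('banana', 'pear'),
    ('pear',   'apple');

-- The view the question refers to as 'fruit'.
CREATE VIEW fruit AS
SELECT correct_answer, wrong_response
FROM fruit_responses;
```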

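What I want is a pivot table that counts the wrong responses against each correct answer, so that the columns represent the correct answers and each row holds the counts for one wrong response.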

wrong_response | apple | banana | pear
---------------------------------------
apple          | 0     | 1      | 1
banana         | 1     | 0      | 0
pear           | 2     | 3      | 0

I have used this function before, but back then I wasn't trying to count anything. Any help would be much appreciated!

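Edit: for future readers, both solutions work! However, conditional aggregation is more flexible. The crosstab solution only works when the output column list matches exactly the set of values the query returns. For example, if you exclude pear (or add kiwi), the crosstab solution returns an error. Conditional aggregation returns a result whether or not you exclude records (or add ones that do not currently exist). Thanks for the help, everyone.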

If you know the columns in advance, you can use conditional aggregation:

select wrong_response,
       count(*) filter (where correct_answer = 'apple') as apple,
       count(*) filter (where correct_answer = 'banana') as banana,
       count(*) filter (where correct_answer = 'pear') as pear
from fruit  -- the view from the question
group by wrong_response
order by wrong_response;
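
The FILTER clause is available from PostgreSQL 9.4 onward. On an older server, or on a database that lacks FILTER, the same counts can be sketched with a CASE expression inside count(). This is a variant of the query above rather than part of the original answer, and it assumes the view is named fruit as in the question.

```sql
-- count() ignores NULLs, and CASE without ELSE yields NULL,
-- so each column counts only the rows for its own correct answer.
SELECT wrong_response,
       count(CASE WHEN correct_answer = 'apple'  THEN 1 END) AS apple,
       count(CASE WHEN correct_answer = 'banana' THEN 1 END) AS banana,
       count(CASE WHEN correct_answer = 'pear'   THEN 1 END) AS pear
FROM fruit
GROUP BY wrong_response
ORDER BY wrong_response;
```

Either way, combinations that never occur come back as 0 rather than NULL, so no coalesce() is needed.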

Assuming you have already run: CREATE EXTENSION tablefunc;

then you can get what you want with the crosstab() function like this:

SELECT *
FROM crosstab('SELECT wrong_response,
                      correct_answer,
                      count(*)
               FROM fruit
               GROUP BY wrong_response, correct_answer 
               ORDER BY wrong_response',

              'SELECT correct_answer
               FROM fruit
               GROUP BY correct_answer
               ORDER BY correct_answer')

AS (wrong_answer varchar(20),
    apple bigint,
    banana bigint,
    pear bigint);

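The code above returns one row per wrong response, with one count column per correct answer.

Note that a count of 0 comes out as NULL here. To get exactly what you asked for, you only need to modify the SELECT slightly: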

SELECT
    wrong_answer,
    coalesce(apple, 0) as apple,
    coalesce(banana, 0) as banana,
    coalesce(pear, 0) as pear
FROM crosstab('SELECT wrong_response,
                      correct_answer,
                      count(*)
               FROM fruit
               GROUP BY wrong_response, correct_answer 
               ORDER BY wrong_response',

              'SELECT correct_answer
               FROM fruit
               GROUP BY correct_answer
               ORDER BY correct_answer')

AS (wrong_answer varchar(20),
    apple bigint,
    banana bigint,
    pear bigint);

The query above gives you the table you asked for, with 0 in place of NULL.
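
One caveat: the category query above is derived from the data, so the column definition list stops matching as soon as a correct answer disappears from or is added to the view, and crosstab() raises an error. If the set of answers is known in advance, one possible workaround is to pass a fixed category list instead. This is a sketch under that assumption; the alias ct and the text type for wrong_answer are illustrative choices.

```sql
SELECT wrong_answer,
       coalesce(apple, 0)  AS apple,
       coalesce(banana, 0) AS banana,
       coalesce(pear, 0)   AS pear
FROM crosstab(
         -- Source query: row name, category, value; sorted so that
         -- all rows for one wrong_response are consecutive.
         $$SELECT wrong_response, correct_answer, count(*)
           FROM fruit
           GROUP BY wrong_response, correct_answer
           ORDER BY wrong_response$$,
         -- Fixed category list: must line up with the columns declared below.
         $$VALUES ('apple'), ('banana'), ('pear')$$
     ) AS ct (wrong_answer text,
              apple bigint,
              banana bigint,
              pear bigint);
```

With a fixed list, source rows whose category is not listed (kiwi, say) are ignored, and a listed category with no rows simply yields NULL, which coalesce() turns into 0. The trade-off is that new answers have to be added to the query by hand.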