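Using postgres tablefunc crosstab() to count incorrect answers
I have a view (let's call it 'fruit') with a column of correct answers and the associated wrong responses to multiple-choice questions, and I want to count which wrong responses are chosen most often (that is, which ones are most easily confused). The view looks like this: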
correct_answer | wrong_response
-------------------------------
apple | pear
apple | pear
apple | banana
banana | apple
banana | pear
banana | pear
banana | pear
pear | apple
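What I want is a pivot table that counts wrong responses against correct answers, so that the columns are the correct answers and each row holds the counts for one wrong response.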
wrong_response | apple | banana | pear
---------------------------------------
apple | 0 | 1 | 1
banana | 1 | 0 | 0
pear | 2 | 3 | 0
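I have used this function before, but at the time I was not trying to count anything. Any help would be greatly appreciated!
Edit: for future readers, both solutions work! However, conditional aggregation is more flexible. The crosstab solution only works when the column list in the query exactly matches the set of values that actually occur. For example, if you leave out pear (or add kiwi), the crosstab solution returns an error. Conditional aggregation returns a result whether or not you exclude a value (or add one that does not currently exist). Thanks to everyone for the help.
If you know the columns, you can use conditional aggregation: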
-- one filtered count per known correct_answer value
select wrong_response,
       count(*) filter (where correct_answer = 'apple') as apple,
       count(*) filter (where correct_answer = 'pear') as pear,
       count(*) filter (where correct_answer = 'banana') as banana
from fruit
group by wrong_response;
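To try the conditional-aggregation idea without a PostgreSQL server, here is a minimal Python sketch that loads the sample rows from the question into an in-memory SQLite database. It is only an illustration under some assumptions: SQLite stands in for PostgreSQL, `SUM(CASE WHEN ... THEN 1 ELSE 0 END)` is used as a portable equivalent of `count(*) filter (where ...)`, and the helper name `pivot_counts` is hypothetical.

```python
import sqlite3

# Sample rows from the question: (correct_answer, wrong_response).
ROWS = [
    ("apple", "pear"), ("apple", "pear"), ("apple", "banana"),
    ("banana", "apple"), ("banana", "pear"), ("banana", "pear"),
    ("banana", "pear"), ("pear", "apple"),
]


def pivot_counts(categories):
    """Return one row per wrong_response with one count column per category."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE fruit (correct_answer TEXT, wrong_response TEXT)")
    conn.executemany("INSERT INTO fruit VALUES (?, ?)", ROWS)
    # One conditional sum per requested category; values are bound as parameters.
    cols = ", ".join(
        "SUM(CASE WHEN correct_answer = ? THEN 1 ELSE 0 END)" for _ in categories
    )
    sql = (
        f"SELECT wrong_response, {cols} FROM fruit "
        "GROUP BY wrong_response ORDER BY wrong_response"
    )
    result = conn.execute(sql, list(categories)).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in pivot_counts(["apple", "banana", "pear"]):
        print(row)
```

Calling `pivot_counts(["apple", "banana", "pear", "kiwi"])` simply adds a column of zeros, and dropping `"pear"` from the list drops that column, which matches the flexibility described in the question's edit.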
Assuming you have already run: CREATE EXTENSION tablefunc;
then this is how to get what you want with the crosstab() function:
SELECT *
FROM crosstab('SELECT wrong_response,
correct_answer,
count(*)
FROM fruit
GROUP BY wrong_response, correct_answer
ORDER BY wrong_response',
'SELECT correct_answer
FROM fruit
GROUP BY correct_answer
ORDER BY correct_answer')
AS (wrong_answer varchar(20),
apple bigint,
banana bigint,
pear bigint);
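The query above returns the pivot table you are after, except that cells with no matching rows come back as null instead of 0. To get zeros, you only need to modify the select slightly: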
SELECT
wrong_answer,
coalesce(apple, 0) as apple,
coalesce(banana, 0) as banana,
coalesce(pear, 0) as pear
FROM crosstab('SELECT wrong_response,
correct_answer,
count(*)
FROM fruit
GROUP BY wrong_response, correct_answer
ORDER BY wrong_response',
'SELECT correct_answer
FROM fruit
GROUP BY correct_answer
ORDER BY correct_answer')
AS (wrong_answer varchar(20),
apple bigint,
banana bigint,
pear bigint)
The query above gives you exactly the result you asked for.
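To see why the nulls appear and what coalesce() changes, the pivot can be imitated in a few lines of Python. This is a rough sketch of the observable behaviour only, not of how tablefunc is implemented; `crosstab_like` and `with_zeros` are hypothetical helper names, and `None` plays the role of SQL null.

```python
from collections import Counter

# Sample rows from the question: (correct_answer, wrong_response).
ROWS = [
    ("apple", "pear"), ("apple", "pear"), ("apple", "banana"),
    ("banana", "apple"), ("banana", "pear"), ("banana", "pear"),
    ("banana", "pear"), ("pear", "apple"),
]


def crosstab_like(rows, categories):
    """Pivot grouped counts; a (row, category) pair with no source rows stays None."""
    # Equivalent of: SELECT wrong_response, correct_answer, count(*) ... GROUP BY
    counts = Counter((wrong, correct) for correct, wrong in rows)
    table = {}
    for wrong in sorted({w for _, w in rows}):
        # dict.get returns None for a missing pair, mimicking the null cell.
        table[wrong] = [counts.get((wrong, c)) for c in categories]
    return table


def with_zeros(table):
    """Equivalent of wrapping every value column in coalesce(col, 0)."""
    return {
        name: [0 if v is None else v for v in values]
        for name, values in table.items()
    }


if __name__ == "__main__":
    raw = crosstab_like(ROWS, ["apple", "banana", "pear"])
    print(raw)
    print(with_zeros(raw))
```

The grouped source query never produces a row for a pair such as (apple, apple), so there is nothing to place in that cell; the zero has to be supplied afterwards, which is what the coalesce() version does.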