% diff of 2 columns in SQL

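I have 2 tables: countries (id, name, continent) and population_years (id, population, year, country_id). The data runs from 2000 to 2010, and I am trying to calculate the percentage difference in average population per continent over that period. I tried to do this by building a temporary table (a CTE) that holds the 2000 and 2010 averages for each continent.

But when I try to calculate the % diff (as you can see in my code below), I don't know how to reference the 'avg pop 2000' and 'avg pop 2010' columns, since they have not been given a name I can refer to. In the code I use avg_pop_2010 and avg_pop_2000 to refer to these columns, which obviously does not work.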

WITH avg_pop AS( SELECT countries.continent, 
ROUND(AVG(CASE WHEN population_years.year = 2000 THEN population_years.population END), 2) as 'avg pop 2000',
ROUND(AVG(CASE WHEN population_years.year = 2010 THEN population_years.population END), 2) as 'avg pop 2010'
FROM countries 
JOIN population_years 
WHERE population_years.country_id = countries.id
GROUP BY 1)

SELECT countries.continent, ROUND(((avg_pop_2010 - avg_pop_2000)/avg_pop_2000)*100.0, 2) AS '%diff'
FROM avg_pop;

Give the CTE columns unquoted aliases and reference those names in the outer query:

    WITH avg_pop AS (
        SELECT countries.continent,
               ROUND(AVG(CASE WHEN population_years.year = 2000 THEN population_years.population END), 2) AS avg_pop_2000,
               ROUND(AVG(CASE WHEN population_years.year = 2010 THEN population_years.population END), 2) AS avg_pop_2010
        FROM countries
        JOIN population_years ON population_years.country_id = countries.id
        GROUP BY countries.continent
    )
    SELECT continent,
           ROUND(((avg_pop_2010 - avg_pop_2000) / avg_pop_2000) * 100.0, 2) AS "%diff"
    FROM avg_pop;

Here is a small demo:

DEMO

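After reading the other comments and answers, I realized that you only need to give the columns different aliases without the '' and everything works. I have updated my answer and the demo.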
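To try the idea locally, here is a minimal sketch that runs a CTE with unquoted aliases against an in-memory SQLite database through Python's standard sqlite3 module. The sample rows, the table aliases c and py, and the names QUERY and rows are made up for illustration; the query computes the change relative to 2000 and rounds only the final result, which is one reasonable reading of the question rather than the only one.

```python
import sqlite3

# In-memory database with hypothetical sample data (not from the original post).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE countries (id INTEGER PRIMARY KEY, name TEXT, continent TEXT);
    CREATE TABLE population_years (
        id INTEGER PRIMARY KEY, population REAL, year INTEGER, country_id INTEGER
    );
    INSERT INTO countries VALUES
        (1, 'A', 'Europe'), (2, 'B', 'Europe'), (3, 'C', 'Asia');
    INSERT INTO population_years (population, year, country_id) VALUES
        (10.0, 2000, 1), (12.0, 2010, 1),
        (30.0, 2000, 2), (36.0, 2010, 2),
        (50.0, 2000, 3), (40.0, 2010, 3);
""")

# The CTE columns get plain identifiers as aliases, so the outer query
# can reference them by name.
QUERY = """
WITH avg_pop AS (
    SELECT c.continent,
           AVG(CASE WHEN py.year = 2000 THEN py.population END) AS avg_pop_2000,
           AVG(CASE WHEN py.year = 2010 THEN py.population END) AS avg_pop_2010
    FROM countries c
    JOIN population_years py ON py.country_id = c.id
    GROUP BY c.continent
)
SELECT continent,
       ROUND((avg_pop_2010 - avg_pop_2000) / avg_pop_2000 * 100.0, 2) AS percent_diff
FROM avg_pop
ORDER BY continent;
"""

rows = conn.execute(QUERY).fetchall()
for continent, percent_diff in rows:
    print(continent, percent_diff)
```

With this sample data, the Europe average goes from 20 to 24 (a 20% increase) and the Asia average from 50 to 40 (a 20% decrease).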

Another option is to repeat the expressions, slightly optimized:

select
    c.continent,
    round(avg(case when py.year = 2000 then py.population end), 2) avg_pop_2000,
    round(avg(case when py.year = 2010 then py.population end), 2) avg_pop_2010,
    -- compute on the unrounded averages, round only the final result
    round(
        100.0 * (
            avg(case when py.year = 2010 then py.population end)
            - avg(case when py.year = 2000 then py.population end)
        ) / avg(case when py.year = 2000 then py.population end),
        2
    ) percent_diff
from countries c
inner join population_years py on py.country_id = c.id
where py.year in (2000, 2010)  -- keep only the two years being compared
group by c.continent

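Side notes:

  • Do not use single quotes for identifiers! They denote literal strings in standard SQL. In general, prefer identifiers that do not need quoting at all. If you do need quotes, use standard double quotes ("), which SQLite recognizes.

  • Pre-filtering on the relevant years with a where clause makes the query more efficient.

  • Use standard join syntax: the join condition goes in the on clause of the join, not in the where clause.

  • Rounding first and then computing the percentage difference is inaccurate; compute first, then round.

  • Table aliases make the query easier to write and read.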
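The quoting and rounding notes can be checked with a small sketch, again using Python's sqlite3 module with an in-memory SQLite database. The inline table t, its value 42, and the averages 1.004 and 1.006 are hypothetical numbers chosen only to make the effects visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Quoting: inside an expression, single quotes produce a string literal,
# while double quotes refer to the column with that name.
single, double = conn.execute("""
    SELECT 'avg pop', "avg pop"
    FROM (SELECT 42 AS "avg pop") AS t
""").fetchone()
print(single, double)

# Rounding: rounding the averages before computing the difference
# distorts the result; rounding only at the end does not.
rounded_first, rounded_last = conn.execute("""
    SELECT ROUND((ROUND(1.006, 2) - ROUND(1.004, 2)) / ROUND(1.004, 2) * 100.0, 2),
           ROUND((1.006 - 1.004) / 1.004 * 100.0, 2)
""").fetchone()
print(rounded_first, rounded_last)
```

Here the single-quoted name comes back as the text avg pop while the double-quoted one returns the column value, and rounding first reports a 1.0% change where the unrounded inputs give about 0.2%.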