% diff of 2 columns in SQL
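I have 2 tables: countries (id, name, continent) and population_years (id, population, year, country_id). The data runs from 2000 to 2010, and I'm trying to calculate the percent difference in average population for each continent over that period. I tried to do it by creating a temporary table, which produces the following output:
But when I try to calculate the % diff (as you can see in my code below), I don't know how to reference the 'avg pop 2000' and 'avg pop 2010' columns, since they haven't been assigned a name I can refer to. In the code I use avg_pop_2010 and avg_pop_2000 to refer to these columns, which obviously doesn't work.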
WITH avg_pop AS( SELECT countries.continent,
ROUND(AVG(CASE WHEN population_years.year = 2000 THEN population_years.population END), 2) as 'avg pop 2000',
ROUND(AVG(CASE WHEN population_years.year = 2010 THEN population_years.population END), 2) as 'avg pop 2010'
FROM countries
JOIN population_years
WHERE population_years.country_id = countries.id
GROUP BY 1)
SELECT countries.continent, ROUND(((avg_pop_2010 - avg_pop_2000)/avg_pop_2000)*100.0, 2) AS '%diff'
FROM avg_pop;
-- Unquoted aliases, so the outer query can reference the CTE columns by name
WITH avg_pop AS (
  SELECT countries.continent,
    ROUND(AVG(CASE WHEN population_years.year = 2000 THEN population_years.population END), 2) AS avg_pop_2000,
    ROUND(AVG(CASE WHEN population_years.year = 2010 THEN population_years.population END), 2) AS avg_pop_2010
  FROM countries
  JOIN population_years ON population_years.country_id = countries.id
  GROUP BY 1)
-- Change relative to the 2000 baseline, as in the question
SELECT continent, ROUND(((avg_pop_2010 - avg_pop_2000) / avg_pop_2000) * 100.0, 2) AS "%diff"
FROM avg_pop;
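Here is a small demo:
Looking at the other comments and answers, I realized that you need to give the columns different aliases without '' and everything will work... I have updated my answer and the demo.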
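For anyone who wants to try the unquoted-alias approach locally, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are invented purely for illustration (they are not the asker's data), and the table aliases c and py are my own shorthand. It leaves the averages unrounded inside the CTE and rounds only the final percentage.

```python
import sqlite3

# In-memory database with made-up sample data (hypothetical values).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE countries (id INTEGER PRIMARY KEY, name TEXT, continent TEXT);
CREATE TABLE population_years (
    id INTEGER PRIMARY KEY, population REAL, year INTEGER, country_id INTEGER);
INSERT INTO countries VALUES (1, 'A', 'Europe'), (2, 'B', 'Europe'), (3, 'C', 'Asia');
INSERT INTO population_years (population, year, country_id) VALUES
    (10.0, 2000, 1), (12.0, 2010, 1),
    (30.0, 2000, 2), (32.0, 2010, 2),
    (50.0, 2000, 3), (60.0, 2010, 3);
""")

query = """
WITH avg_pop AS (
    SELECT c.continent,
           AVG(CASE WHEN py.year = 2000 THEN py.population END) AS avg_pop_2000,
           AVG(CASE WHEN py.year = 2010 THEN py.population END) AS avg_pop_2010
    FROM countries c
    JOIN population_years py ON py.country_id = c.id
    GROUP BY c.continent)
SELECT continent,
       ROUND((avg_pop_2010 - avg_pop_2000) / avg_pop_2000 * 100.0, 2) AS percent_diff
FROM avg_pop
ORDER BY continent
"""

rows = conn.execute(query).fetchall()
print(rows)  # [('Asia', 20.0), ('Europe', 10.0)]
```

With this sample data Europe averages 20 in 2000 and 22 in 2010, so the change is 10%, and Asia goes from 50 to 60, a 20% change.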
Another option is to repeat the expressions, with a little optimization:
select
    c.continent,
    round(avg(case when py.year = 2000 then py.population end), 2) avg_pop_2000,
    round(avg(case when py.year = 2010 then py.population end), 2) avg_pop_2010,
    -- percent change from the 2000 average to the 2010 average, rounded last
    round(
        100.0 * (
            avg(case when py.year = 2010 then py.population end)
            - avg(case when py.year = 2000 then py.population end)
        ) / avg(case when py.year = 2000 then py.population end),
        2
    ) percent_diff
from countries c
inner join population_years py on py.country_id = c.id
where py.year in (2000, 2010)  -- only the two years being compared
group by c.continent
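Side notes:
- Don't use single quotes for identifiers! They denote literal strings in standard SQL; in general, prefer identifiers that need no quoting. If you do need quotes, use standard double quotes ("), which SQLite recognizes.
- Pre-filtering the relevant years with a where clause makes the query more efficient.
- Use standard join syntax; the join condition goes in the on clause of the join, not in the where clause.
- Rounding first and then computing the percent difference is inaccurate; compute first, then round.
- Table aliases make the query easier to write and read.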
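Two of the notes above (quoting and rounding order) are easy to check for yourself. The sketch below is only an illustration: the avg_pop table and its values are hypothetical, chosen small (think of averages expressed in millions) so that the effect of rounding is visible. It uses Python's built-in sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical pre-aggregated averages; the column names need double quotes
# because they contain spaces.
conn.execute(
    'CREATE TABLE avg_pop (continent TEXT, "avg pop 2000" REAL, "avg pop 2010" REAL)'
)
conn.execute("INSERT INTO avg_pop VALUES ('Europe', 0.124, 0.136)")

# Single quotes in an expression: a string literal, not the column.
literal = conn.execute("SELECT 'avg pop 2000' FROM avg_pop").fetchone()[0]
# Double quotes: an identifier, so the column value comes back.
value = conn.execute('SELECT "avg pop 2000" FROM avg_pop').fetchone()[0]

# Rounding the inputs first versus rounding only the final result.
round_first, round_last = conn.execute("""
    SELECT
        ROUND((ROUND("avg pop 2010", 2) - ROUND("avg pop 2000", 2))
              / ROUND("avg pop 2000", 2) * 100.0, 2),
        ROUND(("avg pop 2010" - "avg pop 2000") / "avg pop 2000" * 100.0, 2)
    FROM avg_pop
""").fetchone()

print(literal, value)           # avg pop 2000 0.124
print(round_first, round_last)  # 16.67 9.68
```

With inputs this small, rounding first turns a change of about 9.68% into 16.67%. With populations in the millions of people the gap would be far smaller, but computing first and rounding last costs nothing.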