How to filter on an NTILE window function?
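Below is my attempt to list the pitchers at or above the 98th percentile of innings pitched. I am trying to filter on the result of the NTILE() window function, but the query throws an error.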
query = """
WITH pitching_cte AS
(
SELECT
player,
player_id,
season,
CAST(strftime('%Y', birth_date) AS TEXT) AS birth_year,
COUNT(*) AS num_seasons,
printf("%i.%i", outs/3, outs % 3) AS innings_pitched
FROM pitching
INNER JOIN player
USING(player_id)
GROUP BY player
)
SELECT innings_pitched,
birth_year,
player,
NTILE(100) OVER(PARTITION BY innings_pitched) pctl,
num_seasons
FROM pitching_cte
WHERE season = 2021 AND pctl >= 98;
"""
df = pd.read_sql(query, cnxn)
print(df.head(10))
When I execute this query, I get the following error:
DatabaseError: Execution failed on sql '
WITH pitching_cte AS
(
SELECT
player,
player_id,
season,
CAST(strftime('%Y', birth_date) AS TEXT) AS birth_year,
COUNT(*) AS num_seasons,
printf("%i.%i", outs/3, outs % 3) AS innings_pitched
FROM pitching
INNER JOIN player
USING(player_id)
GROUP BY player)
SELECT innings_pitched,
birth_year,
player,
NTILE(100) OVER(PARTITION BY innings_pitched) pctl,
num_seasons
FROM pitching_cte
WHERE season = 2021 AND pctl >= 98;
': misuse of aliased window function pctl
How can I filter on the generated percentile column?
Thanks.
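Normally, you cannot use an alias defined in the SELECT list in the WHERE clause.
Another restriction is that window functions cannot be used in the WHERE clause, because they are evaluated only after WHERE has filtered the rows, so the following statement fails as well: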
SELECT innings_pitched,
birth_year,
player,
NTILE(100) OVER(PARTITION BY innings_pitched) pctl,
num_seasons
FROM pitching_cte
WHERE season = 2021 AND (NTILE(100) OVER(PARTITION BY innings_pitched)) >= 98;
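Both failure modes can be reproduced on a toy table. The following is a minimal sketch, assuming Python's built-in sqlite3 module linked against SQLite 3.25 or newer (the first release with window functions); the `scores` table and its rows are made up for illustration and are not part of the original schema.

```python
import sqlite3

# Hypothetical toy table; window functions need SQLite 3.25 or newer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, outs INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("a", 10), ("b", 20), ("c", 30), ("d", 40)],
)

failing_queries = [
    # Filtering on the alias of a window function in the same SELECT.
    "SELECT name, NTILE(2) OVER (ORDER BY outs) AS tile "
    "FROM scores WHERE tile = 2",
    # Calling the window function directly inside WHERE.
    "SELECT name FROM scores WHERE NTILE(2) OVER (ORDER BY outs) = 2",
]

errors = []
for sql in failing_queries:
    try:
        conn.execute(sql).fetchall()
    except sqlite3.OperationalError as exc:
        # Both statements are rejected when the query is prepared.
        errors.append(str(exc))
        print(exc)
```

Through pandas, the same failure surfaces as the DatabaseError shown in the question.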
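The solution is to add another CTE layer that computes the window function, and then filter on its alias in the outer query: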
WITH pitching_cte AS (
SELECT
player,
player_id,
season,
CAST(strftime('%Y', birth_date) AS TEXT) AS birth_year,
COUNT(*) AS num_seasons,
printf("%i.%i", outs/3, outs % 3) AS innings_pitched
FROM pitching
INNER JOIN player
USING(player_id)
GROUP BY player
), cte AS (
SELECT innings_pitched,
birth_year,
player,
NTILE(100) OVER(PARTITION BY innings_pitched) pctl,
num_seasons
FROM pitching_cte
WHERE season = 2021
)
SELECT *
FROM cte
WHERE pctl >= 98;
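The same pattern can be tried end to end on a small in-memory database. This is only a sketch under stated assumptions: the `totals` table, its ten rows, and the five-bucket NTILE are invented for illustration, and it needs SQLite 3.25 or newer. It also orders the window by `outs` instead of partitioning by it, on the assumption that the goal is to rank pitchers against each other; with PARTITION BY alone and no ORDER BY, the bucket numbers are not tied to any ranking.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE totals (player TEXT, outs INTEGER)")
conn.executemany(
    "INSERT INTO totals VALUES (?, ?)",
    [(f"p{i}", i * 3) for i in range(1, 11)],  # 10 made-up pitchers
)

query = """
WITH ranked AS (
    -- Inner layer: compute the window function and give it an alias.
    SELECT player,
           outs,
           NTILE(5) OVER (ORDER BY outs) AS bucket
    FROM totals
)
-- Outer layer: the alias is now an ordinary column and can be filtered.
SELECT player, outs, bucket
FROM ranked
WHERE bucket = 5
ORDER BY outs;
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('p9', 27, 5), ('p10', 30, 5)]
```

With real data, the query string can be passed to pd.read_sql(query, cnxn) exactly as in the question.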
Related: Why no windowed functions in where clauses?