Greatest n per group optimization
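How can I optimize this SQLite query? It currently takes 31 seconds to run. I want to show the top 10 stock gainers and the top 10 stock losers on a web application. The table has 2 million rows and should grow slightly as new price data becomes available every day.
If this isn't possible, I could create a scheduled task that caches these results in a temporary database table or a temporary file. It seems like extra work, but I can do that if necessary.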
WITH todayPrices AS (
SELECT * FROM (
SELECT *, row_number() OVER (
PARTITION BY CompanyID
ORDER BY Date DESC
) AS rn
FROM DimCompanyPrice
) a
WHERE rn = 1
ORDER BY CompanyID ASC
),
yestPrices AS (
SELECT * FROM (
SELECT *, row_number() OVER (
PARTITION BY CompanyID
ORDER BY Date DESC
) AS rn
FROM DimCompanyPrice
) a
WHERE rn = 2
ORDER BY CompanyID ASC
)
SELECT todayPrices.CompanyID, 100.0 * (todayPrices.CloseAdjusted-yestPrices.CloseAdjusted) / yestPrices.CloseAdjusted AS gain
FROM todayPrices
INNER JOIN yestPrices on todayPrices.CompanyID=yestPrices.CompanyID
ORDER BY gain DESC
LIMIT 10
I'd like some input on the best way to make this perform better. Any comments are appreciated.
Result of EXPLAIN QUERY PLAN:
id parent notused detail
3 0 0 MATERIALIZE 2
5 3 0 CO-ROUTINE 1
8 5 0 CO-ROUTINE 6
11 8 0 SCAN TABLE DimCompanyPrice
36 8 0 USE TEMP B-TREE FOR ORDER BY
62 5 0 SCAN SUBQUERY 6
134 3 0 SCAN SUBQUERY 1 AS a
163 3 0 USE TEMP B-TREE FOR ORDER BY
174 0 0 MATERIALIZE 4
176 174 0 CO-ROUTINE 3
179 176 0 CO-ROUTINE 7
182 179 0 SCAN TABLE DimCompanyPrice
207 179 0 USE TEMP B-TREE FOR ORDER BY
233 176 0 SCAN SUBQUERY 7
305 174 0 SCAN SUBQUERY 3 AS a
334 174 0 USE TEMP B-TREE FOR ORDER BY
345 0 0 SCAN SUBQUERY 4
357 0 0 SEARCH SUBQUERY 2 USING AUTOMATIC COVERING INDEX (CompanyID=?)
382 0 0 USE TEMP B-TREE FOR ORDER BY
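I would suggest a query that uses conditional aggregation to compute the gain column.
For this, you need only a single table scan to rank each company's rows by date, then filter out the rows ranked greater than 2, and finally aggregate: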
SELECT CompanyId,
100 * (MAX(CASE WHEN rn = 1 THEN CloseAdjusted END) / MAX(CASE WHEN rn = 2 THEN CloseAdjusted END) - 1) gain
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY CompanyId ORDER BY Date DESC) rn
FROM DimCompanyPrice
)
WHERE rn <= 2
GROUP BY CompanyId
ORDER BY gain DESC;
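To try the conditional aggregation approach on a small data set, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are made up for illustration. The `HAVING COUNT(*) = 2` clause, the `100.0` literal and the `LIMIT 10` are my additions, not part of the query above: the first drops companies that have only one price row (their gain would otherwise be NULL), the second guards against integer division in case CloseAdjusted is stored as an integer, and the third matches the top 10 from the question. Window functions need SQLite 3.25 or later.

```python
import sqlite3

# Hypothetical sample data; column names follow the queries above.
rows = [
    (1, "2021-01-04", 100.0),
    (1, "2021-01-05", 110.0),
    (1, "2021-01-06", 121.0),
    (2, "2021-01-05", 50.0),
    (2, "2021-01-06", 45.0),
    (3, "2021-01-06", 10.0),  # only one price row, so no gain can be computed
]

# One scan ranks each company's rows by date; rn = 1 is the latest price,
# rn = 2 the one before it. Both are folded into one row per company.
QUERY = """
SELECT CompanyID,
       100.0 * (MAX(CASE WHEN rn = 1 THEN CloseAdjusted END)
              / MAX(CASE WHEN rn = 2 THEN CloseAdjusted END) - 1) AS gain
FROM (
    SELECT CompanyID, CloseAdjusted,
           ROW_NUMBER() OVER (PARTITION BY CompanyID ORDER BY Date DESC) AS rn
    FROM DimCompanyPrice
)
WHERE rn <= 2
GROUP BY CompanyID
HAVING COUNT(*) = 2      -- assumption: skip companies with a single price row
ORDER BY gain DESC
LIMIT 10
"""


def top_gainers(conn):
    """Return up to 10 (CompanyID, gain) tuples, largest gain first."""
    return conn.execute(QUERY).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DimCompanyPrice (CompanyID INTEGER, Date TEXT, CloseAdjusted REAL)"
)
conn.executemany("INSERT INTO DimCompanyPrice VALUES (?, ?, ?)", rows)
result = top_gainers(conn)
for company_id, gain in result:
    print(company_id, round(gain, 2))
```

For the top 10 losers, the same query with `ORDER BY gain ASC` should work.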
But if by today and yesterday you actually mean the current date and the previous date, you can also do it with this query:
SELECT CompanyId,
100 * (MAX(CASE WHEN Date = CURRENT_DATE THEN CloseAdjusted END) / MAX(CASE WHEN Date = Date(CURRENT_DATE, '-1 day') THEN CloseAdjusted END) - 1) gain
FROM DimCompanyPrice
WHERE Date >= Date(CURRENT_DATE, '-1 day')
GROUP BY CompanyId;
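The date-based variant can be checked the same way. In this sketch I replaced CURRENT_DATE with a bound parameter `:today` so the example gives the same result whenever it is run; that substitution, the sample rows and the `100.0` literal are assumptions of mine. It also assumes Date is stored as ISO-8601 text (YYYY-MM-DD), since otherwise the comparison with `date(...)` would not match.

```python
import sqlite3

# Hypothetical sample data; the 2021-01-04 row should be ignored by the filter.
rows = [
    (1, "2021-01-04", 999.0),
    (1, "2021-01-05", 200.0),
    (1, "2021-01-06", 210.0),
    (2, "2021-01-05", 80.0),
    (2, "2021-01-06", 60.0),
]

# Same shape as the query above, with :today standing in for CURRENT_DATE.
QUERY = """
SELECT CompanyID,
       100.0 * (MAX(CASE WHEN Date = :today THEN CloseAdjusted END)
              / MAX(CASE WHEN Date = date(:today, '-1 day') THEN CloseAdjusted END)
              - 1) AS gain
FROM DimCompanyPrice
WHERE Date >= date(:today, '-1 day')
GROUP BY CompanyID
ORDER BY gain DESC
"""


def gains_for(conn, today):
    """Return (CompanyID, gain) tuples comparing `today` with the day before."""
    return conn.execute(QUERY, {"today": today}).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DimCompanyPrice (CompanyID INTEGER, Date TEXT, CloseAdjusted REAL)"
)
conn.executemany("INSERT INTO DimCompanyPrice VALUES (?, ?, ?)", rows)
result = gains_for(conn, "2021-01-06")
for company_id, gain in result:
    print(company_id, round(gain, 2))
```

Because the WHERE clause only touches the last two calendar days, this version reads far fewer rows, but it returns a NULL gain for any company that has no price on one of those two dates.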