What would cause a MySQL query based on MONTH() and YEAR() to be faster than querying against its numeric equivalent directly?
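I tried to speed up a SQL query in MySQL 5.7 by replacing the YEAR() and MONTH() functions with their numeric equivalents, computed at insert time. Specifically, I added the columns reportMonth and reportYear, both bigint(20), for this purpose.
Interestingly, this approach is much slower. Why? Shouldn't a query that runs fewer functions be faster?
This takes about 12 seconds to complete (using the YEAR() and MONTH() functions):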
SELECT
ProductTitle AS 'ProductTitle',
YEAR(ReportPeriodEndDay) AS 'Year',
SUM(CASE WHEN MONTH(ReportPeriodEndDay) = 1 THEN OrderedRevenue END) AS 'Jan',
SUM(CASE WHEN MONTH(ReportPeriodEndDay) = 2 THEN OrderedRevenue END) AS 'Feb',
SUM(CASE WHEN MONTH(ReportPeriodEndDay) = 3 THEN OrderedRevenue END) AS 'Mar',
SUM(CASE WHEN MONTH(ReportPeriodEndDay) = 4 THEN OrderedRevenue END) AS 'Apr',
SUM(CASE WHEN MONTH(ReportPeriodEndDay) = 5 THEN OrderedRevenue END) AS 'May',
SUM(CASE WHEN MONTH(ReportPeriodEndDay) = 6 THEN OrderedRevenue END) AS 'Jun',
SUM(CASE WHEN MONTH(ReportPeriodEndDay) = 7 THEN OrderedRevenue END) AS 'Jul',
SUM(CASE WHEN MONTH(ReportPeriodEndDay) = 8 THEN OrderedRevenue END) AS 'Aug',
SUM(CASE WHEN MONTH(ReportPeriodEndDay) = 9 THEN OrderedRevenue END) AS 'Sep',
SUM(CASE WHEN MONTH(ReportPeriodEndDay) = 10 THEN OrderedRevenue END) AS 'Oct',
SUM(CASE WHEN MONTH(ReportPeriodEndDay) = 11 THEN OrderedRevenue END) AS 'Nov',
SUM(CASE WHEN MONTH(ReportPeriodEndDay) = 12 THEN OrderedRevenue END) AS 'Dec',
SUM(OrderedRevenue) AS 'TOTAL'
FROM
`sales_diagnostic_summary_orderedrevenuelevel`
GROUP BY ProductTitle, Year
WITH ROLLUP;
Result of EXPLAIN:
# id, select_type, table, partitions, type, possible_keys, key, key_len, ref, rows, filtered, Extra
1, SIMPLE, sales_diagnostic_summary_orderedrevenuelevel, , ALL, , , , , 745140, 100.00, Using temporary; Using filesort
This takes over 120 seconds (using the numeric equivalents):
SELECT
ProductTitle AS 'ProductTitle',
reportYear AS 'Year',
SUM(CASE WHEN reportMonth = 1 THEN OrderedRevenue END) AS 'Jan',
SUM(CASE WHEN reportMonth = 2 THEN OrderedRevenue END) AS 'Feb',
SUM(CASE WHEN reportMonth = 3 THEN OrderedRevenue END) AS 'Mar',
SUM(CASE WHEN reportMonth = 4 THEN OrderedRevenue END) AS 'Apr',
SUM(CASE WHEN reportMonth = 5 THEN OrderedRevenue END) AS 'May',
SUM(CASE WHEN reportMonth = 6 THEN OrderedRevenue END) AS 'Jun',
SUM(CASE WHEN reportMonth = 7 THEN OrderedRevenue END) AS 'Jul',
SUM(CASE WHEN reportMonth = 8 THEN OrderedRevenue END) AS 'Aug',
SUM(CASE WHEN reportMonth = 9 THEN OrderedRevenue END) AS 'Sep',
SUM(CASE WHEN reportMonth = 10 THEN OrderedRevenue END) AS 'Oct',
SUM(CASE WHEN reportMonth = 11 THEN OrderedRevenue END) AS 'Nov',
SUM(CASE WHEN reportMonth = 12 THEN OrderedRevenue END) AS 'Dec',
SUM(OrderedRevenue) AS 'TOTAL'
FROM
`sales_diagnostic_summary_orderedrevenuelevel`
GROUP BY ProductTitle, Year
WITH ROLLUP;
Result of EXPLAIN:
# id, select_type, table, partitions, type, possible_keys, key, key_len, ref, rows, filtered, Extra
1, SIMPLE, sales_diagnostic_summary_orderedrevenuelevel, , ALL, , , , , 745140, 100.00, Using filesort
Table schema via DESCRIBE:
# Field, Type, Null, Key, Default, Extra
ASIN, text, YES, MUL, ,
ProductTitle, text, YES, , ,
OrderedRevenue, double, YES, , ,
OrderedRevenuePercentOfTotal, double, YES, , ,
OrderedRevenuePriorPeriod, double, YES, , ,
OrderedRevenueLastYear, double, YES, , ,
OrderedUnits, double, YES, , ,
OrderedUnitsPercentOfTotal, double, YES, , ,
OrderedUnitsPriorPeriod, double, YES, , ,
OrderedUnitsLastYear, double, YES, , ,
SubcategorySalesRank, bigint(20), YES, , ,
SubcategoryBetterWorse, double, YES, , ,
AverageSalesPrice, double, YES, , ,
AverageSalesPricePriorPeriod, double, YES, , ,
ChangeInGVPriorPeriod, double, YES, , ,
ChangeInGVLastYear, double, YES, , ,
RepOOS, double, YES, , ,
RepOOSPercentOfTotal, double, YES, , ,
RepOOSPriorPeriod, double, YES, , ,
LBBPrice, double, YES, , ,
ReportPeriodStartDay, datetime, YES, , ,
ReportPeriodEndDay, datetime, YES, , ,
ReportDownloadDate, datetime, YES, , ,
ReportPeriod, text, YES, , ,
ReportFilename, text, YES, , ,
marketplace, text, YES, , ,
vendorId, text, YES, , ,
reportYear, bigint(20), YES, MUL, ,
reportMonth, bigint(20), YES, MUL, ,
reportWeek, bigint(20), YES, , ,
reportQuarter, bigint(20), YES, , ,
reportDayOfWeek, bigint(20), YES, , ,
reportDayOfYear, bigint(20), YES, , ,
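It seems that some optimization is linked to the YEAR function, one that DESCRIBE knows nothing about (which is logical).
The way I would implement it: when the YEAR function is called and the engine sees that MONTH is called as well, it does some extra binning on the month value. That part of the work is then already done, which beats going through a CASE on an unrelated field (calling the field reportMonth does not make it related).
Since there are never more than 12 months in a year, this seems a worthwhile optimization - it would not use much memory and the potential payoff is considerable.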
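The explanation above is a guess, and the two EXPLAIN outputs in the question differ only in "Using temporary; Using filesort" versus "Using filesort". One way to gather evidence is to compare the session counters after each query. The following is a minimal sketch, assuming you have the RELOAD privilege needed for FLUSH STATUS; the report query is shortened to a single month purely for brevity.

```sql
-- Reset the session counters so the numbers below reflect only the next query.
FLUSH STATUS;

-- Run one of the two report queries here (shortened to one month for brevity).
SELECT ProductTitle,
       reportYear AS 'Year',
       SUM(CASE WHEN reportMonth = 1 THEN OrderedRevenue END) AS 'Jan',
       SUM(OrderedRevenue) AS 'TOTAL'
FROM `sales_diagnostic_summary_orderedrevenuelevel`
GROUP BY ProductTitle, Year
WITH ROLLUP;

-- Sorting work: Sort_merge_passes above zero means the sort needed merge passes.
SHOW SESSION STATUS LIKE 'Sort%';

-- Temporary tables: Created_tmp_disk_tables counts the ones that went to disk.
SHOW SESSION STATUS LIKE 'Created_tmp%';
```

Repeating this for the YEAR()/MONTH() version and comparing the counters should show whether the slow query spends its time in a large on-disk sort rather than in evaluating the CASE expressions.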
If the number of sales per product is large, you could try grouping directly by reportYear and reportMonth, then running the CASE pivot in a wrapping SELECT. Something like:
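SELECT
    ProductTitle,
    reportYear AS `Year`,
    SUM(IF(reportMonth = 1, OrderedRevenue, 0)) AS 'Jan',
    -- ... (months 2 through 11 follow the same pattern)
    SUM(IF(reportMonth = 12, OrderedRevenue, 0)) AS 'Dec',
    SUM(OrderedRevenue) AS 'TOTAL'
FROM (
    -- First pass: at most one row per product, year and month
    SELECT ProductTitle,
           reportYear,
           reportMonth,
           SUM(OrderedRevenue) AS OrderedRevenue
    FROM
        `sales_diagnostic_summary_orderedrevenuelevel`
    GROUP BY ProductTitle, reportYear, reportMonth
) AS firstGrouping
-- Second pass: pivot the monthly rows into columns
GROUP BY ProductTitle, reportYear
WITH ROLLUP;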
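Very likely, having an index such as
CREATE INDEX myIndex ON
    sales_diagnostic_summary_orderedrevenuelevel(ProductTitle(191),
    reportYear, reportMonth, OrderedRevenue);
will cost something during UPDATE/DELETE/INSERTs but should improve this kind of SELECT. Note that ProductTitle is a TEXT column, so MySQL requires a prefix length in the index definition; the 191 above is an assumed value, and converting the column to a VARCHAR may serve the index better. You may also want to try the DATE version of the double-select, with matching indexing, on for size.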
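The "DATE version" of the double-select could look roughly like the sketch below. It assumes the same table as in the question; the aliases yr, mon, revenue and monthly are made up for illustration. The inner query calls YEAR() and MONTH() once per row, and the outer query pivots the much smaller monthly result.

```sql
SELECT
    ProductTitle,
    yr AS `Year`,
    SUM(IF(mon = 1,  revenue, 0)) AS `Jan`,
    SUM(IF(mon = 2,  revenue, 0)) AS `Feb`,
    SUM(IF(mon = 3,  revenue, 0)) AS `Mar`,
    SUM(IF(mon = 4,  revenue, 0)) AS `Apr`,
    SUM(IF(mon = 5,  revenue, 0)) AS `May`,
    SUM(IF(mon = 6,  revenue, 0)) AS `Jun`,
    SUM(IF(mon = 7,  revenue, 0)) AS `Jul`,
    SUM(IF(mon = 8,  revenue, 0)) AS `Aug`,
    SUM(IF(mon = 9,  revenue, 0)) AS `Sep`,
    SUM(IF(mon = 10, revenue, 0)) AS `Oct`,
    SUM(IF(mon = 11, revenue, 0)) AS `Nov`,
    SUM(IF(mon = 12, revenue, 0)) AS `Dec`,
    SUM(revenue) AS `TOTAL`
FROM (
    -- First pass: collapse the raw rows to one row per product, year and month.
    SELECT
        ProductTitle,
        YEAR(ReportPeriodEndDay)  AS yr,
        MONTH(ReportPeriodEndDay) AS mon,
        SUM(OrderedRevenue)       AS revenue
    FROM `sales_diagnostic_summary_orderedrevenuelevel`
    GROUP BY ProductTitle, yr, mon
) AS monthly
-- Second pass: pivot months into columns, keeping the ROLLUP subtotals.
GROUP BY ProductTitle, yr
WITH ROLLUP;
```

One difference from the original query: SUM(IF(..., 0)) returns 0 for a month with no rows, whereas SUM(CASE WHEN ... END) returns NULL.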
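Also, I see no reason to store year, month, and week as BIGINT. It will not make much difference in performance or storage, but it still smells a bit off to me.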
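If you do want to narrow those columns, something along these lines should work. This is a sketch that assumes every existing value fits the new ranges (TINYINT UNSIGNED holds 0 to 255, SMALLINT UNSIGNED holds 0 to 65535); the statement rebuilds the table, so try it on a copy first.

```sql
ALTER TABLE `sales_diagnostic_summary_orderedrevenuelevel`
    MODIFY reportYear      SMALLINT UNSIGNED NULL,  -- four-digit years
    MODIFY reportMonth     TINYINT  UNSIGNED NULL,  -- 1 to 12
    MODIFY reportWeek      TINYINT  UNSIGNED NULL,  -- 0 to 53
    MODIFY reportQuarter   TINYINT  UNSIGNED NULL,  -- 1 to 4
    MODIFY reportDayOfWeek TINYINT  UNSIGNED NULL,  -- small single-digit range
    MODIFY reportDayOfYear SMALLINT UNSIGNED NULL;  -- 1 to 366, too wide for TINYINT
```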