Grouping in a MySQL view scans all partitions

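I want to create a view in MySQL so that data-analysis users can easily filter a large amount of data. However, whenever I create a view that contains any grouping, the entire view is scanned, which makes the view useless in terms of performance.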

A simple example

Value table: about 3.5 billion rows, partitioned by month

SELECT
    Timestamp,
    DeviceId,
    SUM(Entry)
FROM Value v
WHERE DeviceId = 123456 AND Timestamp >= '2020-08-01' AND Timestamp <= '2020-08-30'
GROUP BY Timestamp, DeviceId;

Using EXPLAIN, I can see that the query uses the primary key (DeviceId, Timestamp), scans the August partition, and returns its values in 63 ms. The select type is 'SIMPLE'.

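When I create this query as a view, omitting the WHERE clause, EXPLAIN shows that running

SELECT * FROM vTest WHERE DeviceId = 123456 AND Timestamp >= '2020-08-01' AND Timestamp <= '2020-08-30'

scans all partitions. The select type is DERIVED, and the primary key is recognized as a possible key but is not used. This makes the query take "forever".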
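The view definition itself is not shown above. A minimal sketch, assuming vTest is simply the earlier query with its WHERE clause removed (the exact column list is an assumption), would look like this:

```sql
-- Presumed definition of vTest: the grouped query from above, without the WHERE clause.
-- This is an assumption; the original definition was not posted.
CREATE VIEW vTest AS
SELECT
    Timestamp,
    DeviceId,
    SUM(Entry)
FROM Value v
GROUP BY Timestamp, DeviceId;
```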

If I create a view without grouping, this problem does not occur, and the view uses the correct indexes/keys to scan the underlying table.

Is it possible to use grouping in a view and 'pass the where clause to the underlying table', or do users of the view always need to perform the grouping themselves?

GCP-managed MySQL 5.7.25

MySQL can use two algorithms to process a view, MERGE and TEMPTABLE; the ALGORITHM setting can also be left UNDEFINED:

For MERGE, the text of a statement that refers to the view and the view definition are merged such that parts of the view definition replace corresponding parts of the statement.

For TEMPTABLE, the results from the view are retrieved into a temporary table, which then is used to execute the statement.

For UNDEFINED, MySQL chooses which algorithm to use. It prefers MERGE over TEMPTABLE if possible, because MERGE is usually more efficient and because a view cannot be updatable if a temporary table is used.
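To see which setting a view ends up with, the ALGORITHM clause of CREATE VIEW can be combined with SHOW WARNINGS and SHOW CREATE VIEW. The following is a sketch only: the view name vTestMerge and the alias EntrySum are hypothetical, and the table and columns are taken from the question. According to the manual, requesting MERGE for a view that can only be processed with a temporary table produces a warning, and the algorithm is stored as UNDEFINED.

```sql
-- Explicitly request MERGE for a grouped view.
-- vTestMerge and EntrySum are hypothetical names used for illustration.
CREATE ALGORITHM = MERGE VIEW vTestMerge AS
SELECT Timestamp, DeviceId, SUM(Entry) AS EntrySum
FROM Value
GROUP BY Timestamp, DeviceId;

-- Expected: a warning saying the merge algorithm cannot be used for this view.
SHOW WARNINGS;

-- Expected: the stored definition reports ALGORITHM=UNDEFINED rather than MERGE.
SHOW CREATE VIEW vTestMerge;
```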

According to the "Restrictions on Views" section of the MySQL manual:

Indexes can be used for views processed using the merge algorithm. However, a view that is processed with the temptable algorithm is unable to take advantage of indexes on its underlying tables (although indexes can be used during generation of the temporary tables).

The SELECT statement used to create the view contains a GROUP BY clause. According to section 8.2.2.4, "Optimizing Derived Tables, View References, and Common Table Expressions with Merging or Materialization", of the MySQL manual:

Constructs that prevent merging are the same for derived tables, common table expressions, and view references:

Aggregate functions or window functions (SUM(), MIN(), MAX(), COUNT(), and so forth)

DISTINCT

GROUP BY

HAVING

LIMIT

UNION or UNION ALL

Subqueries in the select list

Assignments to user variables

References only to literal values (in this case, there is no underlying table)

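Because of the GROUP BY clause, the TEMPTABLE algorithm is used for the view. MySQL therefore first materializes the view into a temporary table without pushing down the filter conditions from the outer query, which is why you see the wider scan in EXPLAIN. The filtering then happens on the temporary table, which cannot use the indexes on the underlying table.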
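One way to confirm this on your own server is to compare the plan of the view query with that of an equivalent derived table. This is a sketch for MySQL 5.7 using the table and filter from the question; the aliases t and EntrySum are hypothetical. Both plans would be expected to show a DERIVED row whose partitions column lists every partition.

```sql
-- Plan for the grouped view: look for select_type = DERIVED and the full partition list.
EXPLAIN
SELECT * FROM vTest
WHERE DeviceId = 123456 AND Timestamp >= '2020-08-01' AND Timestamp <= '2020-08-30';

-- Roughly what the TEMPTABLE algorithm amounts to: the grouped query is
-- materialized in full, and only then is the outer WHERE applied.
EXPLAIN
SELECT *
FROM (
    SELECT Timestamp, DeviceId, SUM(Entry) AS EntrySum
    FROM Value
    GROUP BY Timestamp, DeviceId
) AS t
WHERE t.DeviceId = 123456 AND t.Timestamp >= '2020-08-01' AND t.Timestamp <= '2020-08-30';
```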

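You really need to be aware of whether MySQL uses the merge or the temptable approach for a view, because the view's behavior depends heavily on that choice.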
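Given that the question reports a view without grouping uses the right keys, one possible workaround is to keep the view free of aggregation and let the caller group. This is a sketch under that assumption; the view name vValueRaw and the alias EntrySum are hypothetical.

```sql
-- View without GROUP BY or aggregates, so the MERGE algorithm can apply.
-- vValueRaw is a hypothetical name.
CREATE VIEW vValueRaw AS
SELECT Timestamp, DeviceId, Entry
FROM Value;

-- The caller supplies both the filter and the grouping. With a merged view,
-- the plan would be expected to show select_type = SIMPLE and only the
-- partitions that match the Timestamp range.
EXPLAIN
SELECT Timestamp, DeviceId, SUM(Entry) AS EntrySum
FROM vValueRaw
WHERE DeviceId = 123456 AND Timestamp >= '2020-08-01' AND Timestamp <= '2020-08-30'
GROUP BY Timestamp, DeviceId;
```

The trade-off is that the grouping logic lives in each query rather than in the view, so users of the view have to write the GROUP BY themselves.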