8.2.2.4 通过合并或实现优化派生表和视图引用

MySQL 8.0 参考手册 / 第8章优化 / 8.2 优化SQL语句 / 8.2.2 优化子查询、派生表、视图引用和公用表表达式 / 8.2.2.4 通过合并或实现优化派生表和视图引用

8.2.2.4 通过合并或实现优化派生表和视图引用

优化器可以使用两种策略（也适用于视图引用）处理派生表引用：

将派生表合并到外部查询块中
将派生表具体化为内部临时表

示例 1：

SELECT * FROM (SELECT * FROM t1) AS derived_t1;

通过合并派生表 derived_t1，该查询的执行类似于：

SELECT * FROM t1;

示例 2：

SELECT *
  FROM t1 JOIN (SELECT t2.f1 FROM t2) AS derived_t2 ON t1.f2=derived_t2.f1
  WHERE t1.f1 > 0;

通过合并派生表 derived_t2，该查询的执行类似于：

SELECT t1.*, t2.f1
  FROM t1 JOIN t2 ON t1.f2=t2.f1
  WHERE t1.f1 > 0;

使用物化，derived_t1并且 derived_t2在各自的查询中都被视为一个单独的表。

优化器以相同的方式处理派生表和视图引用：它尽可能避免不必要的物化，这可以将条件从外部查询下推到派生表并生成更有效的执行计划。（有关示例，请参阅第 8.2.2.2 节，“使用物化优化子查询”。）

如果合并会导致引用超过 61 个基表的外部查询块，则优化器会选择具体化。

ORDER BY如果这些条件都为真，优化器将派生表或视图引用中的子句传播到外部查询块：

The outer query is not grouped or aggregated.
The outer query does not specify DISTINCT, HAVING, or ORDER BY.
The outer query has this derived table or view reference as the only source in the FROM clause.

Otherwise, the optimizer ignores the ORDER BY clause.

The following means are available to influence whether the optimizer attempts to merge derived tables and view references into the outer query block:

可以使用系统变量的derived_merge标志optimizer_switch ，假设没有其他规则阻止合并。请参阅第 8.9.2 节，“可切换优化”。默认情况下，启用该标志以允许合并。禁用该标志可防止合并并避免 ER_UPDATE_TABLE_USED 错误。
该derived_merge标志也适用于不包含 ALGORITHM子句的视图。因此，如果 ER_UPDATE_TABLE_USED使用与子查询等效的表达式的视图引用发生错误，添加 ALGORITHM=TEMPTABLE到视图定义会阻止合并并优先于 derived_merge值。
可以通过在子查询中使用任何阻止合并的构造来禁用合并，尽管它们对实现的影响并不那么明确。防止合并的构造对于派生表和视图引用是相同的：
- Aggregate functions (SUM(), MIN(), MAX(), COUNT(), and so forth)
- DISTINCT
- GROUP BY
- HAVING
- LIMIT
- UNION or UNION ALL
- Subqueries in the select list
- Assignments to user variables
- Refererences only to literal values (in this case, there is no underlying table)

The derived_merge flag also applies to views that contain no ALGORITHM clause. Thus, if an ER_UPDATE_TABLE_USED error occurs for a view reference that uses an expression equivalent to the subquery, adding ALGORITHM=TEMPTABLE to the view definition prevents merging and takes precedence over the current derived_merge value.

If the optimizer chooses the materialization strategy rather than merging for a derived table, it handles the query as follows:

The optimizer postpones derived table materialization until its contents are needed during query execution. This improves performance because delaying materialization may result in not having to do it at all. Consider a query that joins the result of a derived table to another table: If the optimizer processes that other table first and finds that it returns no rows, the join need not be carried out further and the optimizer can completely skip materializing the derived table.
During query execution, the optimizer may add an index to a derived table to speed up row retrieval from it.

Consider the following EXPLAIN statement, for a SELECT query that contains a derived table:

EXPLAIN SELECT * FROM (SELECT * FROM t1) AS derived_t1;

The optimizer avoids materializing the derived table by delaying it until the result is needed during SELECT execution. In this case, the query is not executed (because it occurs in an EXPLAIN statement), so the result is never needed.

Even for queries that are executed, delay of derived table materialization may enable the optimizer to avoid materialization entirely. When this happens, query execution is quicker by the time needed to perform materialization. Consider the following query, which joins the result of a derived table to another table:

SELECT *
  FROM t1 JOIN (SELECT t2.f1 FROM t2) AS derived_t2
          ON t1.f2=derived_t2.f1
  WHERE t1.f1 > 0;

If the optimization processes t1 first and the WHERE clause produces an empty result, the join must necessarily be empty and the derived table need not be materialized.

For cases when a derived table requires materialization, the optimizer may add an index to the materialized table to speed up access to it. If such an index enables ref access to the table, it can greatly reduce amount of data read during query execution. Consider the following query:

SELECT *
 FROM t1 JOIN (SELECT DISTINCT f1 FROM t2) AS derived_t2
         ON t1.f1=derived_t2.f1;

The optimizer constructs an index over column f1 from derived_t2 if doing so would enable use of ref access for the lowest cost execution plan. After adding the index, the optimizer can treat the materialized derived table the same as a regular table with an index, and it benefits similarly from the generated index. The overhead of index creation is negligible compared to the cost of query execution without the index. If ref access would result in higher cost than some other access method, the optimizer creates no index and loses nothing.

对于优化器跟踪输出，合并的派生表或视图引用未显示为节点。只有它的基础表出现在顶级查询的计划中。