Documentation Home
MySQL 8.0 参考手册  / 第8章优化  / 8.3 优化和索引  /  8.3.10 Optimizer Use of Generated Column Indexes

8.3.10 Optimizer Use of Generated Column Indexes

MySQL supports indexes on generated columns. For example:

CREATE TABLE t1 (f1 INT, gc INT AS (f1 + 1) STORED, INDEX (gc));

The generated column, gc, is defined as the expression f1 + 1. The column is also indexed and the optimizer can take that index into account during execution plan construction. In the following query, the WHERE clause refers to gc and the optimizer considers whether the index on that column yields a more efficient plan:

SELECT * FROM t1 WHERE gc > 9;

优化器可以使用生成的列上的索引来生成执行计划,即使在查询中没有按名称直接引用这些列。如果 WHERE, ORDER BY, or GROUP BY子句引用与某些索引生成列的定义相匹配的表达式,则会发生这种情况。以下查询不直接引用,gc 但确实使用了与 的定义相匹配的表达式 gc

SELECT * FROM t1 WHERE f1 + 1 > 9;

优化器识别出该表达式f1 + 1与 的定义匹配gc并且已gc被索引,因此它在执行计划构造期间考虑该索引。你可以看到这个使用 EXPLAIN

mysql> EXPLAIN SELECT * FROM t1 WHERE f1 + 1 > 9\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
   partitions: NULL
         type: range
possible_keys: gc
          key: gc
      key_len: 5
          ref: NULL
         rows: 1
     filtered: 100.00
        Extra: Using index condition

实际上,优化器已将表达式替换为与表达式f1 + 1匹配的生成列的名称。EXPLAIN 这在以下显示的扩展信息中可用的重写查询中也很明显SHOW WARNINGS

mysql> SHOW WARNINGS\G
*************************** 1. row ***************************
  Level: Note
   Code: 1003
Message: /* select#1 */ select `test`.`t1`.`f1` AS `f1`,`test`.`t1`.`gc`
         AS `gc` from `test`.`t1` where (`test`.`t1`.`gc` > 9)

以下限制和条件适用于优化器对生成的列索引的使用:

  • For a query expression to match a generated column definition, the expression must be identical and it must have the same result type. For example, if the generated column expression is f1 + 1, the optimizer does not recognize a match if the query uses 1 + f1, or if f1 + 1 (an integer expression) is compared with a string.

  • The optimization applies to these operators: =, <, <=, >, >=, BETWEEN, and IN().

    For operators other than BETWEEN and IN(), either operand can be replaced by a matching generated column. For BETWEEN and IN(), only the first argument can be replaced by a matching generated column, and the other arguments must have the same result type. BETWEEN and IN() are not yet supported for comparisons involving JSON values.

  • The generated column must be defined as an expression that contains at least a function call or one of the operators mentioned in the preceding item. The expression cannot consist of a simple reference to another column. For example, gc INT AS (f1) STORED consists only of a column reference, so indexes on gc are not considered.

  • For comparisons of strings to indexed generated columns that compute a value from a JSON function that returns a quoted string, JSON_UNQUOTE() is needed in the column definition to remove the extra quotes from the function value. (For direct comparison of a string to the function result, the JSON comparator handles quote removal, but this does not occur for index lookups.) For example, instead of writing a column definition like this:

    doc_name TEXT AS (JSON_EXTRACT(jdoc, '$.name')) STORED

    Write it like this:

    doc_name TEXT AS (JSON_UNQUOTE(JSON_EXTRACT(jdoc, '$.name'))) STORED

    With the latter definition, the optimizer can detect a match for both of these comparisons:

    ... WHERE JSON_EXTRACT(jdoc, '$.name') = 'some_string' ...
    ... WHERE JSON_UNQUOTE(JSON_EXTRACT(jdoc, '$.name')) = 'some_string' ...

    如果不在JSON_UNQUOTE()列定义中,优化器只会检测第一个比较的匹配项。

  • 如果优化器未能选择所需的索引,则可以使用索引提示强制优化器做出不同的选择。