13.1.18.8 Secondary Indexes and Generated Columns

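InnoDB supports secondary indexes on virtual generated columns. Other index types are not supported. A secondary index defined on a virtual column is sometimes referred to as a “virtual index”.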

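A secondary index may be created on one or more virtual columns, or on a combination of virtual columns and regular columns or stored generated columns. Secondary indexes that include virtual columns may be defined as UNIQUE.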
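
As an illustrative sketch (the table, column, and index names here are hypothetical and do not come from this manual), the following statement defines a single-column index on a virtual column, a UNIQUE index on a second virtual column, and a composite index that combines a regular column with a virtual column:

CREATE TABLE t_virt (
  first_name  VARCHAR(20),
  last_name   VARCHAR(20),
  email       VARCHAR(64),
  -- Virtual generated columns: computed from the base columns, not stored in the row
  full_name   VARCHAR(41) GENERATED ALWAYS AS (CONCAT(first_name, ' ', last_name)) VIRTUAL,
  email_lower VARCHAR(64) GENERATED ALWAYS AS (LOWER(email)) VIRTUAL,
  INDEX idx_full_name (full_name),              -- index on one virtual column
  UNIQUE INDEX idx_email_lower (email_lower),   -- UNIQUE index on a virtual column
  INDEX idx_last_full (last_name, full_name)    -- regular column plus virtual column
) ENGINE=InnoDB;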

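When a secondary index is created on a virtual generated column, generated column values are materialized in the records of the index. If the index is a covering index (one that includes all the columns retrieved by a query), generated column values are retrieved from the materialized values in the index structure instead of being computed “on the fly”.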
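
A minimal sketch of the covering-index case, using a hypothetical table t_cover that is not part of this manual. The query retrieves only the indexed virtual column, so the index alone can satisfy it:

CREATE TABLE t_cover (
  a INT,
  b INT,
  v INT GENERATED ALWAYS AS (a + b) VIRTUAL,
  INDEX idx_v (v)
) ENGINE=InnoDB;

-- Only v is selected and filtered on, so idx_v contains every column the query needs
EXPLAIN SELECT v FROM t_cover WHERE v > 10;

If the optimizer chooses idx_v as a covering index, the Extra column of the EXPLAIN output normally includes Using index. The plan actually chosen depends on the data and the server version.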

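There are additional write costs to consider when using a secondary index on a virtual column, because of the computation performed when materializing virtual column values in secondary index records during INSERT and UPDATE operations. Even with the additional write costs, secondary indexes on virtual columns may be preferable to stored generated columns, which are materialized in the clustered index and therefore produce larger tables that require more disk space and memory. If no secondary index is defined on a virtual column, there are additional read costs, because virtual column values must be computed each time the column's row is examined.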

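Values of an indexed virtual column are MVCC-logged to avoid unnecessary recomputation of generated column values during rollback or during a purge operation. The data length of logged values is limited by the index key limit of 767 bytes for the COMPACT and REDUNDANT row formats, and 3072 bytes for the DYNAMIC and COMPRESSED row formats.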

Adding or dropping a secondary index on a virtual column is an in-place operation.
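
The following sketch requests the in-place algorithm explicitly when adding and dropping such an index. The table t_inplace and the index name idx_v are hypothetical:

CREATE TABLE t_inplace (
  a INT,
  v INT GENERATED ALWAYS AS (a * 2) VIRTUAL
) ENGINE=InnoDB;

-- Request an in-place index build on the virtual column
ALTER TABLE t_inplace ADD INDEX idx_v (v), ALGORITHM=INPLACE;

-- Request an in-place drop of the same index
ALTER TABLE t_inplace DROP INDEX idx_v, ALGORITHM=INPLACE;

Specifying ALGORITHM=INPLACE is optional. Its use here is that the statement fails with an error, rather than silently falling back to a table copy, if the operation cannot be performed in place.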

Prior to MySQL 5.7.16, a foreign key constraint cannot reference a secondary index defined on a virtual generated column.

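In MySQL 5.7.13 and earlier, InnoDB does not permit defining a foreign key constraint with a cascading referential action on the base column of an indexed virtual generated column. This restriction is lifted in MySQL 5.7.14.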
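
To make the lifted restriction concrete, here is a sketch of the kind of definition it concerns. The tables fk_parent and fk_child are hypothetical, and the statements assume MySQL 5.7.14 or later:

CREATE TABLE fk_parent (
  id INT PRIMARY KEY
) ENGINE=InnoDB;

CREATE TABLE fk_child (
  parent_id INT,
  -- parent_id is the base column of the indexed virtual column v
  v INT GENERATED ALWAYS AS (parent_id + 1) VIRTUAL,
  INDEX idx_v (v),
  -- Cascading referential action defined on the base column
  FOREIGN KEY (parent_id) REFERENCES fk_parent (id) ON DELETE CASCADE
) ENGINE=InnoDB;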

Indexing a Generated Column to Provide a JSON Column Index

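As noted elsewhere, JSON columns cannot be indexed directly. To create an index that references such a column indirectly, you can define a generated column that extracts the information that should be indexed, then create an index on the generated column, as shown in this example: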

mysql> CREATE TABLE jemp (
    ->     c JSON,
    ->     g INT GENERATED ALWAYS AS (c->"$.id"),
    ->     INDEX i (g)
    -> );
Query OK, 0 rows affected (0.28 sec)

mysql> INSERT INTO jemp (c) VALUES
     >   ('{"id": "1", "name": "Fred"}'), ('{"id": "2", "name": "Wilma"}'),
     >   ('{"id": "3", "name": "Barney"}'), ('{"id": "4", "name": "Betty"}');
Query OK, 4 rows affected (0.04 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> SELECT c->>"$.name" AS name
     >     FROM jemp WHERE g > 2;
+--------+
| name   |
+--------+
| Barney |
| Betty  |
+--------+
2 rows in set (0.00 sec)

mysql> EXPLAIN SELECT c->>"$.name" AS name
     >    FROM jemp WHERE g > 2\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: jemp
   partitions: NULL
         type: range
possible_keys: i
          key: i
      key_len: 5
          ref: NULL
         rows: 2
     filtered: 100.00
        Extra: Using where
1 row in set, 1 warning (0.00 sec)

mysql> SHOW WARNINGS\G
*************************** 1. row ***************************
  Level: Note
   Code: 1003
Message: /* select#1 */ select json_unquote(json_extract(`test`.`jemp`.`c`,'$.name'))
AS `name` from `test`.`jemp` where (`test`.`jemp`.`g` > 2)
1 row in set (0.00 sec)

(We have wrapped the output from the last statement in this example to fit the viewing area.)

The -> operator is supported in MySQL 5.7.9 and later. The ->> operator is supported beginning with MySQL 5.7.13.

When you use EXPLAIN on a SELECT or other SQL statement containing one or more expressions that use the -> or ->> operator, these expressions are translated into their equivalents using JSON_EXTRACT() and (if needed) JSON_UNQUOTE() instead, as shown here in the output from SHOW WARNINGS immediately following this EXPLAIN statement:

mysql> EXPLAIN SELECT c->>"$.name"
     > FROM jemp WHERE g > 2 ORDER BY c->"$.name"\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: jemp
   partitions: NULL
         type: range
possible_keys: i
          key: i
      key_len: 5
          ref: NULL
         rows: 2
     filtered: 100.00
        Extra: Using where; Using filesort
1 row in set, 1 warning (0.00 sec)

mysql> SHOW WARNINGS\G
*************************** 1. row ***************************
  Level: Note
   Code: 1003
Message: /* select#1 */ select json_unquote(json_extract(`test`.`jemp`.`c`,'$.name')) AS
`c->>"$.name"` from `test`.`jemp` where (`test`.`jemp`.`g` > 2) order by
json_extract(`test`.`jemp`.`c`,'$.name')
1 row in set (0.00 sec)

See the descriptions of the -> and ->> operators, as well as those of the JSON_EXTRACT() and JSON_UNQUOTE() functions, for additional information and examples.
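
As a quick way to see the equivalence for yourself, the following query is a sketch that assumes the jemp table created earlier in this section (the column aliases are illustrative). It selects each operator form next to the function calls it is translated into:

SELECT c->"$.id"                               AS id_arrow,
       JSON_EXTRACT(c, "$.id")                 AS id_extract,
       c->>"$.name"                            AS name_arrow,
       JSON_UNQUOTE(JSON_EXTRACT(c, "$.name")) AS name_unquote
FROM jemp;

In each row, id_arrow should match id_extract, and name_arrow should match name_unquote.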

This technique also can be used to provide indexes that indirectly reference columns of other types that cannot be indexed directly, such as GEOMETRY columns.

JSON columns and indirect indexing in NDB Cluster

It is also possible to use indirect indexing of JSON columns in MySQL NDB Cluster, subject to the following conditions:

NDB handles a JSON column value internally as a BLOB. This means that any NDB table having one or more JSON columns must have a primary key, else it cannot be recorded in the binary log.
The NDB storage engine does not support indexing of virtual columns. Since the default for generated columns is VIRTUAL, you must specify explicitly the generated column to which to apply the indirect index as STORED.

The CREATE TABLE statement shown here creates the table jempn, a version of the jemp table shown previously, with modifications that make it compatible with NDB:

CREATE TABLE jempn (
  a BIGINT(20) NOT NULL AUTO_INCREMENT PRIMARY KEY,
  c JSON DEFAULT NULL,
  g INT GENERATED ALWAYS AS (c->"$.id") STORED,
  INDEX i (g)
) ENGINE=NDB;

We can populate this table using the following INSERT statement:

INSERT INTO jempn (a, c) VALUES
  (NULL, '{"id": "1", "name": "Fred"}'),
  (NULL, '{"id": "2", "name": "Wilma"}'),
  (NULL, '{"id": "3", "name": "Barney"}'),
  (NULL, '{"id": "4", "name": "Betty"}');

Now NDB can use index i, as shown here:

mysql> EXPLAIN SELECT c->>"$.name" AS name
     >     FROM jempn WHERE g > 2\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: jempn
   partitions: p0,p1
         type: range
possible_keys: i
          key: i
      key_len: 5
          ref: NULL
         rows: 3
     filtered: 100.00
        Extra: Using where with pushed condition (`test`.`jempn`.`g` > 2)
1 row in set, 1 warning (0.00 sec)

mysql> SHOW WARNINGS\G
*************************** 1. row ***************************
  Level: Note
   Code: 1003
Message: /* select#1 */ select
json_unquote(json_extract(`test`.`jempn`.`c`,'$.name')) AS `name` from
`test`.`jempn` where (`test`.`jempn`.`g` > 2)
1 row in set (0.00 sec)

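You should keep in mind that a stored generated column, as well as any index on such a column, uses DataMemory. In NDB 7.5, an index on a stored generated column also uses IndexMemory.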