Documentation Home
MySQL 8.0 参考手册  / 第 6 章 安全  / 6.5 MySQL 企业数据屏蔽和去标识化  /  6.5.5 MySQL企业数据脱敏和去标识化功能说明

6.5.5 MySQL企业数据脱敏和去标识化功能说明

MySQL Enterprise Data Masking and De-Identification 插件库包括几个函数,可以分为以下几类:

这些函数将字符串参数视为二进制字符串(这意味着它们不区分字母大小写),并且字符串返回值是二进制字符串。如果一个字符串返回值应该在不同的字符集中,转换它。以下示例显示如何将 的结果转换为 gen_rnd_email()字符 utf8mb4集:

SET @email = CONVERT(gen_rnd_email() USING utf8mb4);

可能还需要转换字符串参数,如 使用掩码数据进行客户识别中所示。

如果从 MySQL客户端中调用 MySQL 企业数据屏蔽和去标识化功能,二进制字符串结果将使用十六进制表示法显示,具体取决于 --binary-as-hex. 有关该选项的更多信息,请参阅第 4.5.1 节,“mysql — MySQL 命令行客户端”

数据屏蔽功能

本节中的每个函数对其字符串参数执行屏蔽操作并返回屏蔽结果。

  • mask_inner(str, margin1, margin2 [, mask_char])

    屏蔽字符串的内部部分,保留两端不变,并返回结果。可以指定一个可选的屏蔽字符。

    参数:

    • str: 要屏蔽的字符串。

    • margin1:一个非负整数,指定字符串左端要保持未屏蔽的字符数。如果该值为 0,则没有左端字符保持未屏蔽状态。

    • margin2:一个非负整数,指定字符串右端要保持未屏蔽的字符数。如果该值为 0,则没有右端字符保持未屏蔽状态。

    • mask_char:(可选)用于屏蔽的单个字符。'X'如果 mask_char没有给出 默认值 。

      屏蔽字符必须是单字节字符。尝试使用多字节字符会产生错误。

    返回值:

    被屏蔽的字符串,或者NULL任何一个边距是否为负。

    如果 margin 值的总和大于参数长度,则不会发生屏蔽并且参数将原封不动地返回。

    例子:

    mysql> SELECT mask_inner('abcdef', 1, 2), mask_inner('abcdef',0, 5);
    +----------------------------+---------------------------+
    | mask_inner('abcdef', 1, 2) | mask_inner('abcdef',0, 5) |
    +----------------------------+---------------------------+
    | aXXXef                     | Xbcdef                    |
    +----------------------------+---------------------------+
    mysql> SELECT mask_inner('abcdef', 1, 2, '*'), mask_inner('abcdef',0, 5, '#');
    +---------------------------------+--------------------------------+
    | mask_inner('abcdef', 1, 2, '*') | mask_inner('abcdef',0, 5, '#') |
    +---------------------------------+--------------------------------+
    | a***ef                          | #bcdef                         |
    +---------------------------------+--------------------------------+
  • mask_outer(str, margin1, margin2 [, mask_char])

    屏蔽字符串的左右两端,不屏蔽内部,并返回结果。可以指定一个可选的屏蔽字符。

    参数:

    • str: 要屏蔽的字符串。

    • margin1:一个非负整数,指定要屏蔽的字符串左端的字符数。如果该值为 0,则不屏蔽左端字符。

    • margin2:一个非负整数,指定要屏蔽的字符串右端的字符数。如果该值为 0,则不屏蔽任何右端字符。

    • mask_char:(可选)用于屏蔽的单个字符。'X'如果 mask_char没有给出 默认值 。

      屏蔽字符必须是单字节字符。尝试使用多字节字符会产生错误。

    返回值:

    被屏蔽的字符串,或者NULL任何一个边距是否为负。

    如果边距值的总和大于参数长度,则整个参数都被屏蔽。

    例子:

    mysql> SELECT mask_outer('abcdef', 1, 2), mask_outer('abcdef',0, 5);
    +----------------------------+---------------------------+
    | mask_outer('abcdef', 1, 2) | mask_outer('abcdef',0, 5) |
    +----------------------------+---------------------------+
    | XbcdXX                     | aXXXXX                    |
    +----------------------------+---------------------------+
    mysql> SELECT mask_outer('abcdef', 1, 2, '*'), mask_outer('abcdef',0, 5, '#');
    +---------------------------------+--------------------------------+
    | mask_outer('abcdef', 1, 2, '*') | mask_outer('abcdef',0, 5, '#') |
    +---------------------------------+--------------------------------+
    | *bcd**                          | a#####                         |
    +---------------------------------+--------------------------------+
  • mask_pan(str)

    屏蔽支付卡主帐号并返回除最后四位以外的所有数字都被 'X'字符替换的号码。

    参数:

    • str: 要屏蔽的字符串。该字符串的长度必须适合主帐号,但不进行其他检查。

    返回值:

    作为字符串的屏蔽支付号码。如果参数比要求的短,则原样返回。

    例子:

    mysql> SELECT mask_pan(gen_rnd_pan());
    +-------------------------+
    | mask_pan(gen_rnd_pan()) |
    +-------------------------+
    | XXXXXXXXXXXX9102        |
    +-------------------------+
    mysql> SELECT mask_pan(gen_rnd_pan(19));
    +---------------------------+
    | mask_pan(gen_rnd_pan(19)) |
    +---------------------------+
    | XXXXXXXXXXXXXXX8268       |
    +---------------------------+
    mysql> SELECT mask_pan('a*Z');
    +-----------------+
    | mask_pan('a*Z') |
    +-----------------+
    | a*Z             |
    +-----------------+
  • mask_pan_relaxed(str)

    屏蔽支付卡主帐号并返回除前六位和后四位以外的所有数字都替换为'X'字符的号码。前六位数字表示支付卡发行机构。

    参数:

    • str: 要屏蔽的字符串。该字符串的长度必须适合主帐号,但不进行其他检查。

    返回值:

    作为字符串的屏蔽支付号码。如果参数比要求的短,则原样返回。

    例子:

    mysql> SELECT mask_pan_relaxed(gen_rnd_pan());
    +---------------------------------+
    | mask_pan_relaxed(gen_rnd_pan()) |
    +---------------------------------+
    | 551279XXXXXX3108                |
    +---------------------------------+
    mysql> SELECT mask_pan_relaxed(gen_rnd_pan(19));
    +-----------------------------------+
    | mask_pan_relaxed(gen_rnd_pan(19)) |
    +-----------------------------------+
    | 462634XXXXXXXXX6739               |
    +-----------------------------------+
    mysql> SELECT mask_pan_relaxed('a*Z');
    +-------------------------+
    | mask_pan_relaxed('a*Z') |
    +-------------------------+
    | a*Z                     |
    +-------------------------+
  • mask_ssn(str)

    屏蔽美国社会安全号码并返回除最后四位以外的所有数字都被 'X'字符替换的号码。

    参数:

    • str: 要屏蔽的字符串。该字符串的长度必须为 11 个字符,但不进行其他检查。

    返回值:

    屏蔽的社会安全号码作为字符串,或者 NULL如果参数的长度不正确。

    例子:

    mysql> SELECT mask_ssn('909-63-6922'), mask_ssn('abcdefghijk');
    +-------------------------+-------------------------+
    | mask_ssn('909-63-6922') | mask_ssn('abcdefghijk') |
    +-------------------------+-------------------------+
    | XXX-XX-6922             | XXX-XX-hijk             |
    +-------------------------+-------------------------+
    mysql> SELECT mask_ssn('909');
    +-----------------+
    | mask_ssn('909') |
    +-----------------+
    | NULL            |
    +-----------------+

随机数据生成函数

本节中的函数为不同类型的数据生成随机值。如果可能,生成的值具有保留用于演示或测试值的特征,以避免将它们误认为合法数据。例如, gen_rnd_us_phone()返回一个使用555区号的美国电话号码,在实际使用中并没有分配给电话号码。个别功能描述描述了此原则的任何例外情况。

  • gen_range(lower, upper)

    生成从指定范围中选择的随机数。

    参数:

    • lower: 一个整数,指定范围的下边界。

    • upper:一个整数,指定范围的上限,不能小于下限。

    返回值:

    范围从 lower到 的随机整数upper(含),或者 NULL如果 upper参数小于 lower.

    例子:

    mysql> SELECT gen_range(100, 200), gen_range(-1000, -800);
    +---------------------+------------------------+
    | gen_range(100, 200) | gen_range(-1000, -800) |
    +---------------------+------------------------+
    |                 177 |                   -917 |
    +---------------------+------------------------+
    mysql> SELECT gen_range(1, 0);
    +-----------------+
    | gen_range(1, 0) |
    +-----------------+
    |            NULL |
    +-----------------+
  • gen_rnd_email()

    example.com在域中 生成一个随机电子邮件地址 。

    参数:

    没有任何。

    返回值:

    作为字符串的随机电子邮件地址。

    例子:

    mysql> SELECT gen_rnd_email();
    +---------------------------+
    | gen_rnd_email()           |
    +---------------------------+
    | ijocv.mwvhhuf@example.com |
    +---------------------------+
  • gen_rnd_pan([size])

    生成随机支付卡主帐号。该号码通过了 Luhn 校验(一种针对校验位执行校验和验证的算法)。

    警告

    从返回的值 gen_rnd_pan()应仅用于测试目的,不适合发布。无法保证给定的返回值不会分配给合法的支付账户。如果有必要发布 gen_rnd_pan()结果,请考虑使用 mask_pan()或 对其进行屏蔽mask_pan_relaxed()

    参数:

    • size:(可选)指定结果大小的整数。size如果未给出,则默认值为 16 。如果给定,size则必须是 12 到 19 范围内的整数。

    返回值:

    A random payment number as a string, or NULL if a size argument outside the permitted range is given.

    Example:

    mysql> SELECT mask_pan(gen_rnd_pan());
    +-------------------------+
    | mask_pan(gen_rnd_pan()) |
    +-------------------------+
    | XXXXXXXXXXXX5805        |
    +-------------------------+
    mysql> SELECT mask_pan(gen_rnd_pan(19));
    +---------------------------+
    | mask_pan(gen_rnd_pan(19)) |
    +---------------------------+
    | XXXXXXXXXXXXXXX5067       |
    +---------------------------+
    mysql> SELECT mask_pan_relaxed(gen_rnd_pan());
    +---------------------------------+
    | mask_pan_relaxed(gen_rnd_pan()) |
    +---------------------------------+
    | 398403XXXXXX9547                |
    +---------------------------------+
    mysql> SELECT mask_pan_relaxed(gen_rnd_pan(19));
    +-----------------------------------+
    | mask_pan_relaxed(gen_rnd_pan(19)) |
    +-----------------------------------+
    | 578416XXXXXXXXX6509               |
    +-----------------------------------+
    mysql> SELECT gen_rnd_pan(11), gen_rnd_pan(20);
    +-----------------+-----------------+
    | gen_rnd_pan(11) | gen_rnd_pan(20) |
    +-----------------+-----------------+
    | NULL            | NULL            |
    +-----------------+-----------------+
  • gen_rnd_ssn()

    Generates a random US Social Security number in AAA-BB-CCCC format. The AAA part is greater than 900 and the BB part is less than 70, which are characteristics not used for legitimate Social Security numbers.

    Arguments:

    None.

    Return value:

    A random Social Security number as a string.

    Example:

    mysql> SELECT gen_rnd_ssn();
    +---------------+
    | gen_rnd_ssn() |
    +---------------+
    | 951-26-0058   |
    +---------------+
  • gen_rnd_us_phone()

    Generates a random US phone number in 1-555-AAA-BBBB format. The 555 area code is not used for legitimate phone numbers.

    Arguments:

    None.

    Return value:

    A random US phone number as a string.

    Example:

    mysql> SELECT gen_rnd_us_phone();
    +--------------------+
    | gen_rnd_us_phone() |
    +--------------------+
    | 1-555-682-5423     |
    +--------------------+

Random Data Dictionary-Based Functions

The functions in this section manipulate dictionaries of terms and perform generation and masking operations based on them. Some of these functions require the SUPER privilege.

When a dictionary is loaded, it becomes part of the dictionary registry and is assigned a name to be used by other dictionary functions. Dictionaries are loaded from plain text files containing one term per line. Empty lines are ignored. To be valid, a dictionary file must contain at least one nonempty line.

  • gen_blacklist(str, dictionary_name, replacement_dictionary_name)

    Replaces a term present in one dictionary with a term from a second dictionary and returns the replacement term. This masks the original term by substitution.

    Arguments:

    • str: A string that indicates the term to replace.

    • dictionary_name: A string that names the dictionary containing the term to replace.

    • replacement_dictionary_name: A string that names the dictionary from which to choose the replacement term.

    Return value:

    A string randomly chosen from replacement_dictionary_name as a replacement for str, or str if it does not appear in dictionary_name, or NULL if either dictionary name is not in the dictionary registry.

    If the term to replace appears in both dictionaries, it is possible for the return value to be the same term.

    Example:

    mysql> SELECT gen_blacklist('Berlin', 'DE_Cities', 'US_Cities');
    +---------------------------------------------------+
    | gen_blacklist('Berlin', 'DE_Cities', 'US_Cities') |
    +---------------------------------------------------+
    | Phoenix                                           |
    +---------------------------------------------------+
  • gen_dictionary(dictionary_name)

    Returns a random term from a dictionary.

    Arguments:

    • dictionary_name: A string that names the dictionary from which to choose the term.

    Return value:

    A random term from the dictionary as a string, or NULL if the dictionary name is not in the dictionary registry.

    Example:

    mysql> SELECT gen_dictionary('mydict');
    +--------------------------+
    | gen_dictionary('mydict') |
    +--------------------------+
    | My term                  |
    +--------------------------+
    mysql> SELECT gen_dictionary('no-such-dict');
    +--------------------------------+
    | gen_dictionary('no-such-dict') |
    +--------------------------------+
    | NULL                           |
    +--------------------------------+
  • gen_dictionary_drop(dictionary_name)

    Removes a dictionary from the dictionary registry.

    This function requires the SUPER privilege.

    Arguments:

    • dictionary_name: A string that names the dictionary to remove from the dictionary registry.

    Return value:

    A string that indicates whether the drop operation succeeded. Dictionary removed indicates success. Dictionary removal error indicates failure.

    Example:

    mysql> SELECT gen_dictionary_drop('mydict');
    +-------------------------------+
    | gen_dictionary_drop('mydict') |
    +-------------------------------+
    | Dictionary removed            |
    +-------------------------------+
    mysql> SELECT gen_dictionary_drop('no-such-dict');
    +-------------------------------------+
    | gen_dictionary_drop('no-such-dict') |
    +-------------------------------------+
    | Dictionary removal error            |
    +-------------------------------------+
  • gen_dictionary_load(dictionary_path, dictionary_name)

    Loads a file into the dictionary registry and assigns the dictionary a name to be used with other functions that require a dictionary name argument.

    This function requires the SUPER privilege.

    Important

    Dictionaries are not persistent. Any dictionary used by applications must be loaded for each server startup.

    Once loaded into the registry, a dictionary is used as is, even if the underlying dictionary file changes. To reload a dictionary, first drop it with gen_dictionary_drop(), then load it again with gen_dictionary_load().

    Arguments:

    • dictionary_path: A string that specifies the path name of the dictionary file.

    • dictionary_name:为字典提供名称的字符串。

    返回值:

    指示加载操作是否成功的字符串。Dictionary load success 表示成功。Dictionary load error 表示失败。字典加载失败的原因有多种,包括:

    • 已加载具有给定名称的字典。

    • 找不到词典文件。

    • 词典文件不包含术语。

    • 设置了secure_file_priv 系统变量,字典文件不在变量命名的目录中。

    例子:

    mysql> SELECT gen_dictionary_load('/usr/local/mysql/mysql-files/mydict','mydict');
    +---------------------------------------------------------------------+
    | gen_dictionary_load('/usr/local/mysql/mysql-files/mydict','mydict') |
    +---------------------------------------------------------------------+
    | Dictionary load success                                             |
    +---------------------------------------------------------------------+
    mysql> SELECT gen_dictionary_load('/dev/null','null');
    +-----------------------------------------+
    | gen_dictionary_load('/dev/null','null') |
    +-----------------------------------------+
    | Dictionary load error                   |
    +-----------------------------------------+