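HeatWave Advisor Auto Encoding, which recommends string column encodings, now provides encoding recommendations that optimize query performance. Recommendations are based on a performance model that uses query execution data. Previously, string column encoding recommendations were optimized only for cluster memory usage. An estimated performance gain is provided with the string column encoding recommendations. (Bug #34145862)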
HeatWave ML models can now be trained on tables that contain DATE, TIME, DATETIME, TIMESTAMP, and YEAR data types. (Bug #33895503)
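HeatWave ML now generates a model explanation when you train a machine learning model. Model explanations help identify the features that are most important to the model. For more information, see The Model Catalog.
The following columns were added to the MODEL_CATALOG table:
- column_names: The feature columns used to train the model.
- last_accessed: The last time the model was accessed. HeatWave ML routines update this value to the current timestamp when accessing the model.
- model_explanation: The model explanation generated during training.
- model_type: The type of model (algorithm) that ML_TRAIN selected to build the model.
- task: The task type specified in the ML_TRAIN query (classification or regression).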
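As a rough illustration, the new catalog columns could be inspected with a query like the one below. This is a minimal sketch, not a definitive recipe: the schema name ML_SCHEMA_user1 is a placeholder assumption for the schema that holds your model catalog, and only the columns named above are selected.

```sql
-- Assumption: ML_SCHEMA_user1 is a placeholder for the schema holding MODEL_CATALOG.
-- Lists the most recently accessed models first, with the columns added in this release.
SELECT model_type,
       task,
       column_names,
       last_accessed,
       model_explanation
FROM ML_SCHEMA_user1.MODEL_CATALOG
ORDER BY last_accessed DESC;
```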
ML_PREDICT_* and ML_EXPLAIN_* routine performance was improved, resulting in faster prediction and explanation processing.
The following HeatWave ML enhancements were implemented:
- ML_TRAIN options for advanced users. These options let users customize aspects of the ML training pipeline, including algorithm selection, feature selection, and hyperparameter optimization.
  - The model_list option specifies the model types to be trained.
  - The exclude_model_list option specifies model types to exclude from consideration during model selection.
  - The optimization_metric option specifies the scoring metric to optimize for when training a machine learning model.
  - The exclude_column_list option specifies feature columns to exclude from consideration when training a machine learning model.
  For more information, see Advanced ML_TRAIN Options.
- Support was added for Support Vector Machine (SVC and LinearSVC) classification and regression models. For a complete list of supported model types, see Model Types.
- The ML_TRAIN routine now reports a message if a trained model does not meet expected quality criteria.
- The ML_EXPLAIN_ROW and ML_EXPLAIN_TABLE routines now provide information to help interpret explanations. The routines also report a warning when a model quality issue is detected, so that users can revisit their data to improve model quality.
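The advanced ML_TRAIN options might be combined as in the following sketch. The table name ml_data.census_train, the target column revenue, the excluded column customer_id, and the metric name f1 are illustrative assumptions; check Advanced ML_TRAIN Options and Model Types for the values your release accepts.

```sql
-- Hypothetical training table and target column; replace with your own.
CALL sys.ML_TRAIN(
  'ml_data.census_train',
  'revenue',
  JSON_OBJECT(
    'task', 'classification',
    -- Skip the Support Vector Machine models during model selection.
    'exclude_model_list', JSON_ARRAY('SVC', 'LinearSVC'),
    -- Assumed metric name; optimize for F1 instead of the default metric.
    'optimization_metric', 'f1',
    -- Hypothetical identifier column that should not be used as a feature.
    'exclude_column_list', JSON_ARRAY('customer_id')
  ),
  @model
);
```

The sketch uses exclude_model_list rather than model_list, which leaves model selection to ML_TRAIN while ruling out specific algorithms.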
The amount of heap memory allocated on the MySQL node for each table loaded into HeatWave was reduced, increasing the maximum number of tables that can be loaded. For MySQL.HeatWave.VM.E3.Standard shapes, the maximum was raised from 100k tables to 400k tables. For MySQL.HeatWave.BM.E3.Standard shapes, the maximum was raised from 400k tables to 1600k tables. The actual number of tables that can be loaded depends on the data in each table. (Bug #33951708)
The performance_schema.rpd_column_id table was modified to remove redundant data. The NAME, SCHEMA_NAME, and TABLE_NAME columns were removed, and a TABLE_ID column was added. (Bug #33899183)
Support was added for the FROM_DAYS() date and time function, and the GREATEST() and LEAST() comparison functions now support DATE, DATETIME, TIME, and TIMESTAMP columns.
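A query that exercises the newly supported date functions could look like the sketch below. The orders table and its columns are hypothetical and exist only for illustration.

```sql
-- Hypothetical table:
--   orders(order_id INT, day_number INT, ordered_at DATETIME, shipped_at DATETIME)
SELECT order_id,
       FROM_DAYS(day_number)            AS calendar_date,   -- day number to DATE
       GREATEST(ordered_at, shipped_at) AS latest_event,    -- later of the two DATETIME values
       LEAST(ordered_at, shipped_at)    AS earliest_event   -- earlier of the two DATETIME values
FROM orders;
```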
Support was added for built-in server-side data masking and de-identification to help protect sensitive data from unauthorized uses by hiding and replacing real values with substitutes. Data masking and de-identification operations are performed on the server, and queries involving data masking and de-identification functions are accelerated by HeatWave. The following data masking and de-identification functions are supported:
See Data Masking and De-Identification Functions.
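Optimizations were implemented to improve performance for JOIN and GROUP BY queries with execution plans that involve multiple rounds of consecutive data partitioning.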
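expr IN (value,...) comparisons, where the expression is a single value and the compared values are constants of the same data type and encoding, were optimized. For example, the following IN() comparison is optimized:
SELECT * FROM Customers WHERE Country IN ('Germany', 'France', 'Spain');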