8.9.1 Fencing Clusters in an InnoDB ClusterSet

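After an emergency failover, when there is a risk that the transaction sets differ between parts of the ClusterSet, you must fence the cluster from either write traffic or all traffic.

If a network partition occurs, a split-brain situation is possible, in which instances lose synchronization and cannot communicate correctly to determine the synchronization state. A split-brain can occur, for example, when a DBA forcibly elects a replica cluster to become the primary cluster, which creates more than one primary.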

In this situation, a DBA can choose to fence the original primary cluster from:

  • Writes.

  • All traffic.

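Three fencing operations are available:

  • <Cluster>.fenceWrites(): Stops write traffic to the primary cluster of a ClusterSet. Replica clusters do not accept writes, so this operation has no effect on them.

    From version 8.0.31, you can use this operation on INVALIDATED replica clusters. In addition, if it is run against a replica cluster that has super_read_only disabled, it enables super_read_only.

  • <Cluster>.unfenceWrites(): Resumes write traffic. This operation can be run on a cluster that was previously fenced from write traffic using the <Cluster>.fenceWrites() operation.

    You cannot use <Cluster>.unfenceWrites() on a replica cluster.

  • <Cluster>.fenceAllTraffic(): Fences a cluster from all traffic. If you have fenced a cluster from all traffic using <Cluster>.fenceAllTraffic(), you must reboot the cluster using the MySQL Shell command dba.rebootClusterFromCompleteOutage().

    For more information on dba.rebootClusterFromCompleteOutage(), see Section 7.8.3, "Rebooting a Cluster from a Major Outage".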
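
The pairing between each fencing operation and the operation that reverses it, as listed above, can be summarized in a small lookup table. This is only an illustrative sketch in plain JavaScript: the names FENCING_OPERATIONS and howToUnfence are hypothetical and are not part of AdminAPI.

```javascript
// Which traffic each fencing operation blocks, and how it is reversed.
// FENCING_OPERATIONS and howToUnfence are illustrative names, not AdminAPI.
const FENCING_OPERATIONS = {
  fenceWrites: {
    blocks: "write traffic",
    reversedBy: "<Cluster>.unfenceWrites()",
  },
  fenceAllTraffic: {
    blocks: "all traffic",
    reversedBy: "dba.rebootClusterFromCompleteOutage()",
  },
};

// Return the operation that reverses the given fencing operation.
function howToUnfence(operation) {
  // Own-property check so inherited keys such as "toString" are rejected.
  if (!Object.prototype.hasOwnProperty.call(FENCING_OPERATIONS, operation)) {
    throw new Error(`Unknown fencing operation: ${operation}`);
  }
  return FENCING_OPERATIONS[operation].reversedBy;
}

console.log(howToUnfence("fenceAllTraffic"));
```

The table makes the asymmetry explicit: fencing from writes has a direct counterpart, while fencing from all traffic is undone only by rebooting the cluster.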

fenceWrites()

Issuing <Cluster>.fenceWrites() on a replica cluster returns an error:

ERROR: Unable to fence Cluster from write traffic: operation not permitted on REPLICA Clusters
Cluster.fenceWrites: The Cluster '<Cluster>' is a REPLICA Cluster of the ClusterSet '<ClusterSet>' (MYSQLSH 51616)
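
A script that fences several clusters may want to report the replica-cluster error rather than stop on it. The following is a minimal sketch, assuming the failure surfaces as a JavaScript exception whose message contains the text shown above. The helper tryFenceWrites and the stand-in object fakeReplica are hypothetical, and the cluster names 'replica' and 'cs' are placeholders. Matching on message text is brittle and may need adapting.

```javascript
// Attempt to fence a cluster from writes and report the outcome instead of
// letting the replica-cluster error propagate.
// `cluster` is any object that exposes a fenceWrites() method.
function tryFenceWrites(cluster) {
  try {
    cluster.fenceWrites();
    return { fenced: true, reason: null };
  } catch (err) {
    const message = String(err && err.message ? err.message : err);
    // Assumption: the error message contains the documented text.
    if (message.includes("REPLICA Cluster")) {
      return { fenced: false, reason: "replica-cluster" };
    }
    throw err; // Unrecognized failure: leave it to the caller.
  }
}

// Stand-in that mimics the documented error so the helper can be tried
// outside MySQL Shell. It is not an AdminAPI object.
const fakeReplica = {
  fenceWrites() {
    throw new Error(
      "Cluster.fenceWrites: The Cluster 'replica' is a REPLICA Cluster " +
        "of the ClusterSet 'cs' (MYSQLSH 51616)"
    );
  },
};

console.log(tryFenceWrites(fakeReplica));
```

Unrecognized errors are rethrown on purpose, so that a genuine failure to fence a primary cluster is never silently ignored.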

Although fencing is primarily used on clusters that belong to a ClusterSet, you can also fence standalone clusters using <Cluster>.fenceAllTraffic().

  1. To fence a primary cluster from write traffic, use the Cluster.fenceWrites command as follows:

    <Cluster>.fenceWrites()

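    After running the command:

    • Automatic super_read_only management is disabled on the cluster.

    • super_read_only is enabled on all the instances in the cluster.

    • All applications are blocked from performing writes on the cluster.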

    cluster.fenceWrites()
    The Cluster 'primary' will be fenced from write traffic

    * Disabling automatic super_read_only management on the Cluster...
    * Enabling super_read_only on '127.0.0.1:3311'...
    * Enabling super_read_only on '127.0.0.1:3312'...
    * Enabling super_read_only on '127.0.0.1:3313'...

    NOTE: Applications will now be blocked from performing writes on Cluster 'primary'.
    Use <Cluster>.unfenceWrites() to resume writes if you are certain a split-brain is not in effect.

    Cluster successfully fenced from write traffic

  2. To check that you have fenced a primary cluster from write traffic, use the <ClusterSet>.status() command as follows:

    <ClusterSet>.status()

    The output is as follows:

    clusterset.status()
    {
        "clusters": {
            "primary": {
                "clusterErrors": [
                    "WARNING: Cluster is fenced from Write traffic. Use cluster.unfenceWrites() to unfence the Cluster."
                ],
                "clusterRole": "PRIMARY",
                "globalStatus": "OK_FENCED_WRITES",
                "primary": null,
                "status": "FENCED_WRITES",
                "statusText": "Cluster is fenced from Write Traffic."
            },
            "replica": {
                "clusterRole": "REPLICA",
                "clusterSetReplicationStatus": "OK",
                "globalStatus": "OK"
            }
        },
        "domainName": "primary",
        "globalPrimaryInstance": null,
        "primaryCluster": "primary",
        "status": "UNAVAILABLE",
        "statusText": "Primary Cluster is fenced from write traffic."
    }

  3. To unfence a cluster and resume write traffic to a primary cluster, use the Cluster.unfenceWrites command as follows:

    <Cluster>.unfenceWrites()

    Automatic super_read_only management is re-enabled on the primary cluster, and super_read_only is disabled on the primary instance of the cluster.

    cluster.unfenceWrites()
    The Cluster 'primary' will be unfenced from write traffic

    * Enabling automatic super_read_only management on the Cluster...
    * Disabling super_read_only on the primary '127.0.0.1:3311'...

    Cluster successfully unfenced from write traffic

  4. To fence a cluster from all traffic, use the Cluster.fenceAllTraffic command as follows:

    <Cluster>.fenceAllTraffic()

    super_read_only is enabled on the primary instance of the cluster before offline_mode is enabled on all the instances in the cluster:

    cluster.fenceAllTraffic()
    The Cluster 'primary' will be fenced from all traffic

    * Enabling super_read_only on the primary '127.0.0.1:3311'...
    * Enabling offline_mode on the primary '127.0.0.1:3311'...
    * Enabling offline_mode on '127.0.0.1:3312'...
    * Stopping Group Replication on '127.0.0.1:3312'...
    * Enabling offline_mode on '127.0.0.1:3313'...
    * Stopping Group Replication on '127.0.0.1:3313'...
    * Stopping Group Replication on the primary '127.0.0.1:3311'...

    Cluster successfully fenced from all traffic

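  5. To unfence a cluster from all traffic, use the dba.rebootClusterFromCompleteOutage() MySQL Shell command. After the cluster is restored, you are asked whether you want to rejoin each instance to the cluster. Select Y to rejoin the instance: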

    cluster = dba.rebootClusterFromCompleteOutage()
    Restoring the cluster 'primary' from complete outage...

    The instance '127.0.0.1:3312' was part of the cluster configuration.
    Would you like to rejoin it to the cluster? [y/N]: Y

    The instance '127.0.0.1:3313' was part of the cluster configuration.
    Would you like to rejoin it to the cluster? [y/N]: Y

    * Waiting for seed instance to become ONLINE...
    127.0.0.1:3311 was restored.
    Rejoining '127.0.0.1:3312' to the cluster.
    Rejoining instance '127.0.0.1:3312' to cluster 'primary'...

    The instance '127.0.0.1:3312' was successfully rejoined to the cluster.

    Rejoining '127.0.0.1:3313' to the cluster.
    Rejoining instance '127.0.0.1:3313' to cluster 'primary'...

    The instance '127.0.0.1:3313' was successfully rejoined to the cluster.

    The cluster was successfully rebooted.

    <Cluster:primary>
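
The verification in step 2 can also be done programmatically by inspecting the status object rather than reading it by eye. The following is a minimal sketch, assuming the status is available as a plain JavaScript object with the same shape as the output in step 2. The function name clustersFencedFromWrites is hypothetical, and sampleStatus is an abbreviated copy of that output.

```javascript
// Return the names of the clusters that a ClusterSet status object reports
// as fenced from write traffic, based on the fields shown in step 2.
function clustersFencedFromWrites(status) {
  const clusters = (status && status.clusters) || {};
  return Object.keys(clusters).filter((name) => {
    const entry = clusters[name] || {};
    return (
      entry.status === "FENCED_WRITES" ||
      entry.globalStatus === "OK_FENCED_WRITES"
    );
  });
}

// Abbreviated copy of the status output from step 2.
const sampleStatus = {
  clusters: {
    primary: {
      clusterRole: "PRIMARY",
      globalStatus: "OK_FENCED_WRITES",
      primary: null,
      status: "FENCED_WRITES",
    },
    replica: {
      clusterRole: "REPLICA",
      clusterSetReplicationStatus: "OK",
      globalStatus: "OK",
    },
  },
  primaryCluster: "primary",
  status: "UNAVAILABLE",
};

// Logs an array that contains only "primary".
console.log(clustersFencedFromWrites(sampleStatus));
```

After step 3, the same check would be expected to return an empty array, assuming the fenced status values are no longer reported once writes are unfenced.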