8.9.1 Fencing Clusters in an InnoDB ClusterSet


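After an emergency failover, when there is a risk that the transaction sets differ between parts of the ClusterSet, you must fence the cluster from either write traffic or all traffic.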

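If a network partition occurs, a split-brain situation is possible, in which instances lose synchronization and cannot communicate correctly to agree on the synchronization state. A split-brain can occur, for example, when a DBA forcibly elects a replica cluster to become the primary cluster, which leaves the ClusterSet with more than one primary.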

In this situation, a DBA can choose to fence the original primary cluster from:

Writes.
All traffic.

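Three fencing operations are available:

<Cluster>.fenceWrites(): stops write traffic to the primary cluster of a ClusterSet. Replica clusters do not accept writes, so this operation has no effect on them.

From 8.0.31, it can also be used on INVALIDATED replica clusters. In addition, if it is run against a replica cluster that has super_read_only disabled, it enables super_read_only.
<Cluster>.unfenceWrites(): resumes write traffic. This operation can be run on a cluster that was previously fenced from write traffic using <Cluster>.fenceWrites().

<Cluster>.unfenceWrites() cannot be used on a replica cluster.
<Cluster>.fenceAllTraffic(): fences a cluster from all traffic. If you have fenced a cluster from all traffic using <Cluster>.fenceAllTraffic(), you must reboot the cluster using the MySQL Shell command dba.rebootClusterFromCompleteOutage().

For more information on dba.rebootClusterFromCompleteOutage(), see Section 7.8.3, “Rebooting a Cluster from a Major Outage”.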
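
The pairing between each fencing operation and the operation that reverses it can be sketched as a small helper for MySQL Shell's JavaScript mode. This is only an illustrative sketch, not part of AdminAPI: the function name unfenceCluster and the kind labels "writes" and "all" are hypothetical, and the dba and cluster arguments are assumed to be the usual AdminAPI objects, passed in explicitly so that the helper can also be exercised with stand-in objects.

```javascript
// Hypothetical helper: reverse a fence according to how the cluster was fenced.
// `kind` is an illustrative label, not an AdminAPI value:
//   "writes" -> the cluster was fenced with <Cluster>.fenceWrites()
//   "all"    -> the cluster was fenced with <Cluster>.fenceAllTraffic()
function unfenceCluster(dba, cluster, kind) {
  if (kind === "writes") {
    // Write fencing is reversed on the same Cluster object.
    cluster.unfenceWrites();
    return cluster;
  }
  if (kind === "all") {
    // A cluster fenced from all traffic has to be rebooted; the reboot
    // call returns a Cluster handle for the restored cluster.
    return dba.rebootClusterFromCompleteOutage();
  }
  throw new Error("Unknown fence kind: " + kind);
}
```

In a MySQL Shell session this might be called as unfenceCluster(dba, cluster, "writes"), where cluster is the handle of the fenced cluster.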

fenceWrites()

Issuing <Cluster>.fenceWrites() on a replica cluster returns an error:

ERROR: Unable to fence Cluster from write traffic: 
operation not permitted on REPLICA Clusters
Cluster.fenceWrites: The Cluster '<Cluster>' is a REPLICA Cluster 
of the ClusterSet '<ClusterSet>' (MYSQLSH 51616)
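
To avoid running into this error, the role of a cluster can be checked before fencing is attempted. The following is a minimal sketch under stated assumptions: canFenceWrites is a hypothetical helper name, and its first argument is assumed to be a plain object with the shape reported by the ClusterSet status() command (a clusters map whose entries carry a clusterRole field). It deliberately ignores the 8.0.31 exception for INVALIDATED replica clusters.

```javascript
// Hypothetical pre-check: decide whether fenceWrites() is worth attempting
// for a named cluster, based on a ClusterSet status object.
function canFenceWrites(clusterSetStatus, clusterName) {
  var entry = (clusterSetStatus.clusters || {})[clusterName];
  if (!entry) {
    throw new Error("Cluster '" + clusterName + "' not found in status");
  }
  // Replica clusters do not accept writes, so only a PRIMARY cluster is
  // treated as a fencing target in this sketch.
  return entry.clusterRole === "PRIMARY";
}
```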

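Although fencing is used primarily on clusters that belong to a ClusterSet, it is also possible to fence standalone clusters using <Cluster>.fenceAllTraffic().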

To fence a primary cluster from write traffic, use the Cluster.fenceWrites command as follows:

        <Cluster>.fenceWrites()

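After running the command:

Automatic super_read_only management is disabled on the cluster.
super_read_only is enabled on all instances in the cluster.
All applications are blocked from performing writes on the cluster.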

        cluster.fenceWrites()
        The Cluster 'primary' will be fenced from write traffic

        * Disabling automatic super_read_only management on the Cluster...
        * Enabling super_read_only on '127.0.0.1:3311'...
        * Enabling super_read_only on '127.0.0.1:3312'...
        * Enabling super_read_only on '127.0.0.1:3313'...

        NOTE: Applications will now be blocked from performing writes on Cluster 'primary'.
        Use <Cluster>.unfenceWrites() to resume writes if you are certain a split-brain is not in effect.

        Cluster successfully fenced from write traffic

To check that you have fenced a primary cluster from write traffic, use the <ClusterSet>.status() command as follows:

      <ClusterSet>.status()

The output looks like this:

clusterset.status()
{
    "clusters": {
        "primary": {
            "clusterErrors": [
                "WARNING: Cluster is fenced from Write traffic. Use cluster.unfenceWrites() to unfence the Cluster."
            ],
            "clusterRole": "PRIMARY",
            "globalStatus": "OK_FENCED_WRITES",
            "primary": null,
            "status": "FENCED_WRITES",
            "statusText": "Cluster is fenced from Write Traffic."
        },
        "replica": {
            "clusterRole": "REPLICA",
            "clusterSetReplicationStatus": "OK",
            "globalStatus": "OK"
        }
    },
    "domainName": "primary",
    "globalPrimaryInstance": null,
    "primaryCluster": "primary",
    "status": "UNAVAILABLE",
    "statusText": "Primary Cluster is fenced from write traffic."
}

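The same check can be scripted instead of read by eye. The following is a minimal sketch, assuming the status is available as a plain object with the fields shown in the output above (primaryCluster, clusters, status, globalStatus); the helper name isPrimaryFencedFromWrites is hypothetical and not part of AdminAPI.

```javascript
// Hypothetical helper: report whether the primary cluster of a ClusterSet
// is fenced from write traffic, given a ClusterSet status object.
function isPrimaryFencedFromWrites(clusterSetStatus) {
  var primaryName = clusterSetStatus.primaryCluster;
  var entry = (clusterSetStatus.clusters || {})[primaryName];
  if (!entry) {
    return false; // no entry for the primary cluster: nothing to report
  }
  // Either field is enough to identify a write-fenced primary cluster.
  return entry.status === "FENCED_WRITES" ||
         entry.globalStatus === "OK_FENCED_WRITES";
}
```

In MySQL Shell's JavaScript mode this could presumably be called as isPrimaryFencedFromWrites(clusterset.status()), on the assumption that the returned status behaves like an ordinary JavaScript object.
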
To unfence a cluster and resume write traffic to a primary cluster, use the Cluster.unfenceWrites command as follows:

        <Cluster>.unfenceWrites()

Automatic super_read_only management is re-enabled on the primary cluster, and super_read_only is disabled on the cluster's primary instance:

        cluster.unfenceWrites()
        The Cluster 'primary' will be unfenced from write traffic

        * Enabling automatic super_read_only management on the Cluster...
        * Disabling super_read_only on the primary '127.0.0.1:3311'...

        Cluster successfully unfenced from write traffic

To fence a cluster from all traffic, use the Cluster.fenceAllTraffic command as follows:

      <Cluster>.fenceAllTraffic()

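super_read_only is first enabled on the cluster's primary instance, and then offline_mode is enabled on all instances in the cluster. As the output shows, Group Replication is also stopped on each instance: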

      cluster.fenceAllTraffic()
        The Cluster 'primary' will be fenced from all traffic

        * Enabling super_read_only on the primary '127.0.0.1:3311'...
        * Enabling offline_mode on the primary '127.0.0.1:3311'...
        * Enabling offline_mode on '127.0.0.1:3312'...
        * Stopping Group Replication on '127.0.0.1:3312'...
        * Enabling offline_mode on '127.0.0.1:3313'...
        * Stopping Group Replication on '127.0.0.1:3313'...
        * Stopping Group Replication on the primary '127.0.0.1:3311'...

        Cluster successfully fenced from all traffic

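To unfence a cluster from all traffic, use the dba.rebootClusterFromCompleteOutage() MySQL Shell command. While the cluster is being restored, you are asked whether you want to rejoin each instance to the cluster; select Y to rejoin it: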

cluster = dba.rebootClusterFromCompleteOutage()
		Restoring the cluster 'primary' from complete outage...

		The instance '127.0.0.1:3312' was part of the cluster configuration.
		Would you like to rejoin it to the cluster? [y/N]: Y

		The instance '127.0.0.1:3313' was part of the cluster configuration.
		Would you like to rejoin it to the cluster? [y/N]: Y

		* Waiting for seed instance to become ONLINE...
		127.0.0.1:3311 was restored.
		Rejoining '127.0.0.1:3312' to the cluster.
		Rejoining instance '127.0.0.1:3312' to cluster 'primary'...

		The instance '127.0.0.1:3312' was successfully rejoined to the cluster.

		Rejoining '127.0.0.1:3313' to the cluster.
		Rejoining instance '127.0.0.1:3313' to cluster 'primary'...

		The instance '127.0.0.1:3313' was successfully rejoined to the cluster.

		The cluster was successfully rebooted.

		<Cluster:primary>