SQL Server 2017 AlwaysOn failover cluster issue

Asked by a Database Administration user on 2021-11-10 17:48:53
1 answer, 598 views, 0 followers, 1 vote

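I have two Windows 2012 R2 VMs running on VMware ESXi 7, with VMDKs as the SQL disks.

I run a SQL Server 2017 AlwaysOn cluster on Server 2012 R2.

No backups (snapshots) are running.

PROD: 10.10.22.1 and 10.10.22.2
Heartbeat: 192.168.200.2 and 192.168.200.3
A file share witness is defined.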

CrossSubnetDelay          : 1000
CrossSubnetThreshold      : 20
PlumbAllCrossSubnetRoutes : 0
SameSubnetDelay           : 1000
SameSubnetThreshold       : 10

Cluster history properties (Get-Cluster piped to fl):

RouteHistoryLength : 20

Logs:

Cluster has missed two consecutive heartbeats for the local endpoint 10.10.22.1~3343~ connected to remote endpoint 10.10.22.2:~3343
Cluster has missed two consecutive heartbeats for the local endpoint 192.168.200.2:~3343~ connected to remote endpoint 192.168.200.3:~3343~.
Clustered role 'sqlcluster' is moving to cluster node 'host02'.

FailoverCluster log:

Cluster node 'host01' was removed from the active failover cluster membership. The Cluster service on this node may have stopped. This could also be due to the node having lost communication with other active nodes in the failover cluster. Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapters on this node. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.
Cluster node 'host02' was removed from the active failover cluster membership. The Cluster service on this node may have stopped. This could also be due to the node having lost communication with other active nodes in the failover cluster. Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapters on this node. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.
Cluster resource 'sqlcluster' of type 'SQL Server Availability Group' in clustered role 'sqlcluster' failed.
The Cluster service is shutting down because quorum was lost. This could be due to the loss of network connectivity between some or all nodes in the cluster, or a failover of the witness disk. 
Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapter. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.
Cluster resource 'File Share Witness' of type 'File Share Witness' in clustered role 'Cluster Group' failed.
Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it.  Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.

System log:

Unable to update the IP address on Isatap interface isatap.{1DF84235-46E7-44DE-BD8F-5A80FD1BD3BD}. Update Type: 0. Error Code: 0x57.

1 Answer

Answered by a Database Administration user on 2021-11-11 11:17:43

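The errors are quite descriptive. This typically happens when something in the network is faulty, whether that is the physical hardware, the virtualization layer inside VMware, or the related drivers, so that the nodes lose communication with each other and/or the file share witness goes down. The nodes then miss heartbeats, the cluster loses quorum, and it goes offline.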
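
The quorum loss described above can be illustrated with a little vote arithmetic. This is only a minimal sketch, assuming a plain static majority over three votes (host01, host02, and the file share witness) and ignoring dynamic quorum and dynamic witness adjustments; the function name `has_quorum` is hypothetical and not part of any cluster API.

```python
def has_quorum(votes_reachable: int, total_votes: int) -> bool:
    """Return True if a partition holds a strict majority of all votes."""
    return votes_reachable > total_votes // 2


# Assumption: two nodes plus one file share witness, one vote each.
TOTAL_VOTES = 3

# A node that still reaches the witness holds 2 of 3 votes.
node_with_witness = has_quorum(2, TOTAL_VOTES)

# A node cut off from both its peer and the witness holds 1 of 3 votes.
isolated_node = has_quorum(1, TOTAL_VOTES)

print(node_with_witness, isolated_node)
```

Under these assumptions, a node that loses both networks and the witness at the same time cannot keep the cluster online, which matches the "quorum was lost" message in the posted log.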

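To troubleshoot, you can try the following:

  • Run the "Validate a Configuration" wizard to check the network configuration.
  • Check/diagnose the physical hardware.
  • Update/patch the network drivers and firmware.
  • Update/patch VMware to the latest version.
  • Run other network diagnostic software.
  • Increase the heartbeat settings/timeout intervals in Failover Cluster Manager and in the availability group settings.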
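
To judge how much headroom the last suggestion would buy, it can help to work out the current heartbeat tolerance from the values posted in the question. The sketch below assumes the usual reading of these properties (a heartbeat is sent every `Delay` milliseconds, and a node is treated as down after `Threshold` consecutive misses); the helper name `heartbeat_tolerance_ms` is hypothetical.

```python
def heartbeat_tolerance_ms(delay_ms: int, threshold: int) -> int:
    """Approximate time without heartbeats before a node is considered down."""
    return delay_ms * threshold


# Values copied from the question's cluster properties.
same_subnet = heartbeat_tolerance_ms(delay_ms=1000, threshold=10)
cross_subnet = heartbeat_tolerance_ms(delay_ms=1000, threshold=20)

print(f"same subnet:  {same_subnet / 1000:.0f} s")
print(f"cross subnet: {cross_subnet / 1000:.0f} s")
```

Both node pairs in the question (10.10.22.x and 192.168.200.x) share a subnet, so under this reading the same-subnet figure is the relevant one. Raising the threshold only widens this window and does not remove the underlying network problem.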
Votes: 0
Original page content provided by Database Administration.
Original link:

https://dba.stackexchange.com/questions/302427
