Pacemaker IPaddr2 findif failure

Server Fault user
Asked on 2019-09-12 07:05:39
Answers 1 · Views 690 · Followers 0 · Votes 0

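I set up a 3-node cluster using Pacemaker and Corosync. When I intentionally pull the network cable from one of the nodes, B (as a disaster recovery test), node A or C takes over the VIP. But when I plug the cable back into B some time later, the VIP switches to B, which should not happen.

I want A or C to keep the VIP. Below is my Pacemaker configuration: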

configure
primitive baseos-ping-check ocf:pacemaker:ping params host_list="1.2.3.4" multiplier="1000" dampen="0" attempts="2" \
        op start interval="0s" timeout="60s" \
        op monitor interval="2s" timeout="60s" \
        op stop interval="0s" timeout="60s" on-fail="ignore"
primitive baseos-vip-master ocf:heartbeat:IPaddr2 \
        params ip="192.67.23.145" iflabel="MR" cidr_netmask="255.255.255.0" \
        op start interval="0s" \
        op monitor interval="10s" \
        op stop interval="0s"
clone cl_baseos-ping-check baseos-ping-check meta interleave="true"
location loc-vip-master vip-master \
        rule $id="loc-vip-master-rule" $role="master" 100: #uname eq ECS01 \
        rule $id="loc--vip-master-rule-0" $role="master" -inf: not_defined pingd or pingd lte 0
property expected-quorum-votes="1"
property stonith-enabled="false"
property maintenance-mode="false"
property cluster-recheck-interval="5min"
property default-action-timeout="60s"
property pe-error-series-max="500"
property pe-input-series-max="500"
property pe-warn-series-max="500"
property no-quorum-policy="ignore"
property dc-version="1.1.16-94ff4df"
property cluster-infrastructure="corosync"
rsc_defaults resource-stickiness="150"
rsc_defaults migration-threshold="3"
commit
quit

My corosync.conf looks like this:

quorum {
    provider: corosync_votequorum
    expected_votes : 3
}


totem {
    version: 2

    # How long before declaring a token lost (ms)
    token: 3000

    # How many token retransmits before forming a new configuration
    token_retransmits_before_loss_const: 10

    # How long to wait for join messages in the membership protocol (ms)
    join: 60

    # How long to wait for consensus to be achieved before starting a new round of membership configuration (ms)
    consensus: 3600

    # Turn off the virtual synchrony filter
    vsftype: none

    # Number of messages that may be sent by one processor on receipt of the token
    max_messages: 20

    # Limit generated nodeids to 31-bits (positive signed integers)
    clear_node_high_bit: yes

    # Disable encryption
    secauth: on

    # How many threads to use for encryption/decryption
    threads: 0

    # Optionally assign a fixed node id (integer)
    # nodeid: 1234

    # This specifies the mode of redundant ring, which may be none, active, or passive.
    rrp_mode: none

    interface {
        # The following values need to be set based on your environment
        ringnumber: 0
        bindnetaddr: 10.98.4.0
        #mcastaddr: 0.0.0.0
        mcastport: 5876
        member {
            memberaddr: 10.98.4.103
        }
        member {
            memberaddr: 10.98.4.173
        }
    }
    transport: udpu
}

amf {
    mode: disabled
}

service {
    # Load the Pacemaker Cluster Resource Manager
    ver:       0
    name:      pacemaker
}

aisexec {
        user:   root
        group:  root
}

logging {
        fileline: off
        to_stderr: yes
        to_logfile: no
        to_syslog: yes
        syslog_facility: daemon
        debug: off
        timestamp: on
        logger_subsys {
                subsys: AMF
                debug: off
                tags: enter|leave|trace1|trace2|trace3|trace4|trace6
        }
}

My cib.xml looks like this:


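The scenario described above only happens when I pull a node's network cable to take it offline. If I reboot the node (i.e. B) instead, the VIP sticks to the current node, i.e. A or C.

I noticed that when I plug node B's network cable back in, the IPaddr2 resource calls findif, which fails because I am not using the nic parameter. I do provide cidr_netmask, though, so ideally findif should resolve node B's IP address.

Is there any way to avoid the findif failure?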


1 Answer

Server Fault user

Posted on 2019-09-13 17:59:58

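As noted in the comments under your question: when the node rejoins the cluster, the VIP is found running on more than one node, so the cluster has to recover the service (stop the VIP everywhere, then start it), and it happens to choose node B.

In a production cluster, you would use fencing/STONITH and would not ignore quorum. With that configuration, when you unplug node B from the network, an out-of-band STONITH agent forcibly powers off node B, so node B rejoins the cluster in a "fresh state" with no services running, and the VIP does not fail back to node B.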
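
As a rough sketch of what that could look like in the same crm shell syntax as the question's configuration: the fragment below enables STONITH, stops ignoring quorum, and defines one fencing device for node B. The node name `nodeB`, the resource IDs, the choice of the `fence_ipmilan` agent, and the angle-bracket placeholders are all assumptions for illustration. The right agent and its parameter names depend on your hardware and fence-agents version, and each node would need its own device.

```
configure
# Hypothetical fencing device for node B (agent and parameters are assumptions)
primitive fence-nodeB stonith:fence_ipmilan \
        params pcmk_host_list="nodeB" ipaddr="<BMC address of node B>" login="<user>" passwd="<password>" \
        op monitor interval="60s"
# Keep the fencing device off the node it is meant to fence
location loc-fence-nodeB fence-nodeB -inf: nodeB
# Replace stonith-enabled="false" and no-quorum-policy="ignore" from the question
property stonith-enabled="true"
property no-quorum-policy="stop"
commit
quit
```

With three nodes and `no-quorum-policy="stop"`, an isolated node B loses quorum and stops its resources, while A and C keep quorum and can fence B before taking over the VIP.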

Votes 0
Original page content provided by Server Fault.
Original link:

https://serverfault.com/questions/983932
