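I have an MX960 chassis with the following problem. The master RE (RE1) cannot run any command with invoke-on against the backup RE (RE0), for example: show version invoke-on all-routing-engines.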
{master}
nms@MX960> request routing-engine login other-routing-engine
connect to address 128.0.0.4: Operation timed out
Trying 10.0.0.4...
re0: Operation timed out
connect to address 128.0.0.4: Operation timed out
Trying 10.0.0.4...
re0: Operation timed out
#### show database-replication summary
{master}
nms@MX960> show database-replication summary
General:
Graceful Restart Enabled
Mastership Master
**Connection Down**
Database Available
Message Queue Not Ready
#### show task replication
{master}
nms@MX960> show task replication
Stateful Replication: Enabled
RE mode: Master
Protocol Synchronization Status
OSPF NotStarted
BGP NotStarted
MPLS NotStarted
RSVP NotStarted
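LDP NotStarted

I checked the logs on the backup RE (RE0) and found that ESW port 13 (the interface connecting SCB0 and RE0) flapped several times. It came back up, but from RE1 I still cannot run any command with invoke-on against RE0.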

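From the master RE, show chassis routing-engine lists both REs (RE0 as backup, RE1 as master). From the backup RE, however, show chassis routing-engine reports the master RE's state only as Present.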
**From RE0 (backup RE)**
viettel@KHA0011PRV01> show chassis routing-engine
Routing Engine status:
Slot 0:
Current state Backup
Election priority Master (default)
Temperature 31 degrees C / 87 degrees F
CPU temperature 30 degrees C / 86 degrees F
DRAM 16329 MB (16384 MB installed)
Memory utilization 9 percent
5 sec CPU utilization:
User 0 percent
Background 0 percent
Kernel 0 percent
Interrupt 0 percent
Idle 100 percent
1 min CPU utilization:
User 0 percent
Background 0 percent
Kernel 0 percent
Interrupt 0 percent
Idle 100 percent
5 min CPU utilization:
User 0 percent
Background 0 percent
Kernel 0 percent
Interrupt 0 percent
Idle 100 percent
15 min CPU utilization:
User 0 percent
Background 0 percent
Kernel 0 percent
Interrupt 0 percent
Idle 100 percent
Model RE-S-1800x4
Serial ID 9009219924
Start time 2022-08-18 11:39:45 ICT
Uptime 98 days, 22 hours, 4 minutes, 22 seconds
Last reboot reason Router rebooted after a normal shutdown.
Load averages: 1 minute 5 minute 15 minute
0.02 0.11 0.14
Routing Engine status:
Slot 1:
Current state Present
**From RE1 (master RE)**
{master}
nms@KHA0011PRV01> show chassis routing-engine
Nov 25 09:43:52
Routing Engine status:
Slot 0:
Current state Backup
Election priority Master (default)
Temperature 31 degrees C / 87 degrees F
CPU temperature 30 degrees C / 86 degrees F
DRAM 16329 MB (16384 MB installed)
Memory utilization 9 percent
5 sec CPU utilization:
User 0 percent
Background 0 percent
Kernel 0 percent
Interrupt 0 percent
Idle 99 percent
Model RE-S-1800x4
Serial ID 9009219924
Start time 2022-08-18 11:39:45 ICT
Uptime 98 days, 22 hours, 3 minutes, 17 seconds
Last reboot reason Router rebooted after a normal shutdown.
Load averages: 1 minute 5 minute 15 minute
0.11 0.13 0.15
Routing Engine status:
Slot 1:
Current state Master
Election priority Backup (default)
Temperature 32 degrees C / 89 degrees F
CPU temperature 30 degrees C / 86 degrees F
DRAM 16329 MB (16384 MB installed)
Memory utilization 9 percent
5 sec CPU utilization:
User 0 percent
Background 0 percent
Kernel 2 percent
Interrupt 0 percent
Idle 98 percent
1 min CPU utilization:
User 0 percent
Background 0 percent
Kernel 2 percent
Interrupt 0 percent
Idle 98 percent
5 min CPU utilization:
User 0 percent
Background 0 percent
Kernel 2 percent
Interrupt 0 percent
Idle 98 percent
15 min CPU utilization:
User 0 percent
Background 0 percent
Kernel 2 percent
Interrupt 0 percent
Idle 98 percent
Model RE-S-1800x4
Serial ID 9009219890
Start time 2022-02-10 00:50:35 ICT
Uptime 288 days, 8 hours, 53 minutes, 13 seconds
Last reboot reason 0x1:power cycle/failure
Load averages: 1 minute 5 minute 15 minute
0.14 0.23 0.23

After I rebooted RE0, everything was fine again. Has anyone faced the same problem? Please share your thoughts. Thanks.
Posted on 2022-11-28 07:57:46
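I checked the chassisd log on RE0 (the backup RE). No unusual event occurred before port 13 flapped, and port 13 flapped several times. Around the time shown below, however, I noticed a strange log message that kept repeating.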
Nov 20 12:33:55 ch_gencfg_chassis_startup_time_handler: master_re: false, GENCFG_CHASSIS_STARTUP_TIME, minor_type: 8
Nov 20 12:33:55 ch_gencfg_chassis_startup_time_blob_get: chassis startup time from kernel is 1644429032.323756
Nov 20 12:33:55 ch_gencfg_chassis_startup_time_handler: wrote hw.chassis.startup_time as 1644429032.323756
Nov 20 13:33:53 ch_gencfg_chassis_startup_time_handler: master_re: false, GENCFG_CHASSIS_STARTUP_TIME, minor_type: 8
Nov 20 13:33:53 ch_gencfg_chassis_startup_time_blob_get: chassis startup time from kernel is 1644429032.323753
Nov 20 13:33:53 ch_gencfg_chassis_startup_time_handler: wrote hw.chassis.startup_time as 1644429032.323753
Nov 20 13:48:25 send: red alarm set, device Routing Engine 0, reason Host 0 em0 : Ethernet Link to PFEs Down
Nov 20 13:48:30 acb_update_local_esw_status: ESW port 13 (connected to RE-GigE) link down
Nov 20 13:48:30 send: red alarm clear, device Routing Engine 0, reason Host 0 em0 : Ethernet Link to PFEs Down
Nov 20 13:48:33 fru_is_present: out of range slot 1 for
Nov 20 13:48:43 acb_update_local_esw_status: ESW port 13 (connected to RE-GigE) link up
Nov 20 13:48:46 fru_is_present: out of range slot 1 for
Nov 20 13:48:59 fru_is_present: out of range slot 1 for
Nov 20 13:49:13 fru_is_present: out of range slot 1 for
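Nov 20 13:49:26 fru_is_present: out of range slot 1 for

The message "fru_is_present: out of range slot 1 for" kept being logged until we rebooted RE0. Under normal conditions, once RE0 is online, we only see the following logs: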
Nov 28 12:33:54 ch_gencfg_chassis_startup_time_handler: master_re: false, GENCFG_CHASSIS_STARTUP_TIME, minor_type: 8
Nov 28 12:33:54 ch_gencfg_chassis_startup_time_blob_get: chassis startup time from kernel is 1644429032.323070
Nov 28 12:33:54 ch_gencfg_chassis_startup_time_handler: wrote hw.chassis.startup_time as 1644429032.323070
Nov 28 13:33:53 ch_gencfg_chassis_startup_time_handler: master_re: false, GENCFG_CHASSIS_STARTUP_TIME, minor_type: 8
Nov 28 13:33:53 ch_gencfg_chassis_startup_time_blob_get: chassis startup time from kernel is 1644429032.323067
Nov 28 13:33:53 ch_gencfg_chassis_startup_time_handler: wrote hw.chassis.startup_time as 1644429032.323067
Nov 28 14:33:52 ch_gencfg_chassis_startup_time_handler: master_re: false, GENCFG_CHASSIS_STARTUP_TIME, minor_type: 8
Nov 28 14:33:52 ch_gencfg_chassis_startup_time_blob_get: chassis startup time from kernel is 1644429032.323064
Nov 28 14:33:52 ch_gencfg_chassis_startup_time_handler: wrote hw.chassis.startup_time as 1644429032.323064

Posted on 2022-12-28 15:15:31
This looks like some kind of software defect. I would open a case with the support team for your device.
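
When opening a case, it may help to attach a short summary of the link flaps and the repeating message rather than only the raw chassisd log. The following is a minimal Python sketch, not a Juniper tool: it assumes the log lines look exactly like the ones quoted in this thread, and the names `summarize`, `ESW_RE`, and `FRU_MSG` are made up for illustration.

```python
import re
from collections import Counter

# Matches the ESW link messages in the format quoted in this thread (assumed format).
ESW_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+acb_update_local_esw_status:\s+"
    r"ESW port (?P<port>\d+) \(connected to (?P<peer>[^)]+)\) link (?P<state>up|down)"
)
# Substring of the repeating message seen until RE0 was rebooted.
FRU_MSG = "fru_is_present: out of range slot"


def summarize(lines):
    """Count ESW link transitions per (port, state) and fru_is_present repeats."""
    transitions = Counter()
    fru_count = 0
    for line in lines:
        m = ESW_RE.match(line)
        if m:
            transitions[(int(m.group("port")), m.group("state"))] += 1
        elif FRU_MSG in line:
            fru_count += 1
    return transitions, fru_count


# Sample lines copied from the chassisd log in this thread.
sample = [
    "Nov 20 13:48:30 acb_update_local_esw_status: ESW port 13 (connected to RE-GigE) link down",
    "Nov 20 13:48:33 fru_is_present: out of range slot 1 for",
    "Nov 20 13:48:43 acb_update_local_esw_status: ESW port 13 (connected to RE-GigE) link up",
    "Nov 20 13:48:46 fru_is_present: out of range slot 1 for",
]
transitions, fru_count = summarize(sample)
print(dict(transitions), fru_count)
```

To run it on a real log, the lines of a chassisd log file copied off the RE can be passed to `summarize` in place of `sample`.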
https://networkengineering.stackexchange.com/questions/80660