Twelve hours ago I got a notification from my LibreNMS monitoring tool that MongoDB (version 3.2.11) was down on one of my 12 MongoDB servers (connections were taking longer than 10 seconds). I decided to ignore it and wait it out; I figured it was just a bit busy.
A few hours later I got a bit worried when I ran db.currentOp(). I saw an operation running as migrateThread with the message "step 2 of 5", plus a couple of inserts with the message "query not recording (too large)".
After some searching online, I found that this can take a while, since it is migrating chunks to other servers. So I decided to wait it out, because I didn't want to interrupt it and end up with 2 TB of corrupted data on a production instance.
Now, 12 hours later, I'm starting to worry about what's going on. It is still at "step 2 of 5", the CPU load is very high, but it still seems to be moving chunks, spawning new migrateThread operations along with many "query not recording (too large)" inserts.
Here is part of my currentOp() output:
{
  "desc" : "migrateThread",
  "threadId" : "139962853246720",
  "active" : true,
  "opid" : -2003494368,
  "secs_running" : 408,
  "microsecs_running" : NumberLong(408914923),
  "op" : "none",
  "ns" : "data.logs",
  "query" : {
  },
  "msg" : "step 2 of 5",
  "numYields" : 0,
  "locks" : {
    "Global" : "w",
    "Database" : "w",
    "Collection" : "w"
  },
  "waitingForLock" : false,
  "lockStats" : {
    "Global" : {
      "acquireCount" : {
        "r" : NumberLong(37984),
        "w" : NumberLong(37982)
      }
    },
    "Database" : {
      "acquireCount" : {
        "r" : NumberLong(1),
        "w" : NumberLong(37981),
        "W" : NumberLong(1)
      },
      "acquireWaitCount" : {
        "W" : NumberLong(1)
      },
      "timeAcquiringMicros" : {
        "W" : NumberLong(1446)
      }
    },
    "Collection" : {
      "acquireCount" : {
        "r" : NumberLong(1),
        "w" : NumberLong(37980),
        "W" : NumberLong(1)
      },
      "acquireWaitCount" : {
        "W" : NumberLong(1)
      },
      "timeAcquiringMicros" : {
        "W" : NumberLong(3224)
      }
    }
  }
},
{
  "desc" : "conn451221",
  "threadId" : "139962959451904",
  "connectionId" : 451221,
  "client" : "10.0.0.111:57408",
  "active" : true,
  "opid" : -2003439364,
  "secs_running" : 0,
  "microsecs_running" : NumberLong(37333),
  "op" : "insert",
  "ns" : "data.logs",
  "query" : {
    "$msg" : "query not recording (too large)"
  },
  "numYields" : 0,
  "locks" : {
    "Global" : "w",
    "Database" : "w",
    "Collection" : "w"
  },
  "waitingForLock" : false,
  "lockStats" : {
    "Global" : {
      "acquireCount" : {
        "r" : NumberLong(1),
        "w" : NumberLong(1)
      }
    },
    "Database" : {
      "acquireCount" : {
        "w" : NumberLong(1)
      }
    },
    "Collection" : {
      "acquireCount" : {
        "w" : NumberLong(1)
      }
    }
  }
},

When I check mongod.log, I see the following:
2017-05-04T19:08:14.203Z I SHARDING [migrateThread] starting receiving-end of migration of chunk { _id: -8858253000066304220 } -> { _id: -8857450400323294366 } for collection data.logs from mongo03:27017 at epoch 56f5410efed7ec477fb62e31
2017-05-04T19:08:14.350Z I SHARDING [migrateThread] Deleter starting delete for: data.logs from { _id: -8858253000066304220 } -> { _id: -8857450400323294366 }, with opId: 2291391315
2017-05-04T19:08:14.350Z I SHARDING [migrateThread] rangeDeleter deleted 0 documents for data.logs from { _id: -8858253000066304220 } -> { _id: -8857450400323294366 }
2017-05-04T19:18:26.625Z I SHARDING [migrateThread] Waiting for replication to catch up before entering critical section
2017-05-04T19:18:26.625Z I SHARDING [migrateThread] migrate commit succeeded flushing to secondaries for 'data.logs' { _id: -8858253000066304220 } -> { _id: -8857450400323294366 }
2017-05-04T19:18:36.499Z I SHARDING [migrateThread] migrate commit succeeded flushing to secondaries for 'data.logs' { _id: -8858253000066304220 } -> { _id: -8857450400323294366 }
2017-05-04T19:18:36.788Z I SHARDING [migrateThread] about to log metadata event into changelog: { _id: "mongo01-2017-05-04T21:18:36.788+0200-590b7e8c1bc38fe0dd61db45", server: "mongo01", clientAddr: "", time: new Date(1493925516788), what: "moveChunk.to", ns: "data.logs", details: { min: { _id: -8858253000066304220 }, max: { _id: -8857450400323294366 }, step 1 of 5: 146, step 2 of 5: 279, step 3 of 5: 611994, step 4 of 5: 0, step 5 of 5: 10162, note: "success" } }
2017-05-04T19:19:04.059Z I SHARDING [migrateThread] starting receiving-end of migration of chunk { _id: -9090190725188397877 } -> { _id: -9088854275798899737 } for collection data.logs from mongo04:27017 at epoch 56f5410efed7ec477fb62e31
2017-05-04T19:19:04.063Z I SHARDING [migrateThread] Deleter starting delete for: data.logs from { _id: -9090190725188397877 } -> { _id: -9088854275798899737 }, with opId: 2291472928
2017-05-04T19:19:04.064Z I SHARDING [migrateThread] rangeDeleter deleted 0 documents for data.logs from { _id: -9090190725188397877 } -> { _id: -9088854275798899737 }
2017-05-04T19:28:16.709Z I SHARDING [migrateThread] Waiting for replication to catch up before entering critical section
2017-05-04T19:28:16.709Z I SHARDING [migrateThread] migrate commit succeeded flushing to secondaries for 'data.logs' { _id: -9090190725188397877 } -> { _id: -9088854275798899737 }
2017-05-04T19:28:17.778Z I SHARDING [migrateThread] migrate commit succeeded flushing to secondaries for 'data.logs' { _id: -9090190725188397877 } -> { _id: -9088854275798899737 }
2017-05-04T19:28:17.778Z I SHARDING [migrateThread] about to log metadata event into changelog: { _id: "mongo01-2017-05-04T21:28:17.778+0200-590b80d11bc38fe0dd61db46", server: "mongo01", clientAddr: "", time: new Date(1493926097778), what: "moveChunk.to", ns: "data.logs", details: { min: { _id: -9090190725188397877 }, max: { _id: -9088854275798899737 }, step 1 of 5: 3, step 2 of 5: 4, step 3 of 5: 552641, step 4 of 5: 0, step 5 of 5: 1068, note: "success" } }
2017-05-04T19:28:34.889Z I SHARDING [migrateThread] starting receiving-end of migration of chunk { _id: -8696921045434215002 } -> { _id: -8696381531400161154 } for collection data.logs from mongo06:27017 at epoch 56f5410efed7ec477fb62e31
2017-05-04T19:28:35.134Z I SHARDING [migrateThread] Deleter starting delete for: data.logs from { _id: -8696921045434215002 } -> { _id: -8696381531400161154 }, with opId: 2291544986
2017-05-04T19:28:35.134Z I SHARDING [migrateThread] rangeDeleter deleted 0 documents for data.logs from { _id: -8696921045434215002 } -> { _id: -8696381531400161154 }

So migrating the data is taking a very long time. Should I be worried? Should I take any action, or just leave it alone and wait it out?
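For what it's worth, the "moveChunk.to" changelog entries in the log above record per-step timings in milliseconds, and almost all of the time goes to step 3 (copying the documents to the receiving shard). A quick sketch using the two timing sets from the log lines above:

```python
# Per-step timings (milliseconds) copied from the two "moveChunk.to"
# changelog entries in the mongod.log excerpt above.
migrations = [
    {"step 1 of 5": 146, "step 2 of 5": 279, "step 3 of 5": 611994,
     "step 4 of 5": 0, "step 5 of 5": 10162},
    {"step 1 of 5": 3, "step 2 of 5": 4, "step 3 of 5": 552641,
     "step 4 of 5": 0, "step 5 of 5": 1068},
]

for m in migrations:
    total_ms = sum(m.values())
    # Fraction of the migration spent in step 3, the document transfer.
    step3_share = m["step 3 of 5"] / total_ms
    print(f"total: {total_ms / 1000:.0f}s, step 3: {step3_share:.0%}")
```

Roughly 10 minutes per chunk, with step 3 accounting for over 98% of it, which is consistent with the migrations simply being slow rather than stuck.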
I should mention that I didn't start any migration myself. It all happened on its own, so I'm a bit confused.
Please help!
Posted on 2017-05-05 10:01:59
It resolved itself; it just took a very long time. The other servers kicked off "RangeDeleter" operations, and now everything looks fine.
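For anyone watching a migration like this from a script instead of eyeballing db.currentOp(): one approach is to filter the in-progress operations for active migrateThread entries. A minimal sketch, assuming a currentOp-style list of operation documents (the sample below is hypothetical, shaped like the output in the question; in a real deployment you would fetch the list via a driver rather than hard-code it):

```python
def active_migrations(inprog):
    """Return in-progress chunk-migration ops from a currentOp-style list."""
    return [
        op for op in inprog
        if op.get("desc") == "migrateThread" and op.get("active")
    ]

# Hypothetical sample shaped like the currentOp() output in the question.
sample = [
    {"desc": "migrateThread", "active": True,
     "ns": "data.logs", "msg": "step 2 of 5", "secs_running": 408},
    {"desc": "conn451221", "active": True, "op": "insert", "ns": "data.logs"},
]

for op in active_migrations(sample):
    print(f"{op['ns']}: {op['msg']} ({op['secs_running']}s)")
```

As long as the msg step numbers and secs_running keep advancing across polls, the migration is making progress and is usually best left alone.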
https://stackoverflow.com/questions/43791482