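Over the past few days I have started seeing this problem: gmetad terminates with a segmentation fault (SIGSEGV) within 5 minutes of starting.
It had been stable for the last few months, so I'm not sure what changed.
Version - gmetad 3.7.1
I don't see any core dump, or anything specific to gmetad, in /var/log/messages or elsewhere under /var/log/.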
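System snapshot (from top) at the time this happened:
load average: 1.97, 0.99, 0.42
Memory looks fine too:
free -m
             total       used       free     shared    buffers     cached
Mem:          7989       3624       4364          0        333       2562
-/+ buffers/cache:        728       7260
Swap:         4095          0       4095
I have a supervisor program that watches gmetad and restarts it -
Here is the supervisor log: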
2016-10-20 14:34:55,707 INFO exited: gmetad (terminated by SIGSEGV; not expected)
2016-10-20 14:34:55,707 INFO received SIGCLD indicating a child quit
2016-10-20 14:34:57,712 INFO spawned: 'gmetad' with pid 24561
2016-10-20 14:34:59,929 INFO exited: gmetad (terminated by SIGSEGV; not expected)
2016-10-20 14:34:59,929 INFO received SIGCLD indicating a child quit
2016-10-20 14:35:02,932 INFO spawned: 'gmetad' with pid 24593
2016-10-20 14:35:04,897 INFO exited: gmetad (terminated by SIGSEGV; not expected)
2016-10-20 14:35:04,897 INFO received SIGCLD indicating a child quit
2016-10-20 14:35:08,903 INFO spawned: 'gmetad' with pid 24618
2016-10-20 14:35:11,257 INFO exited: gmetad (terminated by SIGSEGV; not expected)
2016-10-20 14:35:11,257 INFO received SIGCLD indicating a child quit
2016-10-20 14:35:12,257 INFO gave up: gmetad entered FATAL state, too many start retries too quickly
Has anyone run into this kind of problem with gmetad specifically? Thanks for any pointers.
Answered 2016-10-20 21:14:57
I found the problem and fixed it.
Some key steps/findings -
In my case the root cause of the SIGSEGV was one specific file: 'part_max_used.rrd', located under /path/to/ganglia/rrds/node_name.
Hope this helps :-)
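The answer does not say how the offending file was tracked down or what was wrong with it. If a damaged RRD file is suspected, one possible first pass is to scan the rrds tree for files that are empty, unreadable, or missing the 4-byte "RRD" cookie that RRD files start with. The sketch below is only that: the function name `find_suspect_rrds` and the default directory are assumptions for illustration, and a file with an intact header but a damaged body will pass this check (running `rrdtool info` on each file is a more thorough test).

```python
import os
import sys

# RRD files begin with the ASCII cookie "RRD" followed by a NUL byte.
RRD_MAGIC = b"RRD\x00"


def find_suspect_rrds(root):
    """Return paths of .rrd files under root that are empty, unreadable,
    or do not start with the RRD cookie. (Hypothetical helper name.)"""
    suspects = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(".rrd"):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    header = fh.read(len(RRD_MAGIC))
            except OSError:
                # Unreadable files are reported as suspects too.
                suspects.append(path)
                continue
            if header != RRD_MAGIC:
                suspects.append(path)
    return suspects


if __name__ == "__main__":
    # Assumed default location; pass your own rrds directory as argv[1].
    root = sys.argv[1] if len(sys.argv) > 1 else "/var/lib/ganglia/rrds"
    for path in find_suspect_rrds(root):
        print(path)
```

Moving a reported file out of the rrds directory (rather than deleting it) and restarting gmetad is a low-risk way to check whether that file is the trigger.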
https://stackoverflow.com/questions/40162219