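I have configured monit to check that my IRCd and its services are running. Recently, the instance that runs all of this rebooted, and monit did not do its job.
It is configured to start at boot.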
[root@ip-172-31-21-162 ec2-user]# chkconfig --list monit
monit 0:off 1:off 2:on 3:on 4:on 5:on 6:off
The control file:
[root@ip-172-31-21-162 ec2-user]# cat /etc/monit.conf
set httpd port 2812
allow 127.0.0.1
set daemon 60
include /etc/monit.d/*
check process ircd with pidfile /home/ec2-user/inspircd/run/pid
start program = "/usr/bin/perl /home/ec2-user/inspircd/run/inspircd start"
as uid "ec2-user" and gid "ec2-user"
with timeout 30 seconds
check process services with pidfile /home/ec2-user/anope/run/data/services.pid
depends on ircd
start program = "/bin/sh /home/ec2-user/anope/run/bin/anoperc start"
as uid "ec2-user" and gid "ec2-user"
with timeout 30 seconds
According to the documentation, this syntax looks fine:
<START | STOP | RESTART> [PROGRAM] = "program"
[[AS] UID <number | string>]
[[AS] GID <number | string>]
[[WITH] TIMEOUT <number> SECOND(S)]
Running a syntax check agrees:
[ec2-user@ip-172-31-29-142 ~]$ sudo monit -t
Control file syntax OK
However, the log says that no start method is defined for these monitored processes!
[UTC May 14 04:39:51] error : 'ircd' process is not running
[UTC May 14 04:39:51] error : monit: Start or stop method not defined -- process ircd
[UTC May 14 04:39:51] error : 'services' process is not running
[UTC May 14 04:39:51] error : monit: Start or stop method not defined -- process services
Starting the processes manually through monit works, for some reason.
[root@ip-172-31-21-162 ec2-user]# monit start ircd
[root@ip-172-31-21-162 ec2-user]# monit status
The Monit daemon 5.2.5 uptime: 7h 14m
Process 'ircd'
status running
monitoring status monitored
pid 26483
parent pid 1
uptime 3m
...
data collected Sat May 14 02:49:57 2016
Process 'services'
status running
monitoring status monitored
pid 26488
parent pid 1
uptime 3m
...
data collected Sat May 14 02:49:57 2016
This is bizarre. When I stop the checked processes and restart monit with debug logging enabled, I can see it report the start program.
Process Name = ircd
Pid file = /home/ec2-user/inspircd/run/pid
Monitoring mode = active
Start program = '/home/ec2-user/inspircd/run/inspircd start' as uid 500 as gid 500 timeout 30 second(s)
Existence = if does not exist 1 times within 1 cycle(s) then restart else if succeeded 1 times within 1 cycle(s) then alert
Pid = if changed 1 times within 1 cycle(s) then alert
Ppid = if changed 1 times within 1 cycle(s) then alert
Process Name = services
Pid file = /home/ec2-user/anope/run/data/services.pid
Monitoring mode = active
Start program = '/home/ec2-user/anope/run/bin/anoperc start' as uid 500 as gid 500 timeout 30 second(s)
Existence = if does not exist 1 times within 1 cycle(s) then restart else if succeeded 1 times within 1 cycle(s) then alert
Depends on Service = ircd
Pid = if changed 1 times within 1 cycle(s) then alert
Ppid = if changed 1 times within 1 cycle(s) then alert
Any idea what in the name of Glob is going on here?
Answered 2016-05-14 05:36:13
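According to monit's documented behavior, a stop method must also be defined for a non-running process to be started correctly.
In active mode (the default), Monit will actively monitor a service and, if a problem occurs, raise alerts and/or restart the service.
-- Monit documentation; Service methods
The action Monit takes when a process is not running is always "restart", but because there was no standalone "restart program" (until Monit 5.7), a stop + start sequence is used. With no stop method defined, that sequence cannot run, which is why the log complains about a "Start or stop method not defined" even though a start program is present.
So the solution is, and was, to add a stop program line to each process check in the control file. If you are running version >= 5.7, you can instead choose to use restart program.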
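As a rough sketch of that fix, the two checks from the question could be extended as shown here. The stop commands are an assumption: they presume that the inspircd and anoperc helper scripts accept a "stop" argument the same way they accept "start", which the question does not confirm, so verify that against your own installation first.

```
check process ircd with pidfile /home/ec2-user/inspircd/run/pid
  start program = "/usr/bin/perl /home/ec2-user/inspircd/run/inspircd start"
    as uid "ec2-user" and gid "ec2-user"
    with timeout 30 seconds
  # Assumption: the inspircd helper script supports a "stop" argument.
  stop program = "/usr/bin/perl /home/ec2-user/inspircd/run/inspircd stop"
    as uid "ec2-user" and gid "ec2-user"
    with timeout 30 seconds

check process services with pidfile /home/ec2-user/anope/run/data/services.pid
  depends on ircd
  start program = "/bin/sh /home/ec2-user/anope/run/bin/anoperc start"
    as uid "ec2-user" and gid "ec2-user"
    with timeout 30 seconds
  # Assumption: anoperc supports a "stop" argument.
  stop program = "/bin/sh /home/ec2-user/anope/run/bin/anoperc stop"
    as uid "ec2-user" and gid "ec2-user"
    with timeout 30 seconds
```

After editing, `monit -t` should still report that the control file syntax is OK, and monit needs to be reloaded or restarted before it picks up the new stop methods.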
https://stackoverflow.com/questions/37223022