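I am trying to monitor gearman with Nagios, and for that I am using the script check_gearman.sh.
The gearman server runs on localhost.
When I run
./check_gearman.sh -H localhost -p 4730 -t 1000
the result is:
CRITICAL: gearman: gearman_client_run_tasks : gearman_wait(GEARMAN_TIMEOUT) timeout reached, 1 servers were poll(), no servers were available, pipe:false -> libgearman/universal.cc:331: pid(613)
Can anyone help me with this?
Here is the script: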
#!/bin/sh
#
# gearman check for nagios
# written by Georg Thoma (georg@thoma.cn)
# Last modified: 07-04-2014
#
# Description:
#
#
#
PROGNAME=`/usr/bin/basename $0`
PROGPATH=`echo $0 | sed -e 's,[\\/][^\\/][^\\/]*$,,'`
REVISION="0.04"
export TIMEFORMAT="%R"
. $PROGPATH/utils.sh

# Defaults
hostname=localhost
port=4730
timeout=50

# search for the gearman binaries
GEARMAN_BIN=`which gearman 2>&1 | grep -v "no gearman in"`
if [ "x$GEARMAN_BIN" = "x" ] ; then # result of check is empty
    echo "gearman executable not found in path"
    exit $STATE_UNKNOWN
fi
GEARADMIN_BIN=`which gearadmin 2>&1 | grep -v "no gearadmin in"`
if [ "x$GEARADMIN_BIN" = "x" ] ; then # result of check is empty
    echo "gearadmin executable not found in path"
    exit $STATE_UNKNOWN
fi

print_usage() {
    echo "Usage: $PROGNAME [-H hostname -p port -t timeout]"
    echo "Usage: $PROGNAME --help"
    echo "Usage: $PROGNAME --version"
}

print_help() {
    print_revision $PROGNAME $REVISION
    echo ""
    print_usage
    echo ""
    echo "gearman check plugin for nagios"
    echo ""
    support
}

# Make sure the correct number of command line
# arguments have been supplied
if [ $# -lt 1 ]; then
    print_usage
    exit $STATE_UNKNOWN
fi

# Grab the command line arguments
exitstatus=$STATE_WARNING #default
while test -n "$1"; do
    case "$1" in
        --help)
            print_help
            exit $STATE_OK
            ;;
        -h)
            print_help
            exit $STATE_OK
            ;;
        --version)
            print_revision $PROGNAME $REVISION
            exit $STATE_OK
            ;;
        -V)
            print_revision $PROGNAME $REVISION
            exit $STATE_OK
            ;;
        -H)
            hostname=$2
            shift
            ;;
        --hostname)
            hostname=$2
            shift
            ;;
        -t)
            timeout=$2
            shift
            ;;
        --timeout)
            timeout=$2
            shift
            ;;
        -p)
            port=$2
            shift
            ;;
        --port)
            port=$2
            shift
            ;;
        *)
            echo "Unknown argument: $1"
            print_usage
            exit $STATE_UNKNOWN
            ;;
    esac
    shift
done

# check if server is running and replies to version query
VERSION_RESULT=`$GEARADMIN_BIN -h $hostname -p $port --server-version 2>&1 `
if [ "x$VERSION_RESULT" = "x" ] ; then # result of check is empty
    echo "CRITICAL: Server is not running / responding"
    exitstatus=$STATE_CRITICAL
    exit $exitstatus
fi

# drop function echo_for_nagios to remove functions without workers
DROP_RESULT=`$GEARADMIN_BIN -h $hostname -p $port --drop-function echo_for_nagios 2>&1 `

# check for worker echo_for_nagios and start a new one if needed
CHECKWORKER_RESULT=`$GEARADMIN_BIN -h $hostname -p $port --status | grep echo_for_nagios`
if [ "x$CHECKWORKER_RESULT" = "x" ] ; then # result of check is empty
    nohup $GEARMAN_BIN -h $hostname -p $port -w -f echo_for_nagios -- echo echo >/dev/null 2>&1 &
fi

# check the time to get the status from gearmanserver
CHECKWORKER_TIME=$( { time $GEARADMIN_BIN -h $hostname -p $port --status ; } 2>&1 |tail -1 )

# check if worker returns "echo"
CHECK_RESULT=`cat /dev/null | $GEARMAN_BIN -h $hostname -p $port -t $timeout -f echo_for_nagios 2>&1`

# validate result and set message and exitstatus
if [ "$CHECK_RESULT" = "echo" ] ; then # we got echo back
    echo "OK: got an echo back from gearman server version: $VERSION_RESULT, responded in $CHECKWORKER_TIME sec|time=$CHECKWORKER_TIME;;;"
    exitstatus=$STATE_OK
else # timeout reached, no echo
    echo "CRITICAL: $CHECK_RESULT"
    exitstatus=$STATE_CRITICAL
fi
exit $exitstatus

Thanks in advance.
Answered 2015-05-19 01:49:43
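If you download the mod_gearman package, it contains a better, more feature-rich check_gearman plugin for Nagios.
With your current plugin, the error message shows that the check script cannot connect to the gearman daemon.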
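To narrow this down, it may help to run the script's first step by hand. The sketch below wraps the same `gearadmin -h ... -p ... --server-version` call the script makes. The function name `probe_gearmand` and the `GEARADMIN_BIN` override are only illustrative, not part of any package. Unlike the script, it discards stderr, so if gearadmin prints an error message, that text is not mistaken for a version reply.

```shell
# Hypothetical helper: ask gearmand for its version, as the check script does.
# GEARADMIN_BIN is an assumed override; it defaults to gearadmin found on PATH.
probe_gearmand() {
    host=$1
    port=$2
    # stderr is dropped so only a real reply on stdout counts
    version=$("${GEARADMIN_BIN:-gearadmin}" -h "$host" -p "$port" --server-version 2>/dev/null)
    if [ -n "$version" ]; then
        echo "reachable: gearmand $version at $host:$port"
        return 0
    fi
    echo "unreachable: no version reply from $host:$port"
    return 1
}

# Usage: probe_gearmand localhost 4730
```

If this reports "unreachable" for localhost but succeeds for the machine's network address, the daemon is probably not bound to the loopback interface.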
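You should verify that something is listening on port 4730 on localhost and that no local firewall is blocking the connection. Most likely gearmand is installed on a different port, or it listens only on a network interface and not on localhost. It may also not be running at all, or it may be running on a different server from the one where you run the check.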
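One way to see which address gearmand is actually bound to is to filter the listening-socket table. This is a rough sketch that assumes a Linux host with `ss` from iproute2, where `ss -ltn` prints the local address and port in the fourth column. `listening_addrs` is a hypothetical helper name.

```shell
# Hypothetical helper: read `ss -ltn` style output on stdin and print every
# local address:port entry whose port equals the first argument.
listening_addrs() {
    awk -v p="$1" '{
        # the port is whatever follows the last colon of the local address
        n = split($4, a, ":")
        if (n > 1 && a[n] == p) print $4
    }'
}

# Usage: ss -ltn | listening_addrs 4730
```

No output means nothing is listening on that port. An entry such as 0.0.0.0:4730 or 127.0.0.1:4730 means localhost should work, while an entry showing only a specific interface address means the check has to be pointed at that address with -H.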
https://stackoverflow.com/questions/27148126