
check_log: a Nagios plugin written in Python that checks logs for keywords

Date: 2014-10-26 01:05  Source: linux.it.net.cn  Author: it

A check_log plugin for Nagios, written in Python, that checks logs for keywords. Shared here for study and reference.

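This is a small plugin written against Python 2.5.2. It scans a log for a keyword and accepts thresholds; when the number of matches reaches a threshold, it raises an alert.
I did not think much about performance at first, and the script took a long time on one large log file. The log turned out to be about 70 MB, so that was no surprise.
Reading that much text into a single string and processing it is bound to be slow. I then came across the mmap module on python.org, which reminded me of a small module I once added when compiling Lustre that served a similar purpose: mapping a file into virtual memory. It greatly sped up reading the contents of the large file.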
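
The idea of searching a memory-mapped log can be sketched as follows. This is a minimal illustration rather than the plugin itself: it assumes Python 3 (the plugin targets Python 2.5.2), the function name `count_keyword` is hypothetical, and the keyword is passed as bytes. It searches the mapping in place with `find` instead of copying the file into a string first.

```python
import mmap
import os


def count_keyword(path, keyword):
    """Count non-overlapping occurrences of keyword (bytes) in the file at path."""
    if not keyword:
        raise ValueError("keyword must not be empty")
    with open(path, "rb") as handle:
        # mmap cannot map an empty file, so report zero matches directly.
        if os.fstat(handle.fileno()).st_size == 0:
            return 0
        # Length 0 maps the whole file; ACCESS_READ needs no write permission.
        mapped = mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ)
        try:
            count = 0
            position = mapped.find(keyword)
            while position != -1:
                count += 1
                # Resume after this match so matches never overlap.
                position = mapped.find(keyword, position + len(keyword))
            return count
        finally:
            mapped.close()
```

A call such as `count_keyword("/usr/local/nginx/logs/error.log", b"No such file or directory")` would return the number of matches; the path here is only an example taken from later in this article.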

Here is the script. It counts how many times "No such file or directory" appears in the nginx log.
nagios $> cat check_log

The code is as follows:

#!/usr/bin/python
# -*- coding: utf-8 -*-
# The shebang must be the first line of the file, so it precedes the coding line.
# QQ:277057817
import mmap
import os
import sys
import getopt

def usage():
    print """
    check_log is a Nagios plugin that counts a keyword in a log file.

    Usage:

    check_log [-h|--help] [-l|--log] [-s|--string] [-w|--warning] [-c|--critical]

    Options:
           --help|-h)
                 show this help.
           --log|-l)
                 path of the log file.
           --string|-s)
                 keyword to count.
           --warning|-w)
                 WARNING threshold (keyword count). Default: off
           --critical|-c)
                 CRITICAL threshold (keyword count). Default: off
     Example:
            ./check_log -l /var/log/nginx.log -s "502 Bad Gateway" -w 5 -c 10 """
    sys.exit(3)

try:
    # Long option names are listed without the leading "--".
    options, args = getopt.getopt(sys.argv[1:], "hl:s:w:c:",
                                  ["help", "log=", "string=", "warning=", "critical="])
except getopt.GetoptError:
    usage()

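# Defaults: no log file or keyword given, thresholds off.
log = string = warning = critical = None

for n, v in options:
    if n in ("-h", "--help"):
        usage()
    if n in ("-l", "--log"):
        log = v
    if n in ("-s", "--string"):
        string = v
    if n in ("-w", "--warning"):
        warning = v
    if n in ("-c", "--critical"):
        critical = v

if log is None or string is None:
    usage()

try:
    size = os.path.getsize(log)
    counts = 0
    if size > 0:    # an empty file cannot be mapped and contains no matches
        logfile = open(log, "r")
        # Use the mmap module: map the file read-only instead of calling file.read().
        data = mmap.mmap(logfile.fileno(), size, access=mmap.ACCESS_READ)
        text = data.read(size)
        counts = text.count(string)
        data.close()
        logfile.close()
except (IOError, OSError):
    # Without this exit, the threshold check below would fail on an undefined count.
    print "UNKNOWN - cannot read log file: " + log
    sys.exit(3)

if warning is not None and critical is not None:
    try:
        # Compare as integers; as strings, "10" sorts before "5".
        warning = int(warning)
        critical = int(critical)
    except ValueError:
        print "UNKNOWN - warning and critical must be integers"
        sys.exit(3)
    if warning >= critical:
        print "UNKNOWN - critical must be greater than warning"
        sys.exit(3)
    if counts >= critical:
        print 'CRITICAL - %s views %s' % (string, counts)
        sys.exit(2)    # Nagios exit status 2 = CRITICAL
    elif counts >= warning:
        print 'WARNING - %s views %s' % (string, counts)
        sys.exit(1)    # Nagios exit status 1 = WARNING
    else:
        print 'OK - %s views %s' % (string, counts)
        sys.exit(0)
else:
    # Thresholds are off: always report OK together with the count.
    print 'OK - %s views %s' % (string, counts)
    sys.exit(0)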
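
The threshold decision can also be isolated in a small function, which makes it easy to check against the standard Nagios plugin convention for exit statuses (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The sketch below assumes Python 3, and the name `evaluate` is hypothetical; it is an illustration, not part of the plugin. It also shows why thresholds taken from the command line need to be converted to integers before they are compared.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate(count, warning=None, critical=None):
    """Return the Nagios exit status for count, given optional thresholds."""
    if warning is None or critical is None:
        return OK  # thresholds are off, so only the count is reported
    # Command-line values arrive as strings; compare them as integers.
    warning, critical = int(warning), int(critical)
    if warning >= critical:
        return UNKNOWN  # misconfigured thresholds
    if count >= critical:
        return CRITICAL
    if count >= warning:
        return WARNING
    return OK


print("10" < "5")                # True: strings compare character by character
print(evaluate(10, "5", "20"))   # 1, i.e. WARNING
```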

Example invocation:

The code is as follows:
nagios $> python check_log -l /usr/local/nginx/logs/error.log -s "No such file or directory" -w 2 -c 3
CRITICAL - No such file or directory views 3

Add it to commands.cfg:
nagios $> vim /usr/local/nagios/etc/objects/commands.cfg

The code is as follows:
define command{
        command_name check_log
        command_line $USER1$/check_log -l $ARG1$ -s "$ARG2$" -w $ARG3$ -c $ARG4$
}
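
A keyword that contains spaces only reaches the plugin as a single `-s` value if it is quoted on the command line. The sketch below illustrates this with a deliberately simplified model: the hypothetical helper `expand` substitutes `$ARGn$` macros as plain text and then tokenizes the result with `shlex.split`, which approximates POSIX shell word splitting. How Nagios actually expands macros and launches the command is more involved, so treat this as an assumption-laden illustration (Python 3).

```python
import shlex


def expand(command_line, args):
    """Substitute $ARGn$ macros, then tokenize the result like a POSIX shell."""
    for index, value in enumerate(args, start=1):
        command_line = command_line.replace("$ARG%d$" % index, value)
    return shlex.split(command_line)


args = ["/var/log/nginx.log", "No such file or directory", "2", "5"]
unquoted = expand("check_log -l $ARG1$ -s $ARG2$ -w $ARG3$ -c $ARG4$", args)
quoted = expand('check_log -l $ARG1$ -s "$ARG2$" -w $ARG3$ -c $ARG4$', args)

print(unquoted[3:9])  # the keyword is split into five separate arguments
print(quoted[3:5])    # the keyword stays one argument
```

In the unquoted case the plugin would see only `No` as the keyword and the remaining words as stray arguments.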

Then add the following to nrpe.cfg on the client:
nginx $> vim /usr/local/nagios/etc/nrpe.cfg

The code is as follows:
command[check_log]=/usr/local/nagios/libexec/check_log -l /var/logs/nginx.log -s "No such file or directory" -w 2 -c 5

Then add the following to services.cfg on the server:
nagios $> vim /usr/local/nagios/etc/tuokou/services.cfg

The code is as follows:
define service{
    use                     local-service,srv-pnp
    host_name               web
    service_description     nginx log
    check_command         check_nrpe!check_log
}
(Editor: IT)