
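Analyzing nginx Logs with Shell Scripts: A Detailed Guide

Date: 2014-08-20 02:52  Source: linux.it.net.cn  Author: it
This article shows how to analyze nginx log files with shell scripts in two setups: nginx as the front-end load balancer, and nginx as a web server placed behind LVS. Each setup gets its own log analysis script.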


The shell scripts below cover two cases. In the first case, nginx is the front-end load balancer and the cluster architecture is Nginx+Keepalived.

1. Shell file log-nginx.sh

Code example:

#!/bin/bash
# www.it.net.cn

if [ $# -eq 0 ]; then
  echo "Error: please specify logfile."
  exit 1
else
  LOG=$1
fi
if [ ! -f "$LOG" ]; then
  echo "Sorry, sir, I can't find this nginx log file, pls try again!"
  exit 1
fi
################################
# Top 10 client IPs (field 1 of each log line)
echo "Most of the ip:"
echo "-------------------------------------------"
awk '{ print $1 }' "$LOG" | sort | uniq -c | sort -nr | head -10
echo
echo
###################
# Top 10 busiest minutes: characters 14-18 of field 4 are HH:MM
echo "Most of the time:"
echo "--------------------------------------------"
awk '{ print $4 }' "$LOG" | cut -c 14-18 | sort | uniq -c | sort -nr | head -10
echo
echo
#######################
# Top 10 values of field 11 (the referer in nginx's default combined format);
# the sed expression trims entries that end in .cn and is specific to the original site
echo "Most of the page:"
echo "--------------------------------------------"
awk '{ print $11 }' "$LOG" | sed 's/^.*\(.cn*\)"/\1/g' | sort | uniq -c | sort -rn | head -10
echo
echo
#####################
# For each of the 10 busiest minutes, list the top 10 IPs seen in that minute
echo "Most of the time / Most of the ip:"
echo "--------------------------------------------"
awk '{ print $4 }' "$LOG" | cut -c 14-18 | sort -n | uniq -c | sort -nr | head -10 > timelog
for i in $(awk '{ print $2 }' timelog)
do
  num=$(grep "$i" timelog | awk '{ print $1 }')
  echo "$i $num"
  # ":$i:" matches HH:MM inside the timestamp rather than MM:SS
  ip=$(grep ":$i:" "$LOG" | awk '{ print $1 }' | sort -n | uniq -c | sort -nr | head -10)
  echo "$ip"
  echo
done
rm -f timelog
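
To see why the script uses fields 1, 4, and 11 and why `cut -c 14-18` yields the minute, it may help to run the same commands on a single line. The sketch below assumes nginx's default combined log format; the sample line and the helper names `show_fields` and `minute_of` are made up for illustration.

```shell
#!/bin/bash
# Hypothetical sample line in nginx's default "combined" log format.
sample='117.34.91.54 - - [31/Mar/2011:15:31:02 +0800] "GET /index.html HTTP/1.1" 200 612 "http://www.example.cn" "Mozilla/5.0"'

# Print the three whitespace-separated fields the script relies on.
show_fields() {
  printf '%s\n' "$1" | awk '{ print "ip:      " $1; print "time:    " $4; print "referer: " $11 }'
}

# Field 4 looks like "[31/Mar/2011:15:31:02"; characters 14-18 are HH:MM.
minute_of() {
  printf '%s\n' "$1" | awk '{ print $4 }' | cut -c 14-18
}

show_fields "$sample"
minute_of "$sample"
```

If the log format has been customized with `log_format`, the field numbers will likely differ and need adjusting.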

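In the second case, nginx runs as a web server behind LVS, so requests coming from the LVS servers' own IP addresses must be excluded, for example the LVS servers' public IPs (such as 203.93.236.141 and 203.93.236.145).
The first script needs only a small adjustment:

Code example: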

#!/bin/bash
# www.it.net.cn

if [ $# -eq 0 ]; then
  echo "Error: please specify logfile."
  exit 1
fi
if [ ! -f "$1" ]; then
  echo "Sorry, sir, I can't find this nginx log file, pls try again!"
  exit 1
fi
# Drop requests from the LVS servers; the filtered copy is written to a file named LOG
grep -Ev '^203\.93\.236\.(141|145) ' "$1" > LOG
###################################################
# Top 10 client IPs (field 1 of each log line)
echo "Most of the ip:"
echo "-------------------------------------------"
awk '{ print $1 }' LOG | sort | uniq -c | sort -nr | head -10
echo
echo
####################################################
# Top 10 busiest minutes: characters 14-18 of field 4 are HH:MM
echo "Most of the time:"
echo "--------------------------------------------"
awk '{ print $4 }' LOG | cut -c 14-18 | sort | uniq -c | sort -nr | head -10
echo
echo
####################################################
# Top 10 values of field 11 (the referer in nginx's default combined format)
echo "Most of the page:"
echo "--------------------------------------------"
awk '{ print $11 }' LOG | sed 's/^.*\(.cn*\)"/\1/g' | sort | uniq -c | sort -rn | head -10
echo
echo
####################################################
# For each of the 10 busiest minutes, list the top 10 IPs seen in that minute
echo "Most of the time / Most of the ip:"
echo "--------------------------------------------"
awk '{ print $4 }' LOG | cut -c 14-18 | sort -n | uniq -c | sort -nr | head -10 > timelog
for i in $(awk '{ print $2 }' timelog)
do
  num=$(grep "$i" timelog | awk '{ print $1 }')
  echo "$i $num"
  # ":$i:" matches HH:MM inside the timestamp rather than MM:SS
  ip=$(grep ":$i:" LOG | awk '{ print $1 }' | sort -n | uniq -c | sort -nr | head -10)
  echo "$ip"
  echo
done
rm -f timelog
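
The LVS exclusion step can be tried on its own. The sketch below is one way to write the filter, assuming the client IP is the first field of each line; the function names `exclude_lvs` and `sample_log` and the sample lines are hypothetical. Anchoring the pattern at the start of the line and escaping the dots matters: an unanchored alternation such as `141|145` would also drop any line that merely contains 145 somewhere, for example as a byte count.

```shell
#!/bin/bash
# Keep only lines whose client IP (at the start of the line) is NOT an LVS address.
# The two addresses are the examples given in the text.
exclude_lvs() {
  grep -Ev '^203\.93\.236\.(141|145) '
}

# Hypothetical sample input: two LVS requests and two real clients.
# The third line has a body size of 145 bytes and must survive the filter.
sample_log() {
  cat <<'EOF'
203.93.236.141 - - [31/Mar/2011:15:31:02 +0800] "GET / HTTP/1.1" 200 612 "-" "-"
203.93.236.145 - - [31/Mar/2011:15:31:03 +0800] "GET / HTTP/1.1" 200 612 "-" "-"
117.34.91.54 - - [31/Mar/2011:15:31:04 +0800] "GET /a HTTP/1.1" 200 145 "-" "-"
119.97.226.226 - - [31/Mar/2011:15:32:10 +0800] "GET /b HTTP/1.1" 200 99 "-" "-"
EOF
}

# Print the client IPs that remain after filtering.
sample_log | exclude_lvs | awk '{ print $1 }'
```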

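This script can be used to analyze a file named www_tomcat_20110331.log:

[root@localhost 03]# sh counter_nginx.sh www_tomcat_20110331.log

The first two results are usually the ones of most interest: the IPs that access the site most often, and the minutes of the day with the most requests, as shown below: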
 

Most of the ip:
-------------------------------------------
   5440 117.34.91.54
      9 119.97.226.226
      4 210.164.156.66
      4 173.19.0.240
      4 109.230.251.35
      2 96.247.52.15
      2 85.91.140.124
      2 74.168.71.253
      2 71.98.41.114
      2 70.61.253.194
Most of the time:
--------------------------------------------
     12 15:31
     11 09:45
     10 23:55
     10 21:45
     10 21:37
     10 20:29
     10 19:54
     10 19:44
     10 19:32
     10 19:13

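If your log analysis needs are modest, awk and sed are enough to analyze Linux logs directly (Perl works too if you are comfortable with it). For detailed analysis you can use awstats, which is especially well suited to web servers and mail servers.
If you have special requirements for logs, you can also set up a dedicated log server to collect the logs from your Linux servers.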
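
As an illustration of analyzing a log with awk alone, the sketch below counts requests per minute and per IP in a single pass instead of calling grep once per minute. It assumes the default combined log format, where field 4 holds the timestamp; `minute_ip_counts` and the sample lines are hypothetical.

```shell
#!/bin/bash
# Count requests per (minute, IP) pair in one awk pass over stdin,
# then sort by count (descending) and by minute and IP.
# substr($4, 14, 5) extracts the same HH:MM text as "cut -c 14-18".
minute_ip_counts() {
  awk '{ count[substr($4, 14, 5) " " $1]++ }
       END { for (k in count) print count[k], k }' | sort -k1,1nr -k2,3
}

# Hypothetical sample input.
sample_log() {
  cat <<'EOF'
117.34.91.54 - - [31/Mar/2011:15:31:02 +0800] "GET / HTTP/1.1" 200 612 "-" "-"
117.34.91.54 - - [31/Mar/2011:15:31:40 +0800] "GET /a HTTP/1.1" 200 612 "-" "-"
119.97.226.226 - - [31/Mar/2011:15:31:41 +0800] "GET /b HTTP/1.1" 200 612 "-" "-"
117.34.91.54 - - [31/Mar/2011:15:32:05 +0800] "GET /c HTTP/1.1" 200 612 "-" "-"
EOF
}

# Each output line has the form: count HH:MM ip
sample_log | minute_ip_counts
```

On a real log, the function would be fed with something like `minute_ip_counts < access.log | head -10`, where the file name is a placeholder.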

(Editor: IT)