Analyzing Apache logs to get search-engine spider crawl records

[Code] Remember to change the log path to match your own setup. There are plenty of bugs, so fix them as you see fit.

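#!/usr/bin/env bash
# Usage: ./spider_log hosting [log-suffix]

export LANG=en_US.UTF-8

# With a second argument, read the rotated log access.log.<suffix>
if [ -n "$2" ]; then
    logpath=~/logs/$1/http/access.log.$2
else
    logpath=~/logs/$1/http/access.log
fi

if [ -n "$1" ]; then
    # Count requests per crawler. The filter drops any line containing
    # jpg, gif, png, js or css anywhere, not only in the request path.
    for i in baidu Sogou Googlebot yahoo bingbot YandexBot YoudaoBot; do
        spider=$(grep -E -v 'jpg|gif|png|js|css' "$logpath" | grep -c "$i")
        echo "$i Spider:$spider"
    done

    # Top 10 client IPs, each with fields 12-19 (the user agent in combined log format)
    topip=$(grep -E -v 'jpg|gif|png|js|css' "$logpath" | awk '$1 {print $1,$12,$13,$14,$15,$16,$17,$18,$19}' | sort | uniq -c | sort -rn | head -n 10 | awk '{printf "\n%-8s %-15s %s %s %s %s %s %s %s",$1,$2,$3,$4,$5,$6,$7,$8,$9}')
    echo "TOP10 IP:$topip"
else
    echo "Usage: ./spider_log hosting(20hotel.com) [$(date +%F)]."
fi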
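
The script reads the log once per crawler name. As a rough sketch of an alternative, the same counts could be gathered in a single awk pass. The function name `count_spiders` and the three sample log lines are made up for illustration and are not part of the original script. The static-asset filter is kept the same as above.

```shell
#!/usr/bin/env bash
# Sketch only: count crawler hits in one pass over the log.
# count_spiders is a hypothetical helper, not part of the original script.

count_spiders() {
    # $1: path to an Apache access log
    awk '
        BEGIN { n = split("baidu Sogou Googlebot yahoo bingbot YandexBot YoudaoBot", bots, " ") }
        /jpg|gif|png|js|css/ { next }   # same static-asset filter as the script
        { for (k = 1; k <= n; k++) if (index($0, bots[k])) hits[bots[k]]++ }
        END { for (k = 1; k <= n; k++) printf "%s Spider:%d\n", bots[k], hits[bots[k]] }
    ' "$1"
}

# Invented sample data in combined log format, for demonstration only
sample=$(mktemp)
cat > "$sample" <<'EOF'
1.2.3.4 - - [10/Oct/2013:13:55:36 +0800] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
1.2.3.5 - - [10/Oct/2013:13:55:37 +0800] "GET /logo.png HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
5.6.7.8 - - [10/Oct/2013:13:55:38 +0800] "GET /about.html HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
EOF
count_spiders "$sample"
rm -f "$sample"
```

On the sample data the request for `/logo.png` is filtered out, so Googlebot and bingbot are each counted once and every other crawler is reported as 0. Crawler names are matched as plain substrings with `index`, which behaves the same as the `grep` calls above because none of the names contain regex metacharacters.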

(Editor in charge: IT)
