Monitoring Baiduspider Crawl Records with Apache Logs on Linux

Date: 2015-01-10 13:01  Source: linux.it.net.cn  Author: IT

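1. Number of Baiduspider crawl requests
cat access.log | grep Baiduspider+ | wc
The leftmost number in the output (the line count) is the number of crawl requests. Note that grep treats the + literally here, so the pattern matches the user agent string Baiduspider+(+http://www.baidu.com/search/spider.htm). To also match the Baiduspider/2.0 user agent, search for Baiduspider without the +.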
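
As a rough illustration of the difference between the two patterns, the sketch below builds a small, made-up access log (the IP addresses and requests are invented, and the Apache combined log format is assumed) and counts matches with grep -c, which prints only the number of matching lines.

```shell
# Build a tiny, made-up access log so the commands can be tried safely.
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.10 - - [10/Jan/2015:13:01:02 +0800] "GET / HTTP/1.1" 200 1234 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
192.0.2.11 - - [10/Jan/2015:13:05:10 +0800] "GET /a.html HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
198.51.100.7 - - [10/Jan/2015:14:00:00 +0800] "GET / HTTP/1.1" 200 1234 "-" "Mozilla/5.0 (X11; Linux x86_64)"
EOF

# -c prints the number of matching lines; -F treats the pattern as a
# fixed string, so the "+" is guaranteed to be matched literally.
old_style=$(grep -cF 'Baiduspider+' "$log")

# Matching the bare name also counts the "Baiduspider/2.0" form.
any_style=$(grep -cF 'Baiduspider' "$log")

echo "old-style UA: $old_style, any Baiduspider UA: $any_style"
rm -f "$log"
```

Which pattern is appropriate depends on the user agent strings that actually appear in your own log.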

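2. Detailed Baiduspider records (press Ctrl+C to stop the output)
cat access.log | grep Baiduspider+
You can also use the following commands:
cat access.log | grep Baiduspider+ | tail -n 10
cat access.log | grep Baiduspider+ | head -n 10
These show only the last 10 or the first 10 records. The first records also tell you the date and time at which the log file begins.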
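
If only the time span is of interest, the first and last timestamps can be pulled out directly. This is a minimal sketch on invented sample lines, assuming the combined log format, where awk fields 4 and 5 hold the bracketed timestamp and time zone.

```shell
# Made-up sample log; replace "$log" with your real access.log path.
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.10 - - [10/Jan/2015:13:01:02 +0800] "GET / HTTP/1.1" 200 1234 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
192.0.2.10 - - [10/Jan/2015:13:40:30 +0800] "GET /a.html HTTP/1.1" 200 512 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
192.0.2.10 - - [10/Jan/2015:14:20:15 +0800] "GET /b.html HTTP/1.1" 200 640 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
EOF

# Timestamp of the earliest and the latest Baiduspider request.
first=$(grep -F 'Baiduspider+' "$log" | head -n 1 | awk '{print $4, $5}')
last=$(grep -F 'Baiduspider+' "$log" | tail -n 1 | awk '{print $4, $5}')

echo "first: $first"
echo "last:  $last"
rm -f "$log"
```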

3. Detailed records of Baiduspider fetching the home page
cat access.log | grep Baiduspider+ | grep "GET / HTTP"

4. Time distribution of Baiduspider crawl records
cat access.log | grep "Baiduspider+" | awk '{print $4}'
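
The raw timestamps can be hard to read as a distribution. One possible refinement, sketched here on invented sample lines and assuming the combined log format, is to group the requests by hour. The hour is taken as the second colon-separated part of field 4.

```shell
# Made-up sample log; replace "$log" with your real access.log path.
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.10 - - [10/Jan/2015:13:01:02 +0800] "GET / HTTP/1.1" 200 1234 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
192.0.2.10 - - [10/Jan/2015:13:40:30 +0800] "GET /a.html HTTP/1.1" 200 512 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
192.0.2.10 - - [10/Jan/2015:14:20:15 +0800] "GET /b.html HTTP/1.1" 200 640 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
198.51.100.7 - - [10/Jan/2015:13:59:59 +0800] "GET / HTTP/1.1" 200 1234 "-" "Mozilla/5.0 (X11; Linux x86_64)"
EOF

# Field 4 looks like "[10/Jan/2015:13:01:02"; splitting on ":" puts the
# hour in t[2]. Count requests per hour and print "hour count" pairs.
hist=$(grep -F 'Baiduspider+' "$log" \
  | awk '{split($4, t, ":"); n[t[2]]++} END {for (h in n) print h, n[h]}' \
  | sort)

echo "$hist"
rm -f "$log"
```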

5. Pages crawled by Baiduspider, sorted by hit count in descending order
cat access.log | grep "Baiduspider+" | awk '{print $7}' | sort | uniq -c | sort -rn
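
On a large site this list can be very long. The sketch below, run against invented sample lines in the combined log format, ranks the pages with a numeric reverse sort (sort -rn) so that counts are compared as numbers rather than as text, and then picks the top entry with head.

```shell
# Made-up sample log; replace "$log" with your real access.log path.
log=$(mktemp)
ua='"Baiduspider+(+http://www.baidu.com/search/spider.htm)"'
{
  echo "192.0.2.10 - - [10/Jan/2015:13:01:02 +0800] \"GET /a.html HTTP/1.1\" 200 512 \"-\" $ua"
  echo "192.0.2.10 - - [10/Jan/2015:13:02:02 +0800] \"GET / HTTP/1.1\" 200 1234 \"-\" $ua"
  echo "192.0.2.10 - - [10/Jan/2015:13:03:02 +0800] \"GET /a.html HTTP/1.1\" 200 512 \"-\" $ua"
  echo "192.0.2.10 - - [10/Jan/2015:13:04:02 +0800] \"GET /b.html HTTP/1.1\" 200 640 \"-\" $ua"
  echo "192.0.2.10 - - [10/Jan/2015:13:05:02 +0800] \"GET / HTTP/1.1\" 200 1234 \"-\" $ua"
  echo "192.0.2.10 - - [10/Jan/2015:13:06:02 +0800] \"GET /a.html HTTP/1.1\" 200 512 \"-\" $ua"
} > "$log"

# Field 7 is the requested path. Count each path, then sort the counts
# numerically in descending order.
ranked=$(grep -F 'Baiduspider+' "$log" | awk '{print $7}' | sort | uniq -c | sort -rn)

# Most frequently crawled page and its hit count.
top_page=$(printf '%s\n' "$ranked" | head -n 1 | awk '{print $2}')
top_count=$(printf '%s\n' "$ranked" | head -n 1 | awk '{print $1}')

echo "$ranked"
echo "top page: $top_page ($top_count hits)"
rm -f "$log"
```

Appending head -n 10 to the original pipeline in the same way would limit the output to the ten most crawled pages.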

(Editor: IT)
