
Comparing the speed of awk and sed at reading a file and extracting content

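Date: 2015-01-10 20:18  Source: linux.it.net.cn  Author: IT
This follows an article about a simple script for reading the IP addresses of a site's visitors. Extracting IPs from a log file is actually simpler with awk and slightly harder with sed, so here we compare which of the two gets through the file faster. The log format looks like this: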
[root@279155 wwwlogs]# tail -f www.itnetcn.com.log
 
 

 
203.95.5.81 - - [27/Mar/2012:13:42:23 +0800] GET /favicon.ico HTTP/1.1 200 0 - Mozilla/5.0 (Windows NT 5.1) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.218 Safari/535.1 -
203.95.5.81 - - [27/Mar/2012:13:42:23 +0800] GET /wp-content/themes/hotnewspro27/ HTTP/1.1 200 167 - Mozilla/5.0 (Windows NT 5.1) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.218 Safari/535.1 -
203.95.5.81 - - [27/Mar/2012:13:42:23 +0800] GET /favicon.ico HTTP/1.1 200 0 - Mozilla/5.0 (Windows NT 5.1) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.218 Safari/535.1 -
173.212.220.224 - - [27/Mar/2012:13:43:22 +0800] POST /wp-cron.php?doing_wp_cron=1332827002 HTTP/1.0 200 0 - WordPress/3.3.1; http://www.itnetcn.com -
203.95.5.81 - - [27/Mar/2012:13:43:23 +0800] POST /wp-admin/admin-ajax.php HTTP/1.1 200 253 http://www.pyshell.com/wp-admin/post-new.php Mozilla/5.0 (Windows NT 5.1) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.218 Safari/535.1 -
66.249.68.206 - - [27/Mar/2012:13:43:23 +0800] GET /sitemap.xml.gz HTTP/1.1 200 1279 - Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) -
 
 With awk the job is simple. Here is how long it takes:
 
 

 
[root@it wwwlogs]# time awk '{print $1}' www.itnetcn.com.log | sort | uniq -c
    128 101.84.69.207
    348 66.249.66.26
     38 66.249.67.201
    137 66.249.68.206
      6 66.249.71.166
    198 66.249.71.216
      1 66.249.72.219
    778 66.249.72.47
     ...
      1 75.101.233.99
     18 77.222.128.221
      4 78.46.77.21
    120 80.243.181.34
     97 81.169.181.179
      1 89.189.191.11
      1 91.83.62.212
 
real    0m0.170s
user    0m0.140s
sys     0m0.016s
 
[root@it wwwlogs]#
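
To make the awk command concrete: awk splits each line on runs of whitespace by default, so in the log format shown above `$1` is the client IP. The following is a minimal sketch on three lines abbreviated from the sample log; the function name `extract_ip` is only for illustration.

```shell
# extract_ip: print the first whitespace-separated field of each input line.
extract_ip() {
    awk '{print $1}'
}

# Sample lines abbreviated from the log excerpt in this article.
printf '%s\n' \
    '203.95.5.81 - - [27/Mar/2012:13:42:23 +0800] GET /favicon.ico HTTP/1.1 200 0 -' \
    '66.249.68.206 - - [27/Mar/2012:13:43:23 +0800] GET /sitemap.xml.gz HTTP/1.1 200 1279 -' \
    '203.95.5.81 - - [27/Mar/2012:13:43:23 +0800] POST /wp-admin/admin-ajax.php HTTP/1.1 200 253 -' |
    extract_ip | sort | uniq -c
```

On these three lines the pipeline reports 203.95.5.81 twice and 66.249.68.206 once, which is the same counting the full command performs on the real log.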
   Now the run time with sed:
 
 

 
[root@it wwwlogs]#  time sed -r "s/([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+).*/\n\1/;s/^.*\n//" www.itnetcn.com.log | sort | uniq -c
     10 101.226.33.201
     10 101.226.33.222
     10 101.226.33.239
      7 101.226.66.179
    128 101.84.69.207
      1 108.171.241.230
     ...
    120 80.243.181.34
     97 81.169.181.179
      1 89.189.191.11
      1 91.83.62.212
 
real    0m2.683s
user    0m2.640s
sys     0m0.025s
[root@279155 wwwlogs]#
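
The sed expression is worth unpacking. The first substitution replaces the first IP-like match and everything after it with a newline followed by the captured IP; the second deletes everything up to that newline, which discards any text that preceded the IP. Writing `\n` in the replacement relies on GNU sed, so the sketch below assumes GNU sed. Since every line in the log shown here starts with the IP, a single anchored substitution should give the same result. The function names are illustrative, and whether the anchored form changes the timing would have to be measured.

```shell
# The article's two-step expression (GNU sed; -r enables extended regex).
ip_sed_original() {
    sed -r "s/([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+).*/\n\1/;s/^.*\n//"
}

# Assumption: the IP is always at the start of the line, as in the sample log.
# One anchored substitution is then enough.
ip_sed_anchored() {
    sed -r 's/^([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+).*/\1/'
}

line='173.212.220.224 - - [27/Mar/2012:13:43:22 +0800] POST /wp-cron.php?doing_wp_cron=1332827002 HTTP/1.0 200 0 - WordPress/3.3.1; http://www.itnetcn.com -'

printf '%s\n' "$line" | ip_sed_original   # prints 173.212.220.224
printf '%s\n' "$line" | ip_sed_anchored   # prints 173.212.220.224
```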
  
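    Comparing the two, awk comes out faster: 0.170s of real time against 2.683s for sed, roughly a 16x difference on this log.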
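
Assuming the commands were run in bash, its `time` keyword times the whole pipeline, so both measurements also include `sort` and `uniq -c`, not just the extraction step. A rough sketch for timing the extraction alone follows. It builds a synthetic log by repeating one shortened sample line; the line count, the temporary file, and the helper names are assumptions for illustration, and it relies on GNU `date` for `%N`. Absolute numbers will differ from the article's.

```shell
# Build a synthetic log: one shortened sample line repeated 20000 times.
log=$(mktemp)
line='203.95.5.81 - - [27/Mar/2012:13:42:23 +0800] GET /favicon.ico HTTP/1.1 200 0 - Mozilla/5.0 (Windows NT 5.1) -'
yes "$line" | head -n 20000 > "$log"

# The two extraction commands from the article, taking the log file as $1.
extract_awk() { awk '{print $1}' "$1"; }
extract_sed() { sed -r "s/([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+).*/\n\1/;s/^.*\n//" "$1"; }

# elapsed CMD FILE: run CMD on FILE, discard its output, print wall-clock seconds.
elapsed() {
    start=$(date +%s.%N)
    "$1" "$2" > /dev/null
    end=$(date +%s.%N)
    awk -v s="$start" -v e="$end" -v cmd="$1" \
        'BEGIN { printf "%s: %.3f s\n", cmd, e - s }'
}

elapsed extract_awk "$log"
elapsed extract_sed "$log"

# Sanity check: both commands should yield the same per-IP counts.
[ "$(extract_awk "$log" | sort | uniq -c)" = "$(extract_sed "$log" | sort | uniq -c)" ] &&
    echo "outputs match"

rm -f "$log"
```

Discarding the output with `> /dev/null` keeps the cost of `sort` and `uniq -c` out of the measurement, so any gap that remains comes from awk and sed themselves.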

(Editor: IT)