awk log statistics: summarizing log file data with awk

Date: 2015-05-02 01:48 Source: linux.it.net.cn Author: IT

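Example code for log statistics with awk. The awk command is used to summarize a log file in a given format in two ways: the total of each resource type obtained by each user, and the amount of each resource type obtained within each hour. This is a tutorial by example on the awk command.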

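As shown in the figure below:

The statistics are computed over log files in the format above.
Fields are separated by commas. Column 1 is the time, column 4 is the event, column 6 is the type of resource obtained, and column 7 is the amount of the resource obtained. The scripts below also read the user ID from column 3.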
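The column layout can be illustrated with a minimal sketch on a single made-up log line. Everything in this sample is an assumption: the server name, the user ID 1001, and the `resType=` and `resNum=` prefixes were chosen only so that they match the offsets the scripts in this article use (`substr($6,9)` and `substr($7,8)`). The real log may look different.

```shell
# Hypothetical log line; the real format is not reproduced in the text.
line='2015-05-02 01:48:10,srv1,1001,GetResource,ok,resType=gold,resNum=50'

# Split on commas and print the columns the statistics rely on.
# substr($6, 9) skips the assumed 8-character prefix "resType=",
# substr($7, 8) skips the assumed 7-character prefix "resNum=".
fields=$(printf '%s\n' "$line" | awk -F, '{
    printf "time=%s uid=%s event=%s type=%s count=%d\n",
           $1, $3, $4, substr($6, 9), substr($7, 8)
}')
echo "$fields"
```

If the prefixes in a real log have a different length, the two `substr` offsets would need to change accordingly.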

1. Total of each resource type obtained by each user

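	#!/bin/bash
	# Sum, per user and per resource type, the amount obtained.
	LOG_PATH=~/tmplog
	GET_RES="GetResource"

	# Distinct user IDs (column 3) among the GetResource events.
	UID_LIST=`awk -F, '$4=="'"$GET_RES"'" {print $3}' $LOG_PATH | sort | uniq`
	for id in $UID_LIST
	do
	   # Distinct resource types (column 6) obtained by this user.
	   RES_LIST=`awk -F, '$3=="'"$id"'" && $4=="'"$GET_RES"'" {print $6}' $LOG_PATH | sort | uniq`
	#   echo $RES_LIST
	   for res in $RES_LIST
	   do
	      # Extract the numeric part of column 7 for this user and resource type.
	      awk -F, '$3=="'"$id"'" && $4=="'"$GET_RES"'" && $6=="'"$res"'" {print substr($7,8)}' $LOG_PATH > ~/tmp
	      # Add the numbers up and print: user ID, resource name, total.
	      awk 'BEGIN{total=0} {total+=$1} END {printf "%s,%s,%d\n","'"$id"'",substr("'"$res"'",9),total}' ~/tmp
	   done
	done
	# Remove the temporary file written above.
	rm -f ~/tmp
	exit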

Improved approach, which reads the log once and accumulates the totals in an awk array:

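	#!/bin/bash
	# Sum the amount obtained per user (column 3) and resource type (column 6).

	# Keep only the GetResource events.
	awk -F, '$4=="GetResource" {print}' ~/tmplog > tmp

	awk '
	BEGIN{
	   FS=","
	}
	{
	   # Key: user ID plus resource type (column 6 from character 9 onward).
	   # Value: running total of the number in column 7 (from character 8 onward).
	   arr[$3,substr($6,9)]+=substr($7,8)
	}
	END{
	   for (i in arr){
	      # Split the combined key back into its two parts.
	      split(i,idx,SUBSEP)
	      print idx[1],idx[2],arr[idx[1],idx[2]]
	   }
	}
	' tmp > output1

	# Sort the result file in place.
	sort -o output1 output1
	rm -f tmp
	exit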
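The same array technique can also be run in a single pass, filtering on the event inside the awk program so that no intermediate file is needed. The following is a minimal sketch under stated assumptions: the sample log lines, including the `resType=` and `resNum=` prefixes and the user IDs, are invented for illustration and are not taken from the real log.

```shell
# Build a small hypothetical log in a temporary file.
log=$(mktemp)
cat > "$log" <<'EOF'
2015-05-02 01:10:00,srv1,1001,GetResource,ok,resType=gold,resNum=50
2015-05-02 01:20:00,srv1,1001,GetResource,ok,resType=gold,resNum=30
2015-05-02 01:30:00,srv1,1002,GetResource,ok,resType=wood,resNum=5
2015-05-02 02:05:00,srv1,1001,Login,ok,-,-
2015-05-02 02:15:00,srv1,1002,GetResource,ok,resType=gold,resNum=7
EOF

# Filter and aggregate in one awk program, then sort the result.
per_user=$(awk -F, '
$4 == "GetResource" {
    # Key: user ID and resource name; value: running total.
    sum[$3, substr($6, 9)] += substr($7, 8)
}
END {
    for (key in sum) {
        split(key, part, SUBSEP)
        print part[1], part[2], sum[key]
    }
}' "$log" | LC_ALL=C sort)

echo "$per_user"
rm -f "$log"
```

On this sample the Login line is skipped, and user 1001 ends up with a single gold total of 80. Using `sum[key]` directly avoids rebuilding the index from its parts.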

2. Amount of each resource type obtained within each hour

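	#!/bin/bash
	# Sum the amount obtained per hour and resource type (column 6).

	# Keep only the GetResource events.
	awk -F, '$4=="GetResource" {print}' ~/tmplog > tmp

	awk '
	BEGIN{
	   FS=","
	}
	{
	   # Key: the first 13 characters of column 1 (taken as date and hour)
	   # plus the resource type; value: running total of the number in column 7.
	   arr[substr($1,1,13),substr($6,9)]+=substr($7,8)
	}
	END{
	   for (i in arr){
	      # Split the combined key back into its two parts.
	      split(i,idx,SUBSEP)
	      print idx[1],idx[2],arr[idx[1],idx[2]]
	   }
	}
	' tmp > output2

	rm -f tmp
	exit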
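The hourly script writes its rows in whatever order awk happens to walk the array, because the iteration order of `for (i in arr)` is unspecified. A minimal sketch of the hourly grouping with sorted, comma-separated output follows. The sample lines are hypothetical, and the timestamp layout `YYYY-MM-DD HH:MM:SS` is an assumption chosen so that the first 13 characters give the date and hour.

```shell
# Build a small hypothetical log in a temporary file.
log=$(mktemp)
cat > "$log" <<'EOF'
2015-05-02 01:10:00,srv1,1001,GetResource,ok,resType=gold,resNum=50
2015-05-02 01:20:00,srv1,1001,GetResource,ok,resType=gold,resNum=30
2015-05-02 01:30:00,srv1,1002,GetResource,ok,resType=wood,resNum=5
2015-05-02 02:05:00,srv1,1001,Login,ok,-,-
2015-05-02 02:15:00,srv1,1002,GetResource,ok,resType=gold,resNum=7
EOF

# Group by hour and resource name; OFS makes the output comma-separated.
per_hour=$(awk -F, -v OFS=, '
$4 == "GetResource" {
    # substr($1, 1, 13) keeps "YYYY-MM-DD HH" under the assumed layout.
    sum[substr($1, 1, 13), substr($6, 9)] += substr($7, 8)
}
END {
    for (key in sum) {
        split(key, part, SUBSEP)
        print part[1], part[2], sum[key]
    }
}' "$log" | LC_ALL=C sort)

echo "$per_hour"
rm -f "$log"
```

Sorting as plain text works here because the assumed timestamp prefix orders chronologically when compared character by character.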

(Editor: IT)