Creating a requests-per-time graph from an nginx or Apache access log
Writing the script that calculates the values was the simple part; creating the graph was a little trickier. But let's start from the beginning…
The log file this script analyzes looks like this:
xx.xxx.xxx.xxx - - [20/Oct/2012:06:25:22 +0300] "GET ... HTTP/1.1" 200 80638 "..." "Mozilla/5.0 (Windows NT 5.1; rv:16.0) Gecko/20100101 Firefox/16.0"
xx.xx.xxx.xxx - - [20/Oct/2012:06:25:24 +0300] "GET ... HTTP/1.1" 200 80638 "..." "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C)"
xxx.xx.x.xx - - [20/Oct/2012:06:25:25 +0300] "GET ... HTTP/1.1" 200 81302 "..." "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C; .NET4.0E; InfoPath.1)"
xx.xx.xx.xx - - [20/Oct/2012:06:25:25 +0300] "GET ... HTTP/1.1" 200 102001 "..." "Mozilla/5.0 (Windows NT 6.1; rv:16.0) Gecko/20100101 Firefox/16.0"
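To count requests per time frame, we first need the time intervals. We take the first timestamp in the log file, convert it to seconds since 1970-01-01 00:00:00 UTC, and add 300 s (5 minutes) to get the end of the first interval. The script reads only the fourth field, so the +0300 offset is dropped and date interprets the timestamp in the machine's local time zone.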
FDATE=$(head -1 "$FILE" | awk '{print $4}' | sed -e 's/\[//' | sed -e 's/\//-/g' | sed -e 's/:/ /')
FDATE_S=$(date -d "$FDATE" '+%s')
FDATE_T=$((FDATE_S + 300))
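The conversion step can also be tried on its own. The following is a minimal sketch, assuming GNU date (which the original script already relies on); the helper name log_ts_to_epoch is hypothetical and does the same rewriting as the sed chain, only with bash parameter expansion. The time zone is pinned to UTC purely to make the demo reproducible.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original script.
# Converts a log timestamp field such as "[20/Oct/2012:06:25:22" into
# seconds since the epoch. Assumes GNU date.
log_ts_to_epoch() {
    local ts=${1#\[}     # drop the leading "["
    ts=${ts//\//-}       # 20/Oct/2012 -> 20-Oct-2012
    ts=${ts/:/ }         # first ":" becomes a space between date and time
    date -d "$ts" '+%s'
}

# Pin the time zone so the demo gives the same result on every machine.
export TZ=UTC

start=$(log_ts_to_epoch '[20/Oct/2012:06:25:22')
end=$((start + 300))     # end of the first 5-minute interval

echo "interval: $(date -d "@$start" '+%H:%M:%S') to $(date -d "@$end" '+%H:%M:%S')"
```

Parameter expansion avoids starting awk and sed for every timestamp, which may matter on large logs, but the result should be the same as with the pipeline above.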
Next we read the log line by line and check whether each line falls within the current interval. If it does, we add 1 to the counter. If not, we output the count for this interval and move on to the next one.
while read -r line
do
  LDATE=$(echo "$line" | awk '{print $4}' | sed -e 's/\[//' | sed -e 's/\//-/g' | sed -e 's/:/ /')
  LDATE_S=$(date -d "$LDATE" '+%s')
  if (( LDATE_S < FDATE_T )); then
    COUNT=$((COUNT + 1))
  else
    echo "$(date -d @"$((FDATE_T - 300))" '+%Y-%m-%d %H:%M:%S') $COUNT" >>"$DATAFILE"
    FDATE_T=$((FDATE_T + 300))
    COUNT=1
  fi
done <"$FILE"
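Two edge cases are worth keeping in mind with this kind of loop: a quiet period longer than 5 minutes, and the final interval, which has no following line to trigger its output. The sketch below is one possible way to handle both, not the original author's code. To stay self-contained it assumes the timestamps have already been converted to epoch seconds, one per line and in ascending order; the function name bucket_counts is hypothetical.

```shell
#!/bin/bash
# Hypothetical sketch: bucket epoch timestamps (one per line on stdin,
# sorted ascending) into fixed-width windows and print "window_start count".
bucket_counts() {
    local width=$1 ts window_end="" count=0
    while read -r ts; do
        # The first timestamp opens the first window.
        if [ -z "$window_end" ]; then
            window_end=$((ts + width))
        fi
        # Close every window that ends at or before this timestamp.
        # Windows without any request are written with a count of 0.
        while (( ts >= window_end )); do
            echo "$((window_end - width)) $count"
            window_end=$((window_end + width))
            count=0
        done
        count=$((count + 1))
    done
    # Write the last, possibly partial, window.
    if [ -n "$window_end" ]; then
        echo "$((window_end - width)) $count"
    fi
}

# Demo: three requests in the first window, one in the second,
# an empty third window, and one request in the fourth.
printf '%s\n' 0 10 299 300 1000 | bucket_counts 300
```

The inner while loop is what steps over empty intervals, so a gap in the log shows up as zero-height bars in the graph rather than shifting later counts into the wrong interval.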
When the script finishes reading the log, we have a data file that looks like this:
2012-10-19 10:50:27 14693
2012-10-19 10:55:27 12019
2012-10-19 11:00:27 11409
2012-10-19 11:05:27 12984
2012-10-19 11:10:27 12087
2012-10-19 11:15:27 11161
Now we need to create a histogram from this data.
The hard part is working out how to write the plot command for gnuplot.
plot "$DATAFILE" using 1:3
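The data file has three whitespace-separated columns: date, time, and request count. Yet we tell gnuplot to use the first and the third. Because the time format set with timefmt ("%Y-%m-%d %H:%M:%S") contains a space, gnuplot reads the timestamp from columns 1 and 2 together, so the count is in column 3 (this was hard to find and understand).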
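To see the column handling in isolation, here is a minimal sketch that plots three sample rows. It uses only a subset of the settings from the script; the output name sample.png and the variable names are assumptions made for this example, and gnuplot is run only if it is installed.

```shell
#!/bin/bash
# Minimal sketch: build a three-line sample data file and plot it.
data=$(mktemp)
cat > "$data" <<'EOF'
2012-10-19 10:50:27 14693
2012-10-19 10:55:27 12019
2012-10-19 11:00:27 11409
EOF

# Hypothetical output location for this demo.
out="${TMPDIR:-/tmp}/sample.png"

# The gnuplot commands, kept in a variable so they can be inspected.
# timefmt contains a space, so the timestamp spans columns 1 and 2
# and the request count is column 3.
plot_cmds=$(cat <<EOF
set xdata time
set timefmt "%Y-%m-%d %H:%M:%S"
set format x "%H:%M"
set term png
set output "$out"
plot "$data" using 1:3 with boxes notitle
EOF
)

# Run gnuplot only if it is available on this machine.
if command -v gnuplot >/dev/null 2>&1; then
    echo "$plot_cmds" | gnuplot
fi
rm -f "$data"
```

If the data file had only one timestamp column (for example epoch seconds with timefmt "%s"), the count would be in column 2 and the plot command would use 1:2 instead.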
Complete script:
#!/bin/bash
# $1 is the path to the access log
FILE=$1

# Timestamp of the first request, converted to seconds since the epoch
FDATE=$(head -1 "$FILE" | awk '{print $4}' | sed -e 's/\[//' | sed -e 's/\//-/g' | sed -e 's/:/ /')
FDATE_S=$(date -d "$FDATE" '+%s')
# End of the first 5-minute (300 s) interval
FDATE_T=$((FDATE_S + 300))
COUNT=0
DATAFILE=$(mktemp)
RESULTFILE="result-$(date -d "$FDATE" '+%Y-%m-%d').png"

while read -r line
do
  LDATE=$(echo "$line" | awk '{print $4}' | sed -e 's/\[//' | sed -e 's/\//-/g' | sed -e 's/:/ /')
  LDATE_S=$(date -d "$LDATE" '+%s')
  if (( LDATE_S < FDATE_T )); then
    COUNT=$((COUNT + 1))
  else
    # Interval is over: write its start time and count, then move to the next one
    echo "$(date -d @"$((FDATE_T - 300))" '+%Y-%m-%d %H:%M:%S') $COUNT" >>"$DATAFILE"
    FDATE_T=$((FDATE_T + 300))
    COUNT=1
  fi
done <"$FILE"
# Write the last interval, which the loop itself never outputs
echo "$(date -d @"$((FDATE_T - 300))" '+%Y-%m-%d %H:%M:%S') $COUNT" >>"$DATAFILE"

gnuplot << EOF
reset
set xdata time
set timefmt "%Y-%m-%d %H:%M:%S"
set format x "%H:%M"
set autoscale
set ytics
set grid y
set auto y
set term png truecolor
set output "$RESULTFILE"
set xlabel "Time"
set ylabel "Request per 5min"
set grid
set boxwidth 0.95 relative
set style fill transparent solid 0.5 noborder
plot "$DATAFILE" using 1:3 w boxes lc rgb "green" notitle
EOF

rm -f "$DATAFILE"
And the graph it creates: