Importing an Apache log file with performance information into OpenOffice / Excel using a Python script

You can extend Apache log files to include performance information, such as bytes read and sent and HTTP request processing time. Compared to data generated by a load testing tool (such as JMeter), this information is usually easier to acquire and reflects the actual production server usage patterns. I believe there are also tools which replay the log data to rerun the load in simulation mode. This, however, works only on systems whose state can be reset to the original situation (for example, with a transactional database like ZODB).

Below is an example Python script which reads this custom Apache log file and creates an Excel export from it. After this, you can open the file in OpenOffice Calc or Microsoft Excel and draw diagrams from it to see where the performance bottlenecks are.

The script splits the duration column for the output file, keeping only the microseconds value. You can easily add your own splits or use Calc’s Text to Columns editor.
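As a rough idea of what adding your own split could look like, here is a minimal sketch for the io field. The helper name get_io() is hypothetical and not part of the original script. It assumes the field looks like 'io:643/4507' and that the first number is bytes read and the second is bytes sent, which may not match your log format.

```python
def get_io(part):
    """Split a field such as 'io:643/4507' into (bytes_read, bytes_sent).

    Assumption: the first number is bytes read and the second is bytes sent.
    """
    keyword, args = part.split(":", 1)
    bytes_read, bytes_sent = args.split("/")
    return int(bytes_read), int(bytes_sent)


# Hypothetical field value, in the same shape as the script's example comment
bytes_read, bytes_sent = get_io("io:643/4507")
```

The two numbers could then be written to separate spreadsheet columns in the same way the script handles the duration.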

The script:


import shlex
from pyExcelerator.Workbook import Workbook

FILE = "/var/log/apache2/yourlogfile.log"

def get_parts(line):
	""" Extract the parts of a line"""

	parts = shlex.split(line) # Split on whitespace, keeping parts inside quotation marks together
	# The result ends with e.g.: 'duration:0/42327', 'io:643/4507', 'balancer:']

	return parts

def get_duration(duration):
	""" Extract the microseconds value from a 'duration:seconds/microseconds' part """

	keyword, args = duration.split(":")
	seconds, microseconds = args.split("/")

	return int(microseconds)

wb = Workbook()
sheet = wb.add_sheet('0')

print "Reading log"
i = 0

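for line in open(FILE, "rt"):
	parts = get_parts(line)
	j = 0

	for part in parts:
		# Write the duration as a number, every other part as text
		if part.startswith("duration:"):
			sheet.write(i, j, get_duration(part))
		else:
			sheet.write(i, j, part)
		j += 1
	i += 1

wb.save('output.xls')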
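If pyExcelerator is not available, a similar export can be sketched with only the standard library by writing CSV, which both Calc and Excel can open. This is a swapped-in technique and not the script above: it assumes Python 3, the function name log_to_csv() is hypothetical, and the sample log line is made up to resemble the format the script expects.

```python
import csv
import io
import shlex


def log_to_csv(lines, out):
    """Write one CSV row per log line, replacing the duration part
    with its microseconds value."""
    writer = csv.writer(out)
    for line in lines:
        row = []
        for part in shlex.split(line):
            if part.startswith("duration:"):
                # 'duration:0/42327' -> 42327
                row.append(int(part.split(":", 1)[1].split("/")[1]))
            else:
                row.append(part)
        writer.writerow(row)


# Hypothetical log line, for illustration only
SAMPLE = (
    '127.0.0.1 - - [27/Jan/2012:10:00:00 +0200] '
    '"GET /index.html HTTP/1.1" 200 4507 "-" "Mozilla/5.0" '
    'duration:0/42327 io:643/4507 balancer:'
)

buffer = io.StringIO()
log_to_csv([SAMPLE], buffer)
```

To export a real log, you would pass an open log file as lines and a file opened with open("output.csv", "w", newline="") as out.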

