Parsing timestamps from Apache log files in Python
By default, timestamps in Apache access log files have the following format:
27/Oct/2013:06:33:40 +0100
In Python 2, this cannot be parsed with the strptime() function from the time/datetime modules, because strptime() has no %z placeholder there to match the numeric UTC offset (only %Z, which matches timezone names). datetime.strptime() accepts %z only from Python 3.2 on. The parse() function from dateutil.parser does not work on the raw string either: it fails to recognize the format, and it does not take a format string that could describe it.
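After looking for the "best" solution for quite a while, this is probably the most elegant way to do it: replace the colon between the date and the time with a space, after which parse() recognizes the string.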
>>> from dateutil.parser import parse
>>> d = '27/Oct/2013:06:33:40 +0100'
>>> parse(d[:11] + " " + d[12:])
datetime.datetime(2013, 10, 27, 6, 33, 40, tzinfo=tzoffset(None, 3600))
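On Python 3.2 or later, the standard library alone may be enough, since datetime.strptime() understands %z there. The following is a minimal sketch under that assumption; the names APACHE_TS_FORMAT and parse_apache_timestamp are made up for illustration. Note that %b matches month abbreviations in the current locale, so this assumes an English or C locale.

```python
from datetime import datetime, timedelta

# Format string for the default Apache access log timestamp.
# %b is locale-dependent; an English/C locale is assumed here.
APACHE_TS_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def parse_apache_timestamp(ts):
    """Return a timezone-aware datetime for an Apache log timestamp.

    Hypothetical helper; requires Python 3.2+ for %z support.
    """
    return datetime.strptime(ts, APACHE_TS_FORMAT)


d = parse_apache_timestamp("27/Oct/2013:06:33:40 +0100")
print(d.isoformat())                          # 2013-10-27T06:33:40+01:00
print(d.utcoffset() == timedelta(hours=1))    # True
```

Unlike the dateutil version, the resulting tzinfo is a datetime.timezone rather than a tzoffset, but both represent the same fixed offset of +01:00.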