Sunday, March 23, 2008

Say When - Trusting Log Timestamps

Am I a  leading visionary in the field of log management? :-) Who cares - I will now pontificate as if I am :-) It is about time: specifically, timing logs. As I said in my Log Trust and Protecting Logs from Admins posts, the issue of trust is critical in the logging world. After all, logs = accountability; and the latter in unthinkable without trust. If we are to at least pretend that logs objectively record events and user actions, we need to unambiguously establish WHAT happened and WHEN. This post deals with the 'WHEN'  issue.

So, can we trust that the time stamp in the log file or the one added by the log management system correctly describes when the event actually happened?

We will start from locating the timestamps in logs. Most of the log formats, such as file-based logs (web, application, some security gear, etc) and syslog, Windows event logs, database audit tables, proprietary ones, contain a timestamp. In fact, once I saw somebody use a timestamp to define logs as "timed records of IT activity." So, time is critical for logs being, well, logs :-) At this point it is worthwhile to note that file-based logs will contain a timestamp IN the file, while syslog records arriving over the UDP or TCP port 514 connection are usually timestamped upon arrival BY the syslog daemon (using its own "knowledge" of time) - and then it shows up in the syslog files in  /var/log.

Let's assess whether this "in-log timestamp" provides an adequate way of timing the actual event that is being logged. Answering this question is important for investigations and troubleshooting, but becomes  nearly a matter of life and death in case of log forensics.

Here are some fun cases and issues to consider:

First, what are the chances of a  completely false timestamp in logs (BTW, today is Jan 1, 1970!) When might that happen? Typically when a logging system own clock is reset or not set correctly. This timestamp clearly should NOT be trusted. 

Second, we can say that it’s always 5PM somewhere: in other words, what time zone are your logs in? EST? PDT? GMT? UTC? Or any of more than 24 other possibilities. If you have no idea, you should not trust the timestamp.

Third, are you in drift? Is your system clock? Those pesky drift seconds turn into minutes which then work to undermine the accuracy of timing the records (and thus your certainly and trust in evidence quality)

Fourth, syslog forwarder mysteries are plenty: some of the syslog messages will be delayed in transit and the  be timestamped by the final recipient daemon, thus completely losing when the event was originally logged. Admittedly, this delayed syslog is rare, but as more people employ buffering syslog daemons (e.g. syslog-ng), it might happen more often.

Fifth, more esoteric, but still real (and really annoying): some system logs will contain two timestamps. If you don't possess in-depth knowledge of this specific log, confusion has a chance to cut the trust as well (so, which timestamp should I use?)

Sixth, most people will not think that they will fall to something that stupid: 24 vs 12 hour time. However, when facing an unknown (and poorly designed!) log format, beware that 5:17 might well be 17:17...

Finally, if you know that something got logged at 5:17AM, then when did it happen? Beware of "Log lag!" This issues is actually to tricky to give it justice here... The simplest example is when the process leaves a log records when it exits not when it starts, possibly days earlier (thus creating a log lag).

As we dive into more issues with timing logs, we also need to think about sequence timing and absolute timing. Sequence of logged events is a critical fact! Miss the sequence and the whole “house of cards” goes …  But! Absolute time is also important! Can we be assured of both all the time? (hint: no)

So,  when you look at logs next time and you see a timestamp there - start thinking about all this :-)

Technorati tags: , ,

No comments:

Dr Anton Chuvakin