Wednesday, July 18, 2007

Musings on 100% Log Collection

Warning: this is NOT about ROI! :-)

One of the most exciting, complicated and, at the same time, most common questions in the field of log management is "what logs to collect?" (This, BTW, implies that logs not collected will be left to rot wherever they were generated, and thus might or might not be available at the time of dire need. You are collecting logs, aren't you?) It comes up during compliance-driven log management projects ("what to collect for PCI DSS compliance?"), operationally-driven ones ("what logs from this application do I need to detect faults and errors?") and security-driven ones ("which logs will help me during incident response?").

What answers does one sometimes hear? The otherwise awesome log management guidance, NIST 800-92 "Guide to Computer Security Log Management", confuses the reader with this fascinating blurb: "generally, organizations should only require logging and analyzing the data that is of greatest importance." And how are people to know which logs are of importance? (I did have a bit of a debate with the NIST folks on that...)

Other answers are situation-specific and thus limited in their usefulness ("need IDS alerts + server logs to detect intrusions via correlation", "need all logs that show access to PHI"). I spoke about the pitfalls of "prioritizing before collection" in my presentation "Six Mistakes of Log Management" and its companion paper (TBA). In some cases, such as the incident response scenario, you might be naturally leaning towards grabbing as much as possible, since you never know which bit will help you answer that dreaded "WTF happened?!" question ...

On the other hand, there is a simple answer that doesn't suffer from the above issues: collect everything. However, many folks go into a state of shock upon hearing it :-) "Everything!!!??? HOW can you collect 'everything'? What about storage, bandwidth, hardware, etc.?"

But you know what? It really isn't as bad as you think! Seriously! Just consider that:

  1. Logs compress really well (1:10 to 1:15 compression ratios are not unheard of), so bandwidth and storage are less of an issue than initially estimated.
  2. Disk storage is cheap (and tape is cheaper still); holding a billion log records might well cost under $200 (!)
  3. Figuring out what you need, might need, can be "told to need", will need, etc. is genuinely hard. You will never get it right!
  4. Many logs don't need to be collected in real time after generation, which lets you save some bandwidth by moving them when the network is less busy.
  5. Technologies exist to make sense of "everything", not just "hand-picked" and parsed logs.
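
If you want to sanity-check points 1 and 2 against your own logs, here is a rough Python sketch. Everything in it is an assumption made for illustration: the synthetic sshd-style lines, the helper names (synthetic_log_lines, compression_ratio, storage_gb) and the 200-byte average record size are all made up. Synthetic lines will not compress the same way as real logs, so swap in a sample of your own before trusting the number.

```python
import gzip
import random


def synthetic_log_lines(count, seed=0):
    """Build fake sshd-style syslog lines (illustrative only, not real logs)."""
    rng = random.Random(seed)
    hosts = ["web01", "web02", "db01"]
    users = ["alice", "bob", "carol", "dave"]
    lines = []
    for i in range(count):
        lines.append(
            "Jul 18 10:%02d:%02d %s sshd[%d]: Accepted password for %s "
            "from 10.0.%d.%d port %d ssh2"
            % (
                i // 60 % 60,
                i % 60,
                rng.choice(hosts),
                rng.randint(1000, 9999),
                rng.choice(users),
                rng.randint(0, 255),
                rng.randint(1, 254),
                rng.randint(1024, 65535),
            )
        )
    return lines


def compression_ratio(lines):
    """Return raw size divided by gzip-compressed size for the given lines."""
    raw = "\n".join(lines).encode("utf-8")
    packed = gzip.compress(raw)
    return len(raw) / len(packed)


def storage_gb(record_count, avg_record_bytes, ratio):
    """Estimate compressed storage in gigabytes (1 GB = 10**9 bytes)."""
    return record_count * avg_record_bytes / ratio / 1e9


if __name__ == "__main__":
    ratio = compression_ratio(synthetic_log_lines(50_000))
    print("gzip ratio on synthetic sample: %.1f:1" % ratio)
    # Assumed average record size of 200 bytes; measure your own.
    print("1e9 records at that ratio: %.1f GB" % storage_gb(10**9, 200, ratio))
```

As a quick back-of-the-envelope check: with the assumed 200 bytes per record and a 10:1 ratio, storage_gb(10**9, 200, 10) works out to 20 GB for a billion records. Multiply by whatever your disk or tape costs per GB to get your own figure.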

Convinced yet? So, if you are pondering "what logs to collect?", try to switch your mindset into thinking "what will it take for me to collect everything?" You probably won't regret this decision!

Dr Anton Chuvakin