As I mentioned before, I received a lot of fun questions from the audience during our "Log Management Thought Leadership Roundtable Webcast" (recording, some comments). Since they would be useful to my readers, I am answering some of them here (questions are anonymous and slightly rewritten for clarity):
Q1: When you mention "forensics", are you speaking in term of legal forensic terminology - or in terms of incident investigation?
A1: When I say "forensics", I usually mean it in the legal sense. I call other investigations simply "incident investigations;" forensics carries an extra burden of proof and seeks to establish facts, not just "good hunches."
Q2: Are there solutions that can handle 2-3 Terabytes of log data per minute?
A2: No. Easy, huh? :-) See this for a specific example. Well, let me take this back: theoretically, you can always use a vendor that can handle a lot of data (like LogLogic) AND that has the ability to run a distributed operation across many appliances. The catch? You will need a lot of appliances, since 2-3 TB/minute works out to roughly 170-250 million log messages/second (assuming an optimistic 200 bytes/message).
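For those who like to check the back-of-the-envelope math, here is the conversion (the 200 bytes/message figure is the same assumption as above, not a measurement):

```python
# Convert a raw log volume (TB/minute) into a message rate, using the
# assumed average of ~200 bytes per log message from the answer above.
TB = 10**12  # decimal terabyte

def messages_per_second(tb_per_minute, bytes_per_message=200):
    """Messages/second implied by a given TB/minute log volume."""
    return tb_per_minute * TB / bytes_per_message / 60

low = messages_per_second(2)   # ~167 million messages/second
high = messages_per_second(3)  # ~250 million messages/second
print(f"{low / 1e6:.0f}-{high / 1e6:.0f} million messages/second")
```

Even halving the rate by assuming fatter 400-byte messages still leaves you far beyond what any single appliance handles.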
Q3: I have terabytes of log data - how can all of it be analyzed? Are there products that can process this much data and extract valuable information?
A3: Yes, but you need to ask one question first: why analyze (example reasons here)? To discover something "interesting" (my favorite reason)? To find some specific artifact that you need in the logs? Or for some other reason? Before anybody can answer "are there tools to 'analyze this'?", you need to answer that dreaded "why" question.
Q4: We were told to log every access to every SQL database in our environment. Is this even feasible with the best products on the market?
A4: Yes, it is. However, one needs to be extra careful with this. Look at this post for options and ideas. It may turn out that logging every SELECT statement and then collecting those native database logs is not the best approach (mostly for database performance reasons) and a dedicated tool will need to be used. Databases' built-in auditing is better suited to selective auditing.
Q5: Once logs are captured, and centrally stored, who should be responsible for the management and review of those logs?
A5: Good question! Really, this is a very good question that a) is important to have answered and b) does not have an "accepted," standard answer. It also depends upon which logs those are; let's assume the most complex scenario of a diverse set of logs from networks, systems and applications. So, the choices are: the security team (sometimes: CIRT, i.e. the incident response team), a dedicated team in IT that provides "log services" (an uncommon option, but growing in popularity) or a unit in IT that is responsible for regulatory projects (if compliance driven). If your answer is nobody, then you will be in trouble :-) If you answer wrong, you might have to fight to access your own logs (example).
Q6: Most of the discussion so far is about how to get started. What about after the system is deployed? Products tend to focus on collection and not on action or response. Where are the tools heading in terms of usability, incident tracking, collaboration?
A6: That's a long story, really, and it is hard to give a short answer. Yes, collection has been the focus of products in the last few years, but we are now at a point where analysis and various uses of the data will come to the forefront. At the very least, you should be able to run reports and searches over the logs you have collected.
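To make that minimum bar concrete, here is a toy sketch of one such "report": counting failed SSH logins per source IP in a collected auth log. The file path and the message format are illustrative assumptions, not a reference to any particular product:

```python
# Toy log report: tally failed SSH logins by source IP.
# The regex matches the common OpenSSH "Failed password ... from <IP>"
# message format; adjust it for your own log sources.
import re
from collections import Counter

FAILED = re.compile(r"Failed password for .* from (\d+\.\d+\.\d+\.\d+)")

def failed_logins_by_ip(lines):
    """Return a Counter mapping source IP -> number of failed logins."""
    hits = Counter()
    for line in lines:
        m = FAILED.search(line)
        if m:
            hits[m.group(1)] += 1
    return hits

# Usage (hypothetical centralized log file):
# with open("/var/log/collected/auth.log") as f:
#     for ip, count in failed_logins_by_ip(f).most_common(10):
#         print(ip, count)
```

Any log management product worth deploying should make a query like this trivial; the interesting question is what it lets you do after the report flags something.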
Q7: Do vendors typically offer a template of which logs to collect based on the desired use cases?
A7: They should, yes :-) In some cases, what you have is a bit of a push-pull between a vendor and a customer: "Tell us what to do." - "First, you tell us what you would like to accomplish." - "No, really, you tell me what I should be looking to accomplish." - ... sometimes ad infinitum. Also, for some use cases it is hard to come up with a credible list (see this discussion about PCI DSS here).
Q8: What are the biggest difficulties when the log management solution is going to be integrated and deployed in an organization with a lot of different log sources?
A8: Political boundaries and "log ownership issues" (see some discussion here). If you need to submit a paper form in triplicate to add a line to /etc/syslog.conf and then send more forms when something doesn't work right and you need to troubleshoot it (a real story), everything becomes painfully slow and inefficient.
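For perspective on what all that paperwork buys: the change in question is often a single line. A classic /etc/syslog.conf entry forwarding everything to a central collector looks something like this (the collector hostname is a placeholder):

```
# forward all facilities/priorities to the central log server
*.*    @loghost.example.com
```

One line of config, three forms in triplicate - that mismatch is the political problem in a nutshell.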