Friday, December 10, 2010

Complete PCI DSS Log Review Procedures, Part 7

Once upon a time, I was retained to create a comprehensive set of PCI DSS-focused log review policies, procedures and practices for a large company. As I am preparing to handle more of such engagements (including ones not focused on PCI DSS, but covering other compliance or purely security log reviews), I decided to publish a heavily sanitized version of that log review guidance as a long blog post series, tagged “PCI_Log_Review.” It is focused on PCI DSS, but based on generally useful log review practices that can be used by anybody, with any regulation or without any compliance flavor at all.
This is the seventh post in the long, long series (part 1, part 2, part 3, all parts). A few tips on how you can use it in your organization can be found in Part 1. You can also retain me to customize or adapt it to your needs.
And so we continue with our Complete PCI DSS Log Review Procedures:
Building an Initial Baseline Using a Log Management Tool
To build a baseline using a log management tool, perform the following:
1. Make sure that relevant logs from a PCI application are aggregated by the log management tool.
2. Confirm that the tool can “understand” (parse, tokenize, etc.) the messages and identify the “event ID” or message type of each log. For pure indexing tools, see the manual procedures presented in the next section.
3. Select a time period for an initial baseline: “90 days” or “all time” if logs have been collected for less than 90 days. In some cases, 7-30 day periods can be used.
4. Run a report that shows counts for each message type. This report indicates all the log types that are encountered over the 90 day period of system operation.
5. Assuming that no breaches of card data have been discovered, we can accept the above report as a baseline for “routine operation”.
6. An additional step should be performed while creating a baseline: even though we assume that no compromise of card data has taken place, there is a chance that some of the log messages recorded over the 90 day period triggered some kind of action or remediation. Such messages are referred to as “known bad” and should be marked as such.

Let’s go through a complete example of the above strategy.
1. Make sure that relevant logs from a PCI DSS application are aggregated by the available log management tool
At this step, we look at the log management tool and verify that logs from PCI applications are aggregated. This can be accomplished by looking at a report listing all logging devices:
Timeframe: Jan 1, 2009 - Mar 31, 2009 (90 days)
Device Type  | Device Name | Log Messages
Windows 2003 | Winserver1  | 215762
Windows 2003 | Winserver2  | 215756
SANITIZED1   | SANITIZED1  | 53445
SANITIZED2   | SANITIZED2  | 566
SANITIZED3   | SANITIZED3  | 3334444
This would indicate that aggregation is performed as needed.
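The same check can be scripted if the tool can export the device report. The sketch below is only an illustration: the `EXPECTED_DEVICES` inventory and the `missing_devices` helper are hypothetical names, and the counts are copied from the sample report above.

```python
# Hypothetical inventory of in-scope PCI systems (names from the sample report).
EXPECTED_DEVICES = {"Winserver1", "Winserver2", "SANITIZED1", "SANITIZED2", "SANITIZED3"}


def missing_devices(report_counts, expected=EXPECTED_DEVICES):
    """Return expected devices that are absent from the report or sent no logs."""
    return sorted(d for d in expected if report_counts.get(d, 0) == 0)


# Device name -> log message count, as exported from the log management tool.
report_counts = {
    "Winserver1": 215762,
    "Winserver2": 215756,
    "SANITIZED1": 53445,
    "SANITIZED2": 566,
    "SANITIZED3": 3334444,
}

# An empty list means every in-scope device is being aggregated.
print(missing_devices(report_counts))
```

Any device name returned by the helper is a system whose logs are not reaching the tool and needs to be fixed before baselining.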
2. Confirm that the tool can “understand” (parse, tokenize, etc.) the messages and identify the “event ID” or message type of each log
This step is accomplished by comparing the counts of messages in the tool (such as the above report that shows log message counts) to the raw message counts in the original logs.
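As a rough sketch of that comparison, assuming we can obtain a raw line count per device (for example, by counting lines in the original log files), something like the following would highlight devices whose messages the tool does not fully parse. The helper names, the 99% threshold and the raw count of 600 for SANITIZED2 are all made up for illustration.

```python
def parse_coverage(raw_count, parsed_count):
    """Fraction of raw log lines that the tool turned into typed messages."""
    if raw_count == 0:
        return 1.0  # nothing to parse, so nothing was missed
    return parsed_count / raw_count


def devices_with_parse_gaps(raw_counts, parsed_counts, min_coverage=0.99):
    """Return {device: coverage} for devices below the (assumed) coverage threshold."""
    gaps = {}
    for device, raw in raw_counts.items():
        coverage = parse_coverage(raw, parsed_counts.get(device, 0))
        if coverage < min_coverage:
            gaps[device] = round(coverage, 3)
    return gaps


# Hypothetical raw counts versus the counts reported by the tool.
raw_counts = {"Winserver1": 215762, "SANITIZED2": 600}
parsed_counts = {"Winserver1": 215762, "SANITIZED2": 566}

print(devices_with_parse_gaps(raw_counts, parsed_counts))
```

A device that shows up here has messages the tool stored but could not classify, so its counts per message type would understate what actually happened.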
3. Select a time period for an initial baseline: “90 days” or “all time” if logs have been collected for less than 90 days
In this example, we are selecting 90 days since a full 90 days of logs are available.
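The “90 days or all time” rule is simple enough to express in a few lines. This is a minimal sketch; `baseline_window` is a hypothetical helper, and it treats the window as inclusive of both end dates.

```python
from datetime import date, timedelta


def baseline_window(first_log_day, today, max_days=90):
    """Return (start, end): the last `max_days` days, or all time if fewer exist."""
    earliest_allowed = today - timedelta(days=max_days - 1)  # inclusive window
    start = max(first_log_day, earliest_allowed)
    return start, today


# Logs collected since mid-2008: use the last 90 days.
print(baseline_window(date(2008, 6, 1), date(2009, 3, 31)))
# Logs collected only since March 1: use everything available.
print(baseline_window(date(2009, 3, 1), date(2009, 3, 31)))
```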
4. Run a report that shows counts for each message type. For example, the report might look something like this:
Timeframe: Jan 1, 2009 - Mar 31, 2009 (90 days)
Event ID | Event Description        | Count | Average Count/day
1517     | Registry failure         | 212   | 2.3
562      | Login failed             | 200   | 2.2
563      | Login succeeded          | 24    | 0.3
550      | User credentials updated | 12    | 0.1
This report indicates all the log types that are encountered over the 90 day period of system operation.
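If the tool cannot produce such a report directly, the same summary can be approximated from exported events. The sketch below assumes each parsed message has already been reduced to an `(event_id, description)` pair; `build_baseline` is a hypothetical name, and the sample events simply mirror two rows of the report above.

```python
from collections import Counter


def build_baseline(events, days):
    """Count messages per type and compute the average count per day.

    events: iterable of (event_id, description) tuples, one per parsed message.
    days:   length of the baseline period in days.
    """
    counts = Counter(events)
    return [
        {
            "event_id": event_id,
            "description": description,
            "count": n,
            "avg_per_day": round(n / days, 1),
        }
        for (event_id, description), n in counts.most_common()
    ]


# Sample data mirroring two rows of the report above.
events = [(562, "Login failed")] * 200 + [(563, "Login succeeded")] * 24
baseline = build_baseline(events, days=90)

for row in baseline:
    print(row["event_id"], row["description"], row["count"], row["avg_per_day"])
```

The rows come back sorted by count, most frequent first, which matches the way the report is laid out.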
5. Assuming that no breaches of card data have been discovered, we can accept the above report as a baseline for “routine operation”
During the first review of logs, it might be necessary to investigate some of the logged events before we accept them as normal. The next step explains how this is done.
6. An additional step should be performed while creating a baseline: even though we assume that no compromise of card data has taken place, there is a chance that some of the log messages recorded over the 90 day period triggered some kind of action or remediation. Such messages are referred to as “known bad” and should be marked as such.
Some of the logs in our 90 day summary are actually indicative of problems and require an investigation:
Event ID | Event Description        | Count | Average Count/day | Routine or “bad”
1517     | Registry failure         | 212   | 2.3               |
562      | Login failed             | 200   | 2.2               |
563      | Login succeeded          | 24    | 0.3               |
550      | User credentials updated | 12    | 0.1               |
666      | Memory exhausted         | 1     | N/A               | Action: restart system
In this report, we notice the last line, the log record with an event ID = 666 and event name “Memory exhausted” that only occurred once during the 90 day period. Such rarity of the event is at least interesting; the message description (“Memory exhausted”) might also indicate a potentially serious issue and thus needs to be investigated as described below in the investigative procedures.
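The marking step can be sketched in code as well. This is an illustration under assumptions: `annotate_baseline` is a hypothetical helper, the “seen only once” rarity threshold is my own choice, and the `known_bad` mapping of event ID to the action it triggered would in practice come from incident or change records.

```python
def annotate_baseline(baseline, known_bad, rare_threshold=1):
    """Label each baseline row as known bad, rare (investigate) or routine.

    known_bad:      {event_id: action taken} for messages that triggered remediation.
    rare_threshold: counts at or below this value are flagged (assumed cutoff).
    """
    annotated = []
    for row in baseline:
        if row["event_id"] in known_bad:
            label = "known bad: " + known_bad[row["event_id"]]
        elif row["count"] <= rare_threshold:
            label = "investigate (rare)"
        else:
            label = "routine"
        annotated.append({**row, "label": label})
    return annotated


# Counts taken from the report above.
report = [
    {"event_id": 1517, "description": "Registry failure", "count": 212},
    {"event_id": 562, "description": "Login failed", "count": 200},
    {"event_id": 563, "description": "Login succeeded", "count": 24},
    {"event_id": 550, "description": "User credentials updated", "count": 12},
    {"event_id": 666, "description": "Memory exhausted", "count": 1},
]

for row in annotate_baseline(report, known_bad={666: "restart system"}):
    print(row["event_id"], row["description"], row["label"])
```

Note that “routine” here only means “frequent and not yet known to be bad”; as mentioned in step 5, some of those events may still need a look during the first review before they are accepted as normal.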

Creating a baseline manually is possible, but more complicated.


To be continued.

Follow PCI_Log_Review to see all posts.


Dr Anton Chuvakin