Tuesday, September 18, 2007

On Using SIEMs with Log Management

A fun piece which kinda highlights why some people deploy both SIEM (for correlation) and log management (for 100% log collection and analysis).

Interesting quotes: "Can I, algorithmically ahead of time, guarantee that the system will “think” about every event I want it to? With almost every single correlation methodology I've seen - especially including [SIEM's] default methodology - the answer is a resounding “NO”." And also: "This methodology failure means that you cannot go back and do formal analysis on an incident that has passed through [SIEM] without the original raw events and significant manual labor, except by sheer luck."

Thus, you have a SIEM for selective real-time security analytics via correlation and risk scoring AND a log management tool to keep a full archive of all logs for incident response and in-depth analysis.

UPDATE:
Jack posted a detailed clarification here (and in comments below); still, I'd say that if we live in the world of SELECT statements and limit ourselves to database / structured data analysis, we are by definition missing things that are:
  • filtered away on the agent/collector/connector
  • not parsed into the database by vendor's choice
  • not parsed by mistake / agent bug
  • not retained for long enough in the database ...
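To make that gap concrete, here is a minimal Python sketch (with purely hypothetical table and field names) of why a query over parsed, structured events can never account for everything the devices actually emitted:

import sqlite3

# Hypothetical parsed-event store: only events that survived the agent
# filters AND that the parser understood ever become rows here.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE parsed_events (ts TEXT, device TEXT, event_name TEXT, src_ip TEXT)")

# Anything filtered at the agent, dropped or mangled by the parser, or aged
# out of the retention window simply has no row -- this query cannot "see" it.
alerts = db.execute(
    "SELECT ts, device, event_name FROM parsed_events "
    "WHERE event_name LIKE '%failed login%'"
).fetchall()

# A raw log archive, by contrast, still holds the original lines and can be
# searched (more slowly) even for events nobody anticipated parsing.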

4 comments:

Anonymous said...

Hmmm. I'd like to clarify my comments a bit here and respond to yours. First, SIEMs - if used correctly - have every capability required to make that guarantee. Rather, what's lacking is the correlation architect's (and even the vendor's) clear understanding of what it is they're building.

When you create SELECT rules without some sort of (at least) simple ontology, you can't predict that you're doing a complete SELECT. This is a lot like IDS signatures looking for a list of known bad stuff - it'll never be complete.

However, if you define your inclusive as well as your exclusive processing, you can in fact have something smart done to all events. The way I've built it in the past has been to create multiple correlation paths: Known Behavior, Statistical Core Processors, and Behavior-Based Auto-Classing.

Known Behavior rules are self-evident.

In Statistical Core Processors, you create rule-series which perform one single statistical transformation per series (kurtosis, increase over average, etc.). You must remember, though, to derive your measurements and environment boundaries from known variables, so you'll need some sort of automated system to re-baseline those numbers as part of the preprocessing and feed them into the system.
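As a rough illustration of one such rule-series, here is a Python sketch of a single "increase over average" transformation with automated re-baselining; the event names, window size, and data format are assumptions, not a description of any particular product:

from collections import defaultdict, deque

# One Statistical Core Processor: a single transformation (increase over
# average) per event name, with the baseline re-derived from a sliding
# window of recent counts instead of being hard-coded.
WINDOW = 1000                                   # counts feeding each baseline
history = defaultdict(lambda: deque(maxlen=WINDOW))

def rebaseline(event_name):
    """Automated re-baselining: average of the recent window."""
    counts = history[event_name]
    return (sum(counts) / len(counts)) if counts else 0.0

def increase_over_average(event_name, count_this_interval):
    baseline = rebaseline(event_name)           # derived, not guessed
    history[event_name].append(count_this_interval)
    if baseline == 0.0:
        return 0.0                              # no history yet, nothing to compare
    return (count_this_interval - baseline) / baseline   # one simple transformation

# e.g. increase_over_average("failed_login", 42) -> ratio above the recent average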

The statistical stream should be split in two by source: events not processed by any other correlation stream, and events processed by at least one of the other two streams. The split can take the form of tuple tagging (/correlated/knownbadtype/statisticalchange, for example).

When we say Statistical Core Processors here, we mean that only a basic, simple transformation is made so that later in the stream, once events have all been accounted for, you can combine the outputs of these simple transformations into more complex ones.
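A minimal sketch of that split-and-tag step, and of combining the simple per-event outputs further down the stream; the tag strings and weights here are purely illustrative:

# Every event flows through the statistical stream, but carries a path-style
# tag recording whether another correlation stream already handled it.
def tag_for_statistics(event, handled_by_other_streams):
    if handled_by_other_streams:
        event["tag"] = "/correlated/knownbadtype/statisticalchange"
    else:
        event["tag"] = "/uncorrelated/statisticalchange"
    return event

# Later in the stream, once all events have been accounted for, the simple
# transformations can be combined into a more complex measure, e.g.:
def combined_score(increase_ratio, kurtosis_score):
    return 0.7 * increase_ratio + 0.3 * kurtosis_score   # weights are illustrative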

The third stream, Behavior-Based Auto-Classing, deals with the failure of vendors to properly label information (like what's SPAM, in this example). Classing (labeling) information is best done based on its behavior rather than some hacked-on system, by utilizing any known properties possible (not all scenarios have known properties... in those cases, the events will at least fall into the statistical processors).

Example: email that ends up in an inbox that has never been used is spam, by definition. IDS events triggered by this mail can be classed as "SPAM" events whether or not the vendor labeled the events as such: they're either directly related to spam or else are generic enough not to be useful in differentiating between one actual activity and another. So, an automatically generated list of events associated with SPAM is sent to ArcSight or another SIEM, and the SIEM then classes/groups/tags those events as SPAM... and that tagging can be criteria for other rules, reprioritized, displayed, or filtered out.
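For the spam-trap example, the auto-classing logic might be sketched roughly as below; the mailbox addresses, event format, and function names are hypothetical, and the generated list is what would then be fed to the SIEM as tagging criteria:

# Behavior-Based Auto-Classing sketch: any mail delivered to a never-used
# decoy mailbox is spam by definition, so IDS events observed for those
# messages get classed "SPAM" regardless of the vendor's own labels.
SPAM_TRAP_MAILBOXES = {"decoy01@example.com", "decoy02@example.com"}

def autoclass_spam(ids_events, mail_log):
    # message IDs that landed in a spam-trap inbox
    trap_msgs = {m["msg_id"] for m in mail_log if m["rcpt"] in SPAM_TRAP_MAILBOXES}
    spam_event_names = set()
    for ev in ids_events:
        if ev.get("msg_id") in trap_msgs:
            ev["class"] = "SPAM"                # behavior-derived label
            spam_event_names.add(ev["event_name"])
    # this automatically generated list is what gets pushed to the SIEM
    return spam_event_names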

Taken together, these three correlation streams make sure that no events fall out: you have created definable, repeatable criteria for every event, and you have known properties and facts about every single event that goes through your SIEM.

Interestingly, this also means you have created a system that will automatically place objects into predefined ontological classes. Huh. That's really cool.

And I guess that's my comment on your comment: SIEMs can be used for more than selective real-time analytics. I've used complex time-of-day/day-of-week algorithms to bucketize my events and create more accurate pictures of the state of the network over time in the SIEM - which is useful for in-depth analysis. I used a number of tools (visualizations included) to look for times of day and days of the week where traffic patterns were typically the same, and most statistical measurements were automatically derived by comparing only similar temporal buckets to each other. That helped get rid of the start-of-business-day and it's-the-weekend effects on the values.
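A bare-bones sketch of that kind of temporal bucketing: measurements are keyed by (day-of-week, hour) so that baselines only ever compare like with like. The bucket granularity and the idea of a single numeric measurement per interval are assumptions:

from datetime import datetime
from collections import defaultdict

# Measurements are grouped by (weekday, hour), so a Monday 09:00 value is
# only ever compared to other Monday 09:00 values -- removing start-of-day
# and weekend effects from the baseline.
buckets = defaultdict(list)

def bucket_key(ts: datetime):
    return (ts.weekday(), ts.hour)          # e.g. (0, 9) == Monday, 09:00-09:59

def add_measurement(ts: datetime, value: float):
    buckets[bucket_key(ts)].append(value)

def baseline_for(ts: datetime):
    same_bucket = buckets[bucket_key(ts)]
    return sum(same_bucket) / len(same_bucket) if same_bucket else None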

Whew. Sorry for the long response. :)

Anton Chuvakin said...

Thanks a lot for the comments; I will think a bit and then post the follow-up :-)

Anonymous said...

I agree with your updates.

Specifically, I think the model of filtering at the edge with a static filter that lives on the edge itself is a huge problem (versus, for example, pushing a temporary filter list to edge agents that is generated through correlation at the center, expires as the assumptions the correlation engine made expire, and is refreshed at the same time).
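A toy sketch of that alternative: a center-generated filter list whose entries expire along with the correlation assumptions behind them. All names and structures here are assumptions, not any vendor's API:

import time

# The center, not the edge, owns the filter list; each entry carries an
# expiry tied to the correlation assumption that produced it, and the center
# re-pushes updated entries when those assumptions are re-evaluated.
class EdgeFilterList:
    def __init__(self):
        self._entries = {}                    # pattern -> expiry timestamp

    def push(self, pattern, ttl_seconds):
        """Called by the central correlation engine."""
        self._entries[pattern] = time.time() + ttl_seconds

    def should_drop(self, raw_event):
        now = time.time()
        # expired entries stop filtering automatically
        self._entries = {p: exp for p, exp in self._entries.items() if exp > now}
        return any(p in raw_event for p in self._entries)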

What, though, can you do for the other issues?

Anton Chuvakin said...

What specific other issues? I'd say that if you have all the data and can always go back to it, other issues of data availability will be gone. Issues related to making sense of all this data will remain, however...

Dr Anton Chuvakin