Recently somebody asked me: what do I mean by LOG CONTEXT? And what is “log enrichment correlation?”
This picture explains it clearly:
For each element in the log message shown, you can gather some contextual information. Contextual here simply means that the information is gathered NOT from this particular log entry. For example, a log entry might contain an IP address, but its DNS name needs to be grabbed from the DNS server, which “enriches” the log entry and makes it more useful.
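To make the DNS example concrete, here is a minimal sketch in Python, not how any particular SIEM does it. The names `enrich_with_dns` and `reverse_dns` are made up for illustration, and the regular expression only matches IPv4 addresses. The lookup uses the standard `socket.gethostbyaddr` call, and the resolver is passed in as a parameter so a cache or a different name service can be swapped in.

```python
import re
import socket

# Simple IPv4 matcher; an assumption for this sketch (no IPv6, no octet range check).
IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def reverse_dns(ip):
    """Return the DNS name for ip, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return None


def enrich_with_dns(log_line, resolver=reverse_dns):
    """Return log_line with the resolved name added after each IPv4 address."""
    def annotate(match):
        ip = match.group(0)
        name = resolver(ip)
        # Leave the address untouched when no name is available.
        return f"{ip} ({name})" if name else ip

    return IPV4_PATTERN.sub(annotate, log_line)
```

Calling `enrich_with_dns("Failed login for bob from 10.1.2.3 port 22")` would query whatever resolver the host is configured with. The name it returns comes from outside the log entry, which is exactly what makes it context.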
One of the ways SIEM and log management systems perform such enrichment is by gathering and displaying context information. Context information is the additional information required to make the limited details available within the log entry more meaningful. It does not come from the log entry in question (although it might come from other logs), but originates in the surrounding IT environment, such as other systems inside or outside the organization.
As I say above, one of the simplest examples of context data is name resolution: the DNS names or Windows NetBIOS host names are added to the logs. While the log file may have already provided IP addresses, the added context of a human-readable name makes the log more meaningful. Normally, DNS names are not present in logs, but have to be obtained by queries to a DNS server. The SIEM tool might find context data in a variety of sources, including:
- Windows name services, DNS and NIS servers: to map IP addresses to names
- Defined asset groups: internal or external status of an IP address
- Asset management systems: to gather information about systems, their ownership, and the compliance relevance of each system or group of systems
- WHOIS servers: WHOIS information for external addresses shows who owns them and where they are located
- Geo-location: to show the physical location of the system
- Active Directory and LDAP servers: to map user names to actual user identities
- Attack details and vulnerability information: to gather additional details about the log data and/or the log source.
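As a rough illustration of how several of these sources combine, the sketch below enriches a parsed log event from three stand-in lookups: a list of internal networks (asset groups), a table of asset owners (asset management), and a user directory (Active Directory or LDAP). All of the table contents, field names such as `src_zone` and `user_identity`, and the `enrich` function itself are hypothetical; a real SIEM would query live systems rather than in-memory dictionaries.

```python
import ipaddress

# Hypothetical context sources, hard-coded here purely for illustration.
INTERNAL_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.0.0/16"),
]
ASSET_OWNERS = {"10.1.2.3": {"owner": "web-team", "pci_scope": True}}
USER_DIRECTORY = {"bob": "Robert Smith (Finance)"}


def enrich(event):
    """Return a copy of event with context fields added; the input is not modified."""
    enriched = dict(event)

    ip = event.get("src_ip")
    if ip:
        addr = ipaddress.ip_address(ip)
        # Asset groups: is the address inside one of our own networks?
        is_internal = any(addr in net for net in INTERNAL_NETWORKS)
        enriched["src_zone"] = "internal" if is_internal else "external"
        # Asset management: ownership and compliance relevance, if known.
        enriched.update(ASSET_OWNERS.get(ip, {}))

    user = event.get("user")
    if user in USER_DIRECTORY:
        # Directory lookup: map the account name to an actual identity.
        enriched["user_identity"] = USER_DIRECTORY[user]

    return enriched
```

None of the added fields exist in the original event. Each one is pulled from the surrounding environment and attached to it, and unknown addresses or users simply pass through with less context.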
BTW, back in 2008, I did a poll on what context is the most useful for log analysis. Here is what came back. It shows that some useful context is simply documentation on what the logs mean (which might be pulled from the internal knowledge base of a SIEM product):