Tuesday, February 22, 2011

On Cloud Logging Standards, Unique IDs and Other Exciting Logging Matters

Two of my esteemed colleagues, Misha Govshtein of AlertLogic and Raffael Marty of Loggly, had a bit of an argument over something fairly central to logging and log management, especially as it applies to the coming cloud wave. Let’s review what happened.

In 2010, AlertLogic folks submitted an IETF draft of what they called “Syslog Extension for Cloud Using Syslog Structured Data”. The draft is available here, and the AlertLogic team's explanation of its mission and purpose can be found here and here (unfortunately in MP3 form). The draft reads as if they are proposing a new cloud log standard, since the very first sentence of the document is: “This document provides an open and extensible log format to be used by any cloud entity or cloud application to log and trace activities that occur in the cloud.”

Said draft found its way to the CEE Editorial Board (via an IETF list message) and has caused some interest and, dare I say, unrest. And some disagreements. Raffael Marty of Loggly has published his position on the draft here. Further exchange of opinions can be seen in comments here, as well as heard in the hallways of the RSA 2011 conference.

What do I think of this? I think these renowned log literati are both right and wrong (at this point, somebody might say “Anton…you are such a consultant”… and I am :-) )

Unquestionably, I believe that the idea of cloud logging having its very own special standard, completely disconnected from all other logs, is misguided. Being disconnected from both the rest of the logging domain and current log standardization efforts (like CEE, XDAS, etc.) only makes this idea more misguided. In essence, if you grab an example of a current bad application log, add “cloudiness” to it (more on this later) and then publish it as a “cloud log standard”, you generate mostly hilarity and not value for the IT community. Logically, it goes like this:

  1. Bad log + cloud ID = really bad cloud log.
  2. Really bad cloud log + public IETF draft = really bad, standard cloud log, exposed in public
  3. Really bad standard log in the cloud EXPOSED in public = stupidity
  4. Stupidity –> funny blog posts from Anton, like, for example, this one.

This just reminds me of Chris Hoff saying “Cloud security suffers from the exact same siloed security telemetry problems as legacy operational models…except now it does it at scale.” In fact, here is an example from the draft:

Aug 16 13:34:18 [context aid="149683FC-8DF5-1004-E1A8-00000A000152"
provider="example.com" rid="1:123"][transit client="172.16.1.82"]
User authentication successful for 1:123


Would YOU like to spend your mornings analyzing logs like this? If you expose such examples in a purported standard draft, future generations of log analysts will hate you with a passion….
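To give a feel for what analyzing such a line involves, here is a minimal Python sketch that pulls the bracketed fields out of the example above. It assumes the three lines of the example are joined into one and that values never contain escaped quotes (real syslog structured data, per RFC 5424, allows escapes). The regexes and the function name are illustrative and not part of the draft.

```python
import re

# The example from the draft, joined onto a single line (an assumption;
# the post shows it wrapped across three lines).
LINE = (
    'Aug 16 13:34:18 [context aid="149683FC-8DF5-1004-E1A8-00000A000152" '
    'provider="example.com" rid="1:123"][transit client="172.16.1.82"] '
    'User authentication successful for 1:123'
)

# One [name key="value" ...] block. Simplified: no escaped quotes in values.
SD_ELEMENT = re.compile(r'\[(\w+)((?:\s+\w+="[^"]*")*)\]')
# One key="value" pair inside a block.
SD_PARAM = re.compile(r'(\w+)="([^"]*)"')


def parse_structured_data(line):
    """Return {element_name: {param: value}} for each bracketed block."""
    return {
        name: dict(SD_PARAM.findall(params))
        for name, params in SD_ELEMENT.findall(line)
    }


fields = parse_structured_data(LINE)
print(fields["context"]["aid"])
print(fields["transit"]["client"])
```

Even after parsing, the analyst still has to know what "rid" and "1:123" mean, which the line itself does not say.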



However!



I also happen to think that there are significant differences between logging from/at cloud computing platforms (whether SaaS, PaaS or IaaS) and BOTH traditional system logging AND distributed application logging. Cloud computing (as defined by NIST) has inherent multi-tenancy, elasticity, immediate provisioning and other fun properties not found in traditional applications and platforms, whether distributed or not. All of these happen to affect accountability, auditability and transparency (all the goals logs serve) in a number of big ways. Thus, cloud computing must change how logging is done, and it will change it. Specifically, adding a unique ID (“audit identifier which uniquely identifies an external request for activity”) to logs, in order to enable tracing of that request's activity, serves a useful purpose.
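The unique ID idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the draft's mechanism: the field names "aid" and "tenant", the helper names, and the use of a random UUID as the audit identifier are all hypothetical. Output is captured in memory so the example is self-contained.

```python
import io
import logging
import uuid


def new_audit_id():
    """Generate an identifier for one external request (assumed: random UUID)."""
    return str(uuid.uuid4()).upper()


def make_request_logger(base_logger, audit_id, tenant):
    """Wrap a logger so every record carries the same audit ID and tenant."""
    return logging.LoggerAdapter(base_logger, {"aid": audit_id, "tenant": tenant})


# Capture output in memory; a real component would ship it to a log collector.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(
    logging.Formatter("%(asctime)s aid=%(aid)s tenant=%(tenant)s %(message)s")
)
base = logging.getLogger("cloud_demo")
base.setLevel(logging.INFO)
base.addHandler(handler)
base.propagate = False

# One external request: the ID is created once and reused for every record.
aid = new_audit_id()
log = make_request_logger(base, aid, tenant="example.com")
log.info("request received")
log.info("user authentication successful")

lines = stream.getvalue().splitlines()
print("\n".join(lines))
```

Because every record for the request shares one ID, records emitted by different components (or different tenants' workloads on the same host) can be grouped per request after the fact, provided the ID is passed along with the request.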



So, we must change logging for the cloud AND we must improve logging everywhere through standards work. It will result in GOOD, USEFUL LOGS that ALSO WORK WELL IN THE CLOUD. The caveat? We need it sooner than CEE is finished and adopted on a broad scale. The “CloudLog” effort contains useful ideas that need to be implemented in future logs produced by cloud framework components, but the method chosen (an uncooked IETF draft chock-full of bad log examples) deserves mostly ridicule…

Dr Anton Chuvakin