Now, I am not some world-famous DLP analyst, but that doesn't mean I cannot have an opinion on this "searing-warm" :-) security concept: "data leak 'prevention'" or DLP (notice the quotes around prevention...)
I admit that in the past I poked fun at DLP for being "ADLP", with "A" standing for "accidental." Indeed, most of the technology approaches I've seen were "good enough" for preventing accidental leaks (e.g. an Excel sheet with SSNs being emailed to an external party by mistake) and for preventing truly idiotic "insider" attacks of the same nature. Whether they sniffed the network or used desktop agents, the tools were good enough to do the above, but not much more (or, they allowed you to do more, but only via a truly ginormous effort by your security team). And yet a kindergarten kid could bypass them in his sleep without working up a sweat ...
In other words, DLP was for keeping honest (but sloppy) people honest and keeping idiots idiotic (but a bit safer). Which is, don't get me wrong, pretty darn useful: after all, overall, employee mistakes still cause more damage than hackers (!)
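To see why this kind of tool catches mistakes but not intent, here is a minimal sketch of pattern-based detection. The SSN pattern and the sample messages are made up for illustration; this is not any particular vendor's rule.

```python
import re

# Hypothetical rule: US SSNs written as 123-45-6789 (an assumption for illustration).
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def contains_ssn(text: str) -> bool:
    """Return True if the text contains something shaped like an SSN."""
    return SSN_PATTERN.search(text) is not None

# The honest-but-sloppy case: the number goes out exactly as stored.
accidental = "Payroll sheet attached. John Doe, 123-45-6789"
# The deliberate case: the dashes are swapped for spaces before sending.
deliberate = "Payroll sheet attached. John Doe, 123 45 6789"

print(contains_ssn(accidental))  # True
print(contains_ssn(deliberate))  # False
```

The accidental email is caught; the one with a trivial formatting change sails through, and covering every such variation is where the policy-writing effort starts to pile up.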
However, whenever I heard about DLP, I always felt some deeper longing for more - maybe for a technology that CAN actually stop some clearly defined classes of malicious data theft, perpetrated by non-idiots.
What might such a technology be? Well, IMHO, it should have three things:
- Easy on the end user (=information owner) - thus no manual information tagging needed (don't you know, it's dead!)
- Easy on the tool operator (=security team) - thus no super-granular policy-writing needed (and please - spare me the regexes!)
- Effective enough to stop a malicious insider of reasonable skill over specific information channels (e.g. common network ones) - thus, some new technology for accurate detection of possibly modified documents across those channels
Tough to match? Yup, it sure is. But that's not all: I'd like it to defend against theft of structured, unstructured and structured->unstructured (e.g. database contents pasted into email!) information over just about any network channel (not device theft and not USB/portable device download - these are a different story). What's more, I think that to enable #3 above the DLP "box" needs to actually understand what the document is about and to do it in a human-like fashion (Yes, including rephrased (!) content. Yes, I am picky :-)).
The above clearly does NOT mean that the technology is not bypassable - there is always an encrypted zip file and gpg, custom encrypted network protocols, or even an emailed screenshot, etc. (not even going to device theft, USB xfers or camera phone + screenshot + MMS). It just means that it takes DLP a few big notches up from "anti-idiot defense" to blocking a malicious and dedicated non-IT employee from stealing the crown jewels.
And, if one is trying to be honest about DLP, one needs to define what is out of scope (after all, only narrowly defined problems are actually solvable in this space, not "our MagicBox 6.1 will block ALL data theft," which is absurd - if you believe that, you need your head examined).
I was pretty shocked to learn that something like this actually exists today: the next wave of DLP start-ups is about to emerge. For example, NexTierNetworks can detect information traces even in modified and heavily edited documents (I would like to try rephrasing as well; I suspect it will work!). When I saw a demo, I was pretty impressed that you can take a financial document, change a few things here and there, paste it into an email - and the system will still stop it by saying "uh-uh, this is sensitive info, no can do" :-) Mind you, this is not what current DLP vendors call "fingerprinting," since it actually uses what the document is about, i.e. it works on a - hate the word! - semantic or meaning level. So, DLP + a bit of NLP (the other NLP) = magic :-)
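To make the contrast with "fingerprinting" concrete, here is a rough sketch, assuming fingerprinting means something like comparing overlapping word windows (shingles) between a protected document and outgoing text. That is an assumption for illustration, not a description of how NexTierNetworks or any other vendor works; the sample texts and the 0.5 threshold are made up.

```python
def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(max(len(words) - k + 1, 0))}

def similarity(a: str, b: str, k: int = 3) -> float:
    """Jaccard similarity of the two shingle sets (0.0 if either is empty)."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

THRESHOLD = 0.5  # hypothetical cut-off for "this looks like the protected document"

protected = "quarterly revenue grew twelve percent while operating costs fell by three percent"
# One word changed: most shingles survive.
edited = "quarterly revenue grew twelve percent while operating costs dropped by three percent"
# Same meaning, different words: no shingles survive.
rephrased = "sales rose by about an eighth this quarter and we spent slightly less to operate"

print(similarity(protected, edited) >= THRESHOLD)     # True
print(similarity(protected, rephrased) >= THRESHOLD)  # False
```

A lightly edited copy still trips the check, but a rephrased one shares no word windows at all, which is exactly the gap that working on the meaning level would have to close.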
As a disclosure, I have to say that I just joined their Advisory Board, but, as you can guess, I joined because I am impressed (not "impressed because I joined!" :-))