While reading a post about a new book on managing and living with data, I came across some very interesting observations, the kind of thing that becomes pretty apparent as you work through systems at companies with many different sources of information and raw data. The post was an excerpt from “Winning With Data” (Wiley).
If you think about it, it’s quite true that people are in a bit of an information battle right now. When you consider data use, data exporting, data importing, data subjects, GDPR, privacy regulations, even data ownership silos in a given company, you quickly see that wrangling the data needed to create information is quite the project. One of the recent trends is walling off data that is in a company. Not from a security standpoint (though that does seem to be improving a bit), but from a “this is mine, I’m responsible for it, and you can’t have it” take-your-toys-home-and-leave-my-sandbox point of view.
This partially comes from the massive data agreements that are now in place to support things like privacy policies and the GDPR. Those agreements assign responsibility for data abuse and apply rules and management requirements that can get incredibly difficult to meet once the data is out of the control of the department that found it in the first place, so departments hold on more tightly. But it also comes from the fact that there are so many platforms in play at the moment: Cosmos DB, DynamoDB, SQL Server, and all of the other data platforms and processing systems.
People will build out a series of systems that “touch” their data, get it just the way they want it, then lock it down. Rather than share the raw information or even findings in a reusable way (because that can get really difficult to do), it’s buttoned up and controlled.
All of this, from the legal controls to the ownership to managing that information, leads to something like data deserts, or bread lines. It’s a great analogy.
Then you end up with this (from the post): “Overly delayed by the strapped data team and unable to access the data they need from the data supply chain, enterprising individual teams create their own rogue databases. These shadow data analysts pull data from all over the company and surreptitiously stuff it into database servers under their desks. The problem with the segmented data assembly line is that errors can be introduced at any single step.”
It’s very clearly a trend in the systems that are being built out. The thirst for information is beginning to outweigh the risks associated with not getting it absolutely right. Decisions have to be made, data has to be used. More information is needed.
Personally, I think it’s a key challenge facing data folks right this minute. Managing the flow of information into and through organizations is a challenge, a risk and a critical target that needs to be addressed. Everywhere you look people are busy. Busy working around the “right” thing to do with their information, and busy getting their stuff done with what bread (or data) they can find to get the answers. Any answers. The resources are all there, from Office 365’s shared data environments, to local databases and spreadsheets, to cloud databases (and data sources and data manipulation). They are easy to get ahold of; it’s not so easy to make sure things are being done correctly.
Data discovery tools are starting to appear: there are cloud-based services that can monitor files on the network for personally identifiable information, and there are tools you can use to work through your SQL Server and look for private information. But we need to be thinking about comprehensive ways of helping people address their data-lust.
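To make the idea of a discovery scan a little more concrete, here is a minimal sketch in Python. It is not how any particular product works; the pattern list, the `find_pii` and `scan_rows` names, and the sample rows are all hypothetical, and real tools use far richer detection than two regular expressions. It only shows the basic shape: walk through text values, flag anything that looks like an email address or a US Social Security number, and report where it was found.

```python
import re

# Hypothetical, deliberately simple patterns (assumptions for illustration only).
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def find_pii(text):
    """Return {pattern_name: [matches]} for every pattern that hits in text."""
    hits = {}
    for name, pattern in PII_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            hits[name] = matches
    return hits


def scan_rows(rows):
    """Scan (row_id, text) pairs and return only the rows that contain hits."""
    report = {}
    for row_id, text in rows:
        hits = find_pii(text)
        if hits:
            report[row_id] = hits
    return report


if __name__ == "__main__":
    # Made-up sample data standing in for rows pulled from a table or file.
    sample = [
        (1, "Contact jane.doe@example.com about the invoice"),
        (2, "Nothing sensitive here"),
        (3, "SSN on file: 123-45-6789"),
    ]
    for row_id, hits in scan_rows(sample).items():
        print(row_id, hits)
```

In practice the rows would come from a query against a table or from files on a share, and a hit would be a prompt for a person to go look, not a verdict.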
We’ve moved on from storing it all – accessing it all – now it’s all about devouring it all.