Does Privacy/Protection Impact Schema Design?

Many times, when you’re working on a system, you don’t have the luxury of starting from the beginning. You’re working with an existing system, updating it and bringing it forward, adding new columns or what-have-you.

In cases where you ARE working from the start, do you find that privacy is impacting data design? I don’t mean the yes-or-no question of whether to store some bit of information, but *how* you store it. So many best practices come down to “don’t save it if you don’t have to.” But right here on SSWUG, it wasn’t too long ago that “save everything” was the mantra: you just never know what you’ll be able to make better use of in the future, so squirrel it all away now and have it when you need it.

Saving everything is really not a good idea under data protection requirements, or under general best practices for that matter. It’s about mitigating exposure and controlling what someone would see if they gained unwanted access or pulled some custom report. Plus, the cornerstone of so much of the privacy work now is that the owner of the data (the individual) has control over your keeping it, both now and in the future. They can rescind that approval. This means you will need specific options and tools to drop information the owner no longer wants you to have.
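To make the “drop it when the owner asks” point concrete, here is a minimal sketch of one way a schema can be laid out so that erasure is a single delete. It assumes the identifiable fields live in their own table, separate from the transaction rows. The table names, column names and the `erase_customer` helper are all hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3

def create_schema(conn):
    # Hypothetical layout: identifiable fields are isolated in customer_pii,
    # while orders carry only an opaque customer_id.
    conn.executescript("""
        CREATE TABLE customer_pii (
            customer_id INTEGER PRIMARY KEY,
            full_name   TEXT,
            email       TEXT
        );
        CREATE TABLE orders (
            order_id    INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL,
            total_cents INTEGER NOT NULL
        );
    """)

def erase_customer(conn, customer_id):
    """Delete the identifiable row; order history stays, but without a name or email."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "DELETE FROM customer_pii WHERE customer_id = ?", (customer_id,)
        )
    return cur.rowcount

conn = sqlite3.connect(":memory:")
create_schema(conn)
conn.execute("INSERT INTO customer_pii VALUES (1, 'Pat Example', 'pat@example.com')")
conn.execute("INSERT INTO orders VALUES (100, 1, 2599)")
conn.commit()

erased = erase_customer(conn, 1)
remaining_pii = conn.execute("SELECT COUNT(*) FROM customer_pii").fetchone()[0]
remaining_orders = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

Whether a leftover opaque ID on the order rows is acceptable depends on your own legal and business requirements; the sketch only shows that the schema shape decides how hard the erasure is.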

This goes for companies, purchases, interactions and all sorts of things like that as well. Basically, any time you’re saving information about your customers and it’s identifiable (and really, what good is it at a micro level if it’s not?), you’ll be managing these types of controls.

We find that we’re making decisions about schemas differently. We’re also retrofitting schemas pretty actively: where we used to make certain we stored it all, we now dump data as soon as possible. The actual schema design process is changing too. Do you need that full ID number (whether it’s a credit card or some other sort of ID), or can you store just the last 4 for confirmation in the future? Do you ship products? If not, perhaps even the address is up in the air. How do you support “remember this information for next time” type solutions for your customers?
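As a rough illustration of the “last 4” idea, the sketch below reduces an ID number to its final four digits before anything is written to storage, then uses those digits for a later confirmation. The function names are made up for this example, and the sample number is a dummy value.

```python
def last_four(id_number: str) -> str:
    """Keep only the final four digits; spaces and dashes are ignored."""
    digits = [ch for ch in id_number if ch.isdigit()]
    if len(digits) < 4:
        raise ValueError("expected at least four digits")
    return "".join(digits[-4:])

def matches(stored_last_four: str, presented_number: str) -> bool:
    """Confirm a number presented later against the stored last four."""
    return last_four(presented_number) == stored_last_four

# Only this short value would go into the table, never the full number.
stored = last_four("4111 1111 1111 1234")
```

The column that holds `stored` can then be a fixed four-character field, which is a small example of privacy shaping the data type itself.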

It’s becoming very complicated indeed. I suspect the customer will take somewhat of a usability hit, and that this will become OK. The “saved account” stuff will be one of the first and easiest items to go, and from there we can figure out what sales or other transaction information needs to be stored… and how. We find ourselves recommending limits on things like ID numbers, address information (if not needed) and anything personally identifiable (which is a shockingly hard thing to define on a specific business basis). All of these are managed and thought about very carefully in terms of data types, storage mechanisms, and protections. Add to that the customer-centric controls and things get more and more challenging. How do you store a customer transaction for a purchase without having personally identifiable information, but still allow for verified returns?
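One possible answer to the returns question, offered as a sketch and not as what we or anyone else necessarily runs: store a keyed hash of the order number plus something the customer can present again later (an email address here), instead of storing the email itself. The key, the function names and the choice of email as the verifier are all assumptions for illustration.

```python
import hashlib
import hmac

def return_token(secret_key: bytes, order_id: int, email: str) -> str:
    """Keyed hash stored with the order in place of the email address."""
    message = f"{order_id}:{email.strip().lower()}".encode("utf-8")
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()

def verify_return(secret_key: bytes, order_id: int, email: str, stored_token: str) -> bool:
    """Recompute the token from what the customer presents and compare."""
    candidate = return_token(secret_key, order_id, email)
    return hmac.compare_digest(candidate, stored_token)

# Hypothetical key for the demo; a real one would be kept outside the database.
SECRET = b"demo-key-not-for-production"

token = return_token(SECRET, 100, "pat@example.com")          # stored on the order row
ok = verify_return(SECRET, 100, " Pat@Example.com ", token)   # same customer, messy input
bad = verify_return(SECRET, 100, "someone@example.com", token)
```

The order row then holds a token that means nothing on its own, yet the customer can still prove the purchase was theirs. Anyone holding the key could test guessed email addresses against a token, so the result is pseudonymous, not anonymous, and the key needs the same care as the data it replaced.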

So the answer, in our case, is yes.  In our systems and those of people we’re working with, we are changing the schema, changing the approach, changing the recommendations and changing the customer-facing tools as the model of data ownership flips around.  It certainly keeps things interesting.