Yes. I’ve actually heard that several times lately. We’ve been talking about recovery plans, backups, the “oh boy, Steve’s preaching test your backups again” documentation and so on. As we talk through the things they need to test, have experience with and document, they get this shifting-around, anxious-to-move-on attitude.
“But it’s the cloud. It’s not like it’s going to crash like the old days, Steve.” They like to emphasize “old days” for me just to poke at me.
I ask them to explain, let them talk a bit and then congratulate them on having perfect data, perfect developers, perfect users and perfect security precautions in place. Usually, that’s answered with a blank stare.
See, backups aren’t just because your server crashed. I grant you that up-times and accessibility are running high-9’s these days. And, if it’s down, the reality is that it’ll likely be back up before you could do a lot with it to recover anyway. All of that is quite true.
But then we get into why we need backups. Beyond the outright failure of systems… we get into data issues, programming issues, access issues, even some compliance requirements. Some are recovery from a specific issue, some are more broad. Some are covered by the recovery tools your provider gives you, some are less so. You need to know where those lines are and where each of the different options you have applies. If, when something happens, you’re left to figure out the options and limitations of each, you’ll be wasting some very valuable and high-stress time that could have been spent implementing the steps needed to get the job done.
There’s a great post over on MSSQLTips – it speaks to this very thing. (Here’s a link to the post – check it out) They’re talking about understanding the backup/restore models for SQL Server on Amazon RDS. Yes, features and capabilities will update, things will change, but you’ll never know if you’re not out there, doing it, understanding it, revising your plan for when something goes wrong. Some unexpectedly rogue operation updates too many items and the world stops, stares at you for the cure and waits… Just look at the various options even under RDS – and there will be similar options for Azure and other providers and database platforms.
Each of the options brings you to a potentially different point – a different data set. You need to understand these. Test them. Document your recovery process.
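To make the “different point, different data set” idea concrete, here is a minimal sketch. It assumes a made-up timeline (nightly snapshots at 02:00, a bad update at 14:30) and two generic restore styles; the function names and the one-minute margin are illustrative and not taken from any provider’s API.

```python
from datetime import datetime, timedelta


def latest_snapshot_before(snapshots, incident):
    """Return the most recent snapshot taken before the incident, or None."""
    earlier = [s for s in snapshots if s < incident]
    return max(earlier) if earlier else None


def point_in_time_target(incident, margin=timedelta(minutes=1)):
    """Return a restore target just before the incident (margin is an assumption)."""
    return incident - margin


# Hypothetical timeline: nightly snapshots at 02:00, bad update at 14:30.
snapshots = [datetime(2024, 12, 18, 2, 0), datetime(2024, 12, 19, 2, 0)]
incident = datetime(2024, 12, 19, 14, 30)

snap = latest_snapshot_before(snapshots, incident)
pitr = point_in_time_target(incident)

# Each option lands on a different data set: compare how much work is lost.
print("Snapshot restore loses:     ", incident - snap)   # 12:30:00
print("Point-in-time restore loses:", incident - pitr)   # 0:01:00
```

Same incident, two very different answers. That gap is what you want to have measured and written down before you need it.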
I started this whole bit of a rant thinking “the holidays are here, time for the yearly reminders that we have to have instructions and processes in place if we want to have a holiday and not just be on-call for handle-able things.” So, yeah. That.
Write up the decision points (including, yes, when to call you) and the steps to take. Use the tools. Know that “if x happens, then y is needed to recover, and here’s what ‘recover’ really looks like.”
Even if you’re on the magical cloud that never goes down. Data and information recovery is your goal for anything from an “oops, I updated the wrong percentages in that table and it cascaded to a bunch of other stuff” to a “holy cow, what just happened to that database from that (shhhhh) security hit we just recovered from?”
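One way to write down those “if x happens, then y” decision points is as a small lookup that anyone on call can read. This is only a sketch: the scenario names, recovery steps and the `call_owner` flag are hypothetical placeholders, and your real entries come from what you have actually tested.

```python
from dataclasses import dataclass


@dataclass
class Playbook:
    recover_with: str          # the "y" that is needed to recover
    recovered_means: str       # what "recovered" really looks like
    call_owner: bool = False   # whether this is a "call Steve" situation


# Hypothetical entries; replace with steps you have tested and documented.
RUNBOOK = {
    "bad_update": Playbook(
        recover_with="point-in-time restore to a new instance, copy affected rows back",
        recovered_means="affected tables match their state just before the update",
    ),
    "security_incident": Playbook(
        recover_with="restore from the last backup known to predate the incident",
        recovered_means="data verified against pre-incident state, access re-secured",
        call_owner=True,
    ),
}


def plan_for(scenario):
    """Look up the documented plan; anything undocumented escalates by default."""
    playbook = RUNBOOK.get(scenario)
    if playbook is None:
        return Playbook("no documented procedure", "unknown", call_owner=True)
    return playbook


print(plan_for("bad_update").recover_with)
print(plan_for("disk_gremlins").call_owner)  # True: undocumented means escalate
```

The default-to-escalate branch is the design choice that matters here: the fewer scenarios that fall through to it, the quieter the holiday.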
You will NEVER be sorry if you know the answers. You may (some would say should) lose your job if you don’t.