Many times when folks are planning for recovery of a system, it’s because they anticipate that the system will crash or will be unavailable in a fairly traditional way. (Is there a traditional way for a system to crash???)
However, as I was reading a post built on the premise that the “Majority of banks [are] still vulnerable to cyber attacks,” and what that means for companies, it was pretty clear that the collective plan for recovery needs a new spit-shine: fresh consideration of what the goals are, how things will need to be done, and what to expect.
To start, it’s been said before and bears repeating: transparency and communication are fundamental to your sanity. And, of course, to the sanity of your user base. Keeping people up to date (on a regular schedule, not a constant flow of information that would get in the way) keeps your user base happier and keeps the phones from ringing at a time when your attention must be on recovery. It’s not transparency down to the level of statements and operations, but rather a schedule, some milestones, and some indication that you’re aware of the problem and that it’s being addressed.
If you’ve ever been the victim of a downed system, NOT knowing whether THEY know what’s going on, and whether they have a plan, is one of the most challenging places to be. Don’t be that person.
The new challenge, though, is providing for recovery that may NOT mean “up to the last transaction.” In other words, you may need options when it comes time to bring things back to life. If the issue wasn’t discovered immediately, there’s a real possibility that you’ll be looking at recovering back to a specific point in time.
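To make that concrete, here’s a minimal sketch of the first decision you’d face, assuming you have a catalog of restore points and at least a rough estimate of when the trouble started. Every name and timestamp below is hypothetical, not a prescription for any particular backup tool:

```python
from datetime import datetime

# Hypothetical restore points (backups, snapshots, log marks); in a real
# environment these would come from your backup catalog.
restore_points = [
    datetime(2015, 6, 1, 2, 0),
    datetime(2015, 6, 2, 2, 0),
    datetime(2015, 6, 3, 2, 0),
    datetime(2015, 6, 4, 2, 0),
]

# Best estimate of when the attack or bad data began.
suspected_compromise = datetime(2015, 6, 3, 14, 30)

# Point-in-time target: the newest restore point that predates the compromise.
clean_candidates = [p for p in restore_points if p < suspected_compromise]
if not clean_candidates:
    raise RuntimeError("No clean restore point available -- escalate.")
restore_target = max(clean_candidates)

print(f"Restore to {restore_target}; everything after it must be rebuilt or re-entered.")
```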
Then you’ll need to validate that information.
Then you’ll want to have a plan to get things back online once again.
Then you’ll need a plan to bring things current, if at all possible. It may be as “simple” as having to re-enter data, or it could be a complex process of rebuilding data systems from the data flows and other information that were used to create the data in the first place. There may also be end-customer notifications and extended downtime as systems are brought back up to speed.
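As one illustration of what “bringing things current” can look like when the original inputs still exist upstream, here’s a sketch in Python. The feed, the field names, and the apply_record routine are all assumptions about your environment, not a particular product’s API:

```python
from datetime import datetime
from typing import Callable, Iterable

def replay_missing_records(feed: Iterable[dict],
                           restore_target: datetime,
                           existing_ids: set,
                           apply_record: Callable[[dict], None]) -> int:
    """Re-apply source-feed records created after the restore point.

    `feed` stands in for whatever upstream system still holds the original
    inputs (message-queue archive, partner extracts, flat files), and
    `apply_record` for your own load routine -- both are assumptions about
    your environment. `existing_ids` lets us skip anything the restored
    database already contains.
    """
    replayed = 0
    for record in feed:
        if record["created_at"] <= restore_target:
            continue          # already covered by the restored backup
        if record["id"] in existing_ids:
            continue          # survived the restore; don't double-load it
        apply_record(record)  # re-enter it through normal business logic
        replayed += 1
    return replayed
```

The design choice worth noting is pushing replayed records through the same business logic that created the data the first time around, so validations (and any customer notifications) fire just as they originally would have.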
These aren’t the type of recoveries that fare well with a cluster failover or a server replacement. These are recoveries where you have to take the time to figure out the last known good data, then figure out how to get back to it with the systems you have in place. From there, you should be able to determine the recovery steps that get you back to “now.” If you can.
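If you don’t know exactly when the bad data crept in, “last known good” usually has to be discovered rather than looked up. A rough sketch of that hunt, with the restore and validation routines left as assumptions about your own environment:

```python
def find_last_known_good(restore_points, restore_to, passes_validation):
    """Walk restore points from newest to oldest until one checks out.

    `restore_to` stands in for your actual restore procedure (ideally into
    a staging environment) and `passes_validation` for whatever independent
    check you trust: control totals, reconciliation against a partner feed,
    row counts. Both are assumptions here, not a real API.
    """
    for point in sorted(restore_points, reverse=True):
        staging_copy = restore_to(point)       # restore the candidate
        if passes_validation(staging_copy):    # compare against a trusted source
            return point                       # newest point with clean data
    return None                                # nothing validates -- escalate
```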
It’s a valid point, I suspect, that many companies don’t have a plan in place for this type of multi-faceted recovery. Most plans stop at “here’s how you restore” or “here’s how failover works.” But cyber attacks, and the fact that they may go undetected for a while, can really throw a monkey wrench into that process.