Backups and disaster recovery often get treated as the same thing. They are not.
A backup is a copy of data. Disaster recovery is the plan for getting the entire business operational again when something serious goes wrong. Having backups without a recovery plan is like having a spare tyre with no jack. The raw material is there, but the ability to use it under pressure is not.
Most businesses discover the difference during an actual incident, which is the worst possible time to learn it.
What disaster recovery actually means
Disaster recovery is the documented, tested process for restoring business operations after a significant failure. That failure might be a server crash, a ransomware attack, a cloud provider outage, a fire, or a critical software failure. The specifics vary. What does not vary is the need to answer two questions in advance:
How much data can the business afford to lose? This is called the Recovery Point Objective, or RPO. If backups run daily, the worst case is losing a full day of data. If backups run every hour, the worst case is one hour. For some businesses, losing a day of transactions is acceptable. For others, losing five minutes is not.
How long can the business be offline? This is the Recovery Time Objective, or RTO. If the website goes down, how many hours can the business tolerate before it is operational again? The answer determines how much infrastructure and preparation is needed. An RTO of 24 hours requires a very different setup to an RTO of one hour.
These two numbers drive every decision about disaster recovery. They determine how frequently backups run, where they are stored, what infrastructure stands ready to take over, and how much the whole thing costs. Most businesses have never defined them explicitly, which means recovery expectations are based on assumptions that have never been tested.
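The RPO side of that trade-off can be checked mechanically: the worst-case data loss is simply the interval between backups. A minimal sketch, with illustrative numbers that are not from any real system:

```python
from dataclasses import dataclass

# Illustrative sketch: given an RPO, check whether a backup schedule can
# meet it. The worst case occurs when a failure strikes just before the
# next backup runs, so the maximum data loss equals the backup interval.

@dataclass
class RecoveryObjectives:
    rpo_hours: float  # maximum tolerable data loss
    rto_hours: float  # maximum tolerable downtime

def meets_rpo(backup_interval_hours: float, objectives: RecoveryObjectives) -> bool:
    return backup_interval_hours <= objectives.rpo_hours

objectives = RecoveryObjectives(rpo_hours=4, rto_hours=8)
print(meets_rpo(24, objectives))  # daily backups against a 4-hour RPO: False
print(meets_rpo(1, objectives))   # hourly backups: True
```

The RTO side cannot be reduced to arithmetic in the same way, because it depends on how long the recovery procedures actually take, which is exactly what testing reveals.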
Why backups alone are not enough
Backups protect data. They do not protect the ability to operate.
Consider a business running a web application on a single cloud server with a database. The database is backed up nightly to a separate storage location. That covers the data. But if the server itself fails, restoring the backup is only one piece of the recovery. Someone also needs to:
- Provision a new server with the correct specifications
- Install and configure the operating system and application software
- Restore the database from backup
- Configure networking, DNS, and security rules
- Verify the application works correctly with the restored data
- Update any integrations or third-party services that point to the old server
Without documentation and preparation, this process can take days. With a tested disaster recovery plan, it can take hours or less. The data was always safe. The ability to use it was not.
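The steps above amount to an ordered runbook, and a runbook can be expressed as code so that it is versioned, reviewed, and testable like anything else. A minimal sketch, in which every step name and stub is purely hypothetical:

```python
# Hypothetical runbook sketch: each step is a function that returns True
# on success, and recovery halts at the first failure rather than
# continuing into an inconsistent state. Real steps would call cloud
# APIs or configuration management tools; these stubs only illustrate
# the ordering.

def provision_server():      return True
def install_software():      return True
def restore_database():      return True
def configure_networking():  return True
def verify_application():    return True
def update_integrations():   return True

RECOVERY_STEPS = [
    ("provision server", provision_server),
    ("install and configure software", install_software),
    ("restore database from backup", restore_database),
    ("configure networking, DNS, security rules", configure_networking),
    ("verify application against restored data", verify_application),
    ("repoint integrations and third-party services", update_integrations),
]

def run_recovery(steps):
    completed = []
    for name, step in steps:
        if not step():
            raise RuntimeError(f"recovery halted at: {name}")
        completed.append(name)
    return completed

print(run_recovery(RECOVERY_STEPS))
```

Even this toy version captures the point: the order of operations is explicit, and a failure stops the sequence instead of leaving the environment half-built.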
The same applies to more complex environments. An application spread across multiple servers, databases, queues, and third-party integrations has dependencies that are not obvious until something breaks. Restoring individual backups without understanding the order of operations, the configuration requirements, and the integration points produces an environment that does not work.
Common disaster scenarios
Disaster recovery is not only for dramatic events. The most common scenarios are mundane.
A server stops working. Cloud instances can fail without warning, and if the application runs on a single instance with no failover, that failure means downtime until a replacement is provisioned and configured. Ransomware is another common scenario: an attacker encrypts files and demands payment, and if backups are reachable from the same network, they may be encrypted too.
Accidental deletion happens more often than anyone admits. Someone removes a production database, a critical configuration, or an entire cloud resource. Recovery depends on backups existing for that specific resource and someone knowing how to restore it.
Cloud providers experience regional outages several times a year. If the entire environment runs in one region with no cross-region failover, the business is offline until the provider fixes it, with no control over the timeline. And a bad deployment can introduce a bug that corrupts data or makes the application unusable. Rolling back the code is only part of the fix if the data was corrupted too.
Each scenario has a different recovery path. A disaster recovery plan documents what to do for each one, who is responsible, and how long it should take.
What a disaster recovery plan looks like
A practical disaster recovery plan does not need to be a hundred-page document. It needs to cover six things.
Start with what is being protected. List every system, database, and service the business depends on, ranked by importance. The customer-facing application and the accounting system are probably critical. The internal wiki is probably not. Then define recovery priorities, because not everything can be restored at the same time. Revenue-generating systems and customer-facing services come first.
Recovery procedures are where most plans fall short. These are the step-by-step instructions for restoring each critical system, detailed enough that someone who did not build the original environment can follow them. If the procedures are missing, outdated, or assumed to be obvious, the plan has a hole in it.
Responsibilities matter too. Who does what during an incident? Who makes the call to activate the plan, who handles communication with customers, who performs the technical recovery? Define these roles before an incident, not during one.
Include a communication plan. Silence during an outage damages trust more than the outage itself. And finally, set a testing schedule. A plan that has never been tested is a theory, not a plan. Regular testing, at least annually, reveals gaps and keeps procedures current. It does not need to simulate a full disaster. Restoring a database from backup and verifying the data covers a lot of the risk.
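A restore test of that kind can be a short script. The sketch below uses SQLite purely as a stand-in for a production database; a real environment would use the database's own dump and restore tools, but the shape of the test is the same: back up, restore to a fresh location, verify.

```python
import os
import sqlite3
import tempfile

# Sketch of a backup restore test using SQLite as a stand-in: dump the
# source database, restore the dump into a fresh file, then verify the
# restored copy matches the original.

def backup_db(src_path: str, dump_path: str) -> None:
    conn = sqlite3.connect(src_path)
    with open(dump_path, "w") as f:
        f.write("\n".join(conn.iterdump()))
    conn.close()

def restore_db(dump_path: str, dest_path: str) -> None:
    conn = sqlite3.connect(dest_path)
    with open(dump_path) as f:
        conn.executescript(f.read())
    conn.close()

def row_count(path: str, table: str) -> int:
    conn = sqlite3.connect(path)
    n = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    conn.close()
    return n

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "live.db")
    dump = os.path.join(tmp, "backup.sql")
    restored = os.path.join(tmp, "restored.db")

    # Simulate a live database with a handful of orders.
    conn = sqlite3.connect(src)
    conn.execute("CREATE TABLE orders (id INTEGER, total REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)",
                     [(1, 9.99), (2, 24.50), (3, 7.25)])
    conn.commit()
    conn.close()

    backup_db(src, dump)
    restore_db(dump, restored)
    assert row_count(restored, "orders") == row_count(src, "orders")
    print("restore verified: row counts match")
```

Row counts are a crude check; a fuller test would compare checksums or spot-check critical records. But even the crude version catches the most common failure, which is a backup that cannot be restored at all.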
What this costs
Disaster recovery costs exist on a spectrum, and the right level of investment depends on the business.
At the low end, a small business with modest recovery requirements can implement a basic plan for very little: documented procedures, tested backups, and a clear process for rebuilding the environment. The cost is primarily engineering time, typically a few days to document, configure, and test.
At the higher end, businesses that need rapid recovery invest in standby infrastructure. A secondary environment in another region, ready to take over within minutes if the primary fails. Database replication running continuously so that data loss is measured in seconds rather than hours. Automated failover that switches traffic without human intervention. This infrastructure has ongoing costs, but for businesses where downtime is measured in thousands of pounds per hour, the investment is straightforward to justify.
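Automated failover, at its simplest, is a health check plus a decision rule. A minimal sketch with hypothetical URLs and thresholds; a real setup would move traffic via DNS updates or a load balancer rather than returning a string:

```python
import urllib.request

# Sketch of the decision logic behind automated failover. The URLs are
# hypothetical. Requiring several consecutive failures before failing
# over prevents a single dropped health check from bouncing traffic
# back and forth.

PRIMARY = "https://primary.example.com/health"
STANDBY = "https://standby.example.com/health"

def healthy(url: str, timeout: float = 2.0) -> bool:
    """Treat any connection error or non-200 response as unhealthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False

def choose_target(consecutive_failures: int, threshold: int = 3) -> str:
    return STANDBY if consecutive_failures >= threshold else PRIMARY

print(choose_target(0))  # still on the primary
print(choose_target(3))  # failed over to the standby
```

The hard parts in production are not this logic but everything around it: keeping the standby's data current via replication, and making sure failing back is as safe as failing over.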
Most businesses sit somewhere in between. They do not need instant failover, but they do need confidence that recovery is possible within a defined timeframe. That middle ground, a tested plan with proper backups and documented procedures, is achievable for any business willing to invest a few days of focused effort.
Questions to ask
These questions reveal how prepared the business currently is:
- If our primary system failed right now, how long would it take to restore service?
- Have we ever tested restoring from our backups? Did it work?
- Are our backups stored somewhere that a ransomware attack could not reach?
- Who knows how to rebuild our environment, and what happens if that person is unavailable?
- Do we have written recovery procedures, and when were they last updated?
If the answers are uncertain, that is the gap disaster recovery planning fills. The cost of planning is predictable and manageable. The cost of discovering the gaps during an actual incident is neither.
If this is an area that needs attention, get in touch. It does not take long to assess the current state and work out where the gaps are.