“Prepare for the worst and hope for the best.” We’re all pretty good at taking care of the second part of the idiom, but when it comes to an organization’s digital infrastructure and data, the first should never be neglected. Even short periods of downtime can cause substantial disruptions to customer & user experience, sales and revenue, not to mention reputation damage. There will always be the possibility of things going wrong, but having a robust disaster recovery and data backup plan can minimize data loss and downtime and ensure business continuity.
RPO and RTO are two key metrics you’ll want to define for each of your applications and systems if you want to establish a proper disaster recovery procedure.
RPO vs. RTO – what’s the difference?
RPO (recovery point objective)
RPO stipulates the acceptable amount of time between the last backup and the resource outage; in other words, the tolerable amount of data loss. When recovering your data, you want your backup to be as up-to-date as possible. However, the more often you back up, the more storage and resources you will require, which will drive up costs. In a disaster recovery plan, the RPO essentially determines how frequently data needs to be backed up.
The narrower the time frame between the last backup and the outage, the more airtight (but expensive) your disaster recovery plan will be.
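To make the link between backup frequency and RPO concrete, here is a minimal Python sketch. The function names, the 2-hour RPO and the timestamps are illustrative assumptions, not part of any particular backup tool. It assumes the worst case, where an outage strikes just before the next scheduled backup, so the potential data loss approaches the full backup interval.

```python
from datetime import datetime, timedelta


def data_loss_window(last_backup: datetime, outage: datetime) -> timedelta:
    """Time between the last successful backup and the outage.

    Anything written during this window is lost on restore.
    """
    if outage < last_backup:
        raise ValueError("outage cannot precede the last backup")
    return outage - last_backup


def schedule_meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """Worst case, the loss approaches one full backup interval,
    so the interval itself must not exceed the RPO."""
    return backup_interval <= rpo


# Hypothetical figures for illustration only.
rpo = timedelta(hours=2)
last_backup = datetime(2024, 1, 1, 12, 0)
outage = datetime(2024, 1, 1, 13, 30)

loss = data_loss_window(last_backup, outage)
print(loss)                                         # 1:30:00
print(loss <= rpo)                                  # True
print(schedule_meets_rpo(timedelta(hours=4), rpo))  # False
```

In this example a four-hourly backup schedule cannot satisfy a 2-hour RPO, even though this particular outage happened to fall within the objective.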
RTO (recovery time objective)
RTO is straightforward: how long should it take you to restore normal business or service operation following a disruption to its functionality? This is usually a matter of hours.
Optimizing your disaster recovery configuration
In a perfect world, RPOs & RTOs are close to zero and restoration of normal operations is quick. However, the costs involved make this unrealistic: the shorter the objectives, the higher the cost. Besides, the main goal of disaster recovery is business continuity, not speed, so consider several factors when deciding what RPO & RTO you can afford:
- Maximum tolerable data loss for your specific organization
- The sensitivity of your business’ information (some types of records may not need frequent backup)
- Type of data storage (format, private vs. cloud, etc.), which can affect the speed of recovery
- The overall cost of implementing data recovery resources
Tiering availability across an organization
The nature of your business and the technology it requires will determine the importance of having shorter objectives. For example, if you run a brick-and-mortar store, you could probably tolerate a few hours of system downtime with minimal detriment to business continuity. High-traffic, revenue-generating websites, however, will need a much quicker turnaround.
Furthermore, importance will vary across specific applications or services, which is why many organizations create a tiering system for their stack in order to govern adherence to their vendors’ availability SLAs (service-level agreements).
Systems can be split across a few different tiers according to their criticality, or how essential the data is for the organization’s success. A tiering model might look like the following:
- Tier 1 – Mission-critical applications – RPO: near-zero, RTO: 1 hour
- Tier 2 – Business-critical applications – RPO: 2 hours, RTO: 4 hours
- Tier 3 – Non-critical applications – RPO: 2 days, RTO: 1 week
Payroll, transactional and customer-facing systems, for example, are often categorised as Tier 1, whereas records such as redundancy or parental leave information, which are often legally required to remain on record for several years or more, are non-critical and can be categorised in the lowest tier.
These tiers would be established with a business impact analysis (BIA), assessing which services are most essential to the business and its operations.
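As a rough illustration, the example tiering model above could be captured in a small lookup structure like the following Python sketch. The `Tier` class, the `SYSTEM_TIER` mapping and the system names are hypothetical; a real assignment would come out of your own BIA. The "near-zero" RPO for Tier 1 is modelled as zero purely for simplicity.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Tier:
    name: str
    rpo: timedelta  # tolerable data loss
    rto: timedelta  # tolerable time to restore service


# Objectives taken from the example tiering model in the text.
# Assumption: "near-zero" RPO is represented as timedelta(0).
TIERS = {
    1: Tier("Mission-critical", timedelta(0), timedelta(hours=1)),
    2: Tier("Business-critical", timedelta(hours=2), timedelta(hours=4)),
    3: Tier("Non-critical", timedelta(days=2), timedelta(weeks=1)),
}

# Hypothetical system-to-tier assignment, as a BIA might produce.
SYSTEM_TIER = {
    "payroll": 1,
    "checkout": 1,
    "parental-leave-records": 3,
}


def objectives_for(system: str) -> Tier:
    """Look up the RPO/RTO objectives that apply to a system."""
    return TIERS[SYSTEM_TIER[system]]


print(objectives_for("payroll").rto)                 # 1:00:00
print(objectives_for("parental-leave-records").rpo)  # 2 days, 0:00:00
```

Keeping the objectives in one place like this makes it easier to check each system's backup schedule and recovery tests against the tier it belongs to.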
Establishing reliable RPOs & RTOs
In practice, there will be a difference between the objective figures and the actuals (Recovery Time Actual and Recovery Point Actual), but the objectives can be reliably set with regular disaster event rehearsal. Too many businesses or providers don’t back their data up frequently enough, and even when they do, the backups aren’t tested as rigorously as they should be. Make sure your service provider is both testing their data backup capabilities and able to commit to RTOs and RPOs.
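One simple way to use rehearsal results is to compare the measured actuals against the objectives after each test. The sketch below is a minimal Python example under assumed figures: the `evaluate_rehearsal` function and the rehearsal numbers are hypothetical, and the objectives are borrowed from the Tier 2 example earlier in the article.

```python
from datetime import timedelta


def evaluate_rehearsal(
    rto: timedelta, rpo: timedelta, rta: timedelta, rpa: timedelta
) -> dict:
    """Compare Recovery Time/Point Actuals from a rehearsal with the objectives.

    A negative margin means the objective was missed by that amount.
    """
    return {
        "rto_met": rta <= rto,
        "rpo_met": rpa <= rpo,
        "rto_margin": rto - rta,
        "rpo_margin": rpo - rpa,
    }


# Hypothetical rehearsal: restore took 5 hours, 90 minutes of data were lost.
result = evaluate_rehearsal(
    rto=timedelta(hours=4),
    rpo=timedelta(hours=2),
    rta=timedelta(hours=5),
    rpa=timedelta(minutes=90),
)
print(result["rto_met"])  # False
print(result["rpo_met"])  # True
```

Here the backup frequency is adequate, but the restore process is too slow for the stated RTO, which suggests either speeding up recovery or revisiting the objective.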
Altis Cloud is the WordPress hosting platform for enterprise-scale websites. All plans come with RTO and RPO commitments specified, along with compliance and testing reports for your IT teams. We can also perform custom DR tests and reports in line with your own DR testing program. Book a demo with us and we’ll show you around the platform, or get in contact to receive pricing information.