Recovery time objectives (RTOs) and recovery point objectives (RPOs) help enterprises define how long critical software can be down and how much data they can afford to lose between backups. They’re useful tools in disaster recovery (DR) and business continuity procedures, and they’re indispensable for organizations that need to closely track backup requirements.
These objectives are similar but measure different things:
- Recovery time objective (RTO) measures the duration of time an application can be down without considerably damaging business operations.
- Recovery point objective (RPO) measures the amount of data that can be lost without considerably damaging business operations or finances.
RTO and RPO assist businesses with backup and recovery management as they prioritize their data and applications to serve customers and maintain stable IT systems.
What is recovery time objective (RTO)?
An organization’s recovery time objective is the measurement of time an application can be down without causing significant damage to the business. Some applications can be down for days without significant consequences. Some high-priority applications can only be down for a few seconds without incurring customer frustration and lost business.
RTO is not simply the duration of time between loss and recovery. The objective also accounts for the steps the company’s IT team must take to restore the application and its data. If IT has invested in fail-over services for high-priority applications, then they can safely express RTO in seconds. Fail-over services automatically switch to another server or platform when one goes down. IT must still restore an on-premises environment, but since the application is processing in the cloud, IT has more time to bring on-premises technology back.
Your task is to categorize applications by priority and potential business loss, then match your resources accordingly. For example, near-zero RTOs typically require fail-over services. 4-hour RTOs allow for on-premises recovery, starting with bare-metal restore and ending with full application and data availability.
How to calculate RTO
To calculate your enterprise’s RTO, follow these guidelines:
- Determine your most critical applications and systems. To decide which these are, ask the following questions:
  - Which user-facing applications need to be constantly available for our customers?
  - Which applications are most critical for our revenue operations to function? (Examples include CRM systems and self-service client platforms.)
  - Which applications form the front line of security in the business? (Examples include a next-generation firewall (NGFW) and an extended detection and response (XDR) solution.)
  - Which data stores supply the company’s most critical applications? (An example is the all-flash arrays that hold customer data for a CRM, like Salesforce.)
- When a critical application goes down, how much time passes before it negatively impacts a client’s experience? (This can be gauged by taking customer surveys and running IT user tests.)
- When a critical application goes down, how long does it take to be restored?
- If a critical storage system goes down or loses data, how long does it take to restore that data from backups?
Note that RTOs will likely differ for different applications. Also, keep in mind that even non-customer-facing critical applications may still need rapid restoration, if they provide crucial IT or security services to the business.
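The questions above come down to two numbers per application: how long the business can tolerate an outage, and how long a restore actually takes. The following is a minimal sketch in Python, not a definitive method. The application names, durations, and the `AppRecovery` and `rto_gaps` names are all hypothetical, chosen only to show how a team might flag applications whose measured restore time exceeds the tolerable downtime.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class AppRecovery:
    name: str
    tolerable_downtime: timedelta  # how long before customers or operations are hurt
    measured_restore: timedelta    # how long the most recent restore test took


def rto_gaps(apps):
    """Return (name, shortfall) for apps whose restore takes longer than the business can tolerate."""
    return [
        (app.name, app.measured_restore - app.tolerable_downtime)
        for app in apps
        if app.measured_restore > app.tolerable_downtime
    ]


# Hypothetical figures for illustration only.
apps = [
    AppRecovery("crm", timedelta(minutes=15), timedelta(hours=2)),
    AppRecovery("internal-wiki", timedelta(days=2), timedelta(hours=6)),
]

for name, shortfall in rto_gaps(apps):
    print(f"{name}: restore exceeds tolerable downtime by {shortfall}")
```

An application that appears in this list either needs a longer RTO that the business accepts or a faster recovery method, such as a fail-over service.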
4 ways to achieve RTO
To achieve your organization’s desired RTO, follow these guidelines:
- Set realistic expectations up front. If your RTO seems to be unreachable, you may need to:
  - Hire more personnel. More hands on deck may be necessary to bring a critical application back online.
  - Invest in better backup software. If you’re using a legacy backup solution that isn’t meeting timelines, your organization may need to implement a modern backup or disaster recovery (DR) platform.
  - Improve your critical applications. If teams consistently have trouble bringing a system back online, the code itself may require more advanced engineering.
- Configure real-time alerts. Alerts should give consistent, immediate feedback when an application is having trouble, and they should reach personnel on the appropriate platforms and devices.
- Implement a high-performance disaster recovery or business continuity plan (BCP). For some large enterprises that can afford a more advanced recovery platform, this will make a huge difference in achieving RTO expectations. Although these solutions take time and stakeholder approval to select and deploy, they’re valuable resources, especially if your organization has many mission-critical systems.
- Limit low RTOs to a select few applications. Most organizations can’t maintain very short RTOs for many systems; it’s expensive to make and store backups every few hours for every single company application.
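One way to tie alerting to an RTO is to escalate as an outage uses up the allowed downtime. The sketch below is illustrative only: the function name `rto_alert_level`, the three level names, and the 50% escalation threshold are assumptions, not part of any particular monitoring product.

```python
from datetime import datetime, timedelta


def rto_alert_level(outage_start, now, rto):
    """Classify an ongoing outage by how much of the RTO has been used up."""
    used = (now - outage_start) / rto  # timedelta / timedelta yields a float
    if used >= 1.0:
        return "breached"
    if used >= 0.5:  # assumed escalation threshold
        return "escalate"
    return "notify"


# Hypothetical outage: 40 minutes into a 1-hour RTO.
start = datetime(2024, 1, 1, 10, 0)
print(rto_alert_level(start, start + timedelta(minutes=40), timedelta(hours=1)))
```

In practice the result would feed whatever paging or chat tool the team already uses.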
What is recovery point objective (RPO)?
An organization’s recovery point objective is its loss tolerance: the amount of data that can be lost before significant harm to the business occurs. RPO is expressed as a time measurement from the loss event to the most recent preceding backup.
If you back up all or most of your data in regularly scheduled 24-hour increments, then in the worst-case scenario, you will lose 24 hours of data. For some applications this is acceptable. For others, it is absolutely not.
For example, if you have a 4-hour RPO for one application, then no more than 4 hours may pass between backups of that application’s data. Having a 4-hour RPO does not necessarily mean you will lose 4 hours of data. Should a word processing application go down at midnight and come up by 1:15 a.m., you might not have much or any data lost, because little was changing at that hour. But if a busy application goes down at 10 a.m. and isn’t restored until 2 p.m., you could lose 4 hours of highly valuable, perhaps irreplaceable data. In this case, arrange for more frequent backups that will let you hit your application-specific RPO.
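Because RPO is measured from the loss event back to the most recent backup, checking it is simple date arithmetic. The following is a minimal sketch with hypothetical timestamps. The function names `data_loss_window` and `meets_rpo` are illustrative.

```python
from datetime import datetime, timedelta


def data_loss_window(last_backup, failure_time):
    """Time between the most recent backup and the loss event."""
    if failure_time < last_backup:
        raise ValueError("failure_time must not precede last_backup")
    return failure_time - last_backup


def meets_rpo(last_backup, failure_time, rpo):
    """True if the data at risk falls within the recovery point objective."""
    return data_loss_window(last_backup, failure_time) <= rpo


rpo = timedelta(hours=4)
last_backup = datetime(2024, 1, 1, 8, 0)

# Failure 2 hours after the last backup: within the 4-hour RPO.
print(meets_rpo(last_backup, datetime(2024, 1, 1, 10, 0), rpo))
# Failure 5.5 hours after the last backup: the RPO is missed.
print(meets_rpo(last_backup, datetime(2024, 1, 1, 13, 30), rpo))
```

The worst case always equals the backup interval, which is why a 24-hour backup schedule cannot satisfy a 4-hour RPO.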
Depending on application priority, individual RPOs typically range from 24 hours down to near-zero, measured in seconds. RPOs of 8 hours or more may be able to take advantage of your existing backup solution, as long as it has minimal impact on your production systems. 4-hour RPOs will need scheduled snapshot replication, and near-zero RPOs will require continuous replication. In cases where both the RPO and RTO are near-zero, combine continuous replication with fail-over services for near-100% application and data availability.
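These tiers can be written down as a simple lookup. The sketch below is an illustration under stated assumptions: the one-minute cutoff for "near-zero" is arbitrary, and treating every RPO between near-zero and 8 hours as a snapshot-replication candidate is an interpolation of the tiers above rather than a rule from any vendor.

```python
from datetime import timedelta

NEAR_ZERO = timedelta(minutes=1)  # assumed cutoff for "near-zero"; adjust to your environment


def protection_strategy(rpo, rto):
    """Suggest a data-protection approach from an application's RPO and RTO."""
    if rpo <= NEAR_ZERO and rto <= NEAR_ZERO:
        return "continuous replication + fail-over"
    if rpo <= NEAR_ZERO:
        return "continuous replication"
    if rpo < timedelta(hours=8):  # assumption: anything under 8 hours needs snapshots
        return "scheduled snapshot replication"
    return "existing scheduled backup"


print(protection_strategy(timedelta(hours=24), timedelta(hours=24)))
print(protection_strategy(timedelta(hours=4), timedelta(hours=4)))
print(protection_strategy(timedelta(seconds=10), timedelta(seconds=10)))
```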
How to calculate RPO
To determine how much data can be lost without significant damage, follow these guidelines and choose time ranges for all your important systems.
- Run tests to determine how quickly data needs to be available for each enterprise application. These include cloud storage platforms, CRM solutions, and e-commerce applications.
- Categorize all your main enterprise applications depending on their backup restoration requirements. For example, does the data need to be restored within a few minutes, or can it wait a day?
- Calculate your company’s financial position regarding backups. For example, for how many applications can the organization afford to maintain fail-over and replication services? Does your business need to store some backups offline, on hard drives or tape?
- Choose a couple of top-priority applications that require immediate restoration. If the servers that support your main cloud platform have continuous replication, they’ll be restored rapidly. But a database of previous sales contacts may be backed up to hard drives, and it will take hours to restore. Most businesses can’t afford rapid backups for all of their software.
4 ways to achieve RPO
- Be realistic. As with RTO, your IT and business continuity teams must be realistic about recovery objectives. If you set an RPO of 6 hours for your HR platform, make sure that backups can reliably complete at least every 6 hours. Before setting RPOs, test backup rates for different fail-over and replication services, tape, hard drives, and flash arrays.
- Train staff. All personnel involved in the backup process should be trained to respond promptly when an incident or outage occurs. Each team member should know exactly what to do when a system goes down.
- Upgrade technology. Ensure that services like replication and fail-over are supported by reliable modern technology. If replication and fail-over services are hosted on an old, unreliable server, they may not work as quickly.
- Optimize the network. Similarly, optimize network performance so the business network can support data backup rates. If the network is bogged down, your organization may not be able to meet its backup times.
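A rough way to test whether the network can support a backup schedule is to estimate transfer time from data volume and link speed. The sketch below rests on assumptions: the 70% efficiency factor is a placeholder for protocol overhead and competing traffic, and the 500 GB and 1 Gbps figures are hypothetical. Measure real throughput before relying on any estimate like this.

```python
def backup_duration_hours(data_gb, link_mbps, efficiency=0.7):
    """Estimate hours needed to transfer data_gb gigabytes over a link_mbps link.

    efficiency is the assumed fraction of nominal bandwidth that is usable.
    """
    if link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_mbps must be positive and efficiency in (0, 1]")
    megabits = data_gb * 8 * 1000  # 1 GB = 8,000 megabits (decimal units)
    seconds = megabits / (link_mbps * efficiency)
    return seconds / 3600


# Hypothetical case: 500 GB of changed data over a 1 Gbps link, 4-hour backup window.
hours = backup_duration_hours(data_gb=500, link_mbps=1000)
fits_window = hours <= 4
print(f"Estimated transfer time: {hours:.1f} h, fits 4-hour window: {fits_window}")
```

If the estimate exceeds the backup window, the options are the ones listed above: upgrade the link, reduce the data sent per backup, or relax the RPO.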
Realistic examples of RTO and RPO in use
The following scenarios show how businesses can achieve their RTO and RPO goals.
Granular email recovery
A company attorney accidentally deletes a time-sensitive email, then empties the contents of the trash folder. Since Microsoft Exchange is a business-critical application for this busy company, IT continuously backs up delta-level changes in Exchange. And since their backup application is capable of granular backup and recovery, they can recover the individual message within an RTO of five minutes instead of restoring an entire VM for a single message.
E-commerce databases
A store’s self-hosted e-commerce site uses three different databases: a relational database storing the product catalog, a document database that reports historical order data, and an API database connecting to their payment processor’s gateway. The document database can reconstruct data from other databases, so its RTO and RPO are within 24 hours. The business only adds products to the relational database once a week, so RPO is not critical. But RTO is: if the database goes down, then customer transactions stop.
To keep the relational database highly available, the company invested in a fail-over service, so the database immediately spins up on virtual servers. The company replicates the few changes it makes during the week to their provider’s DR platform. The API database holds ordering information and needs both RPO and RTO in seconds. IT continuously replicates data to the fail-over site, which immediately takes over processing should the API database go down.
CRM recovery after a storm
CRM software that’s hosted on premises at a company’s main Florida office goes down when a bad storm hits. The server room is damaged. However, all data from the CRM is backed up at a data center facility in Missouri. Because the CRM platform was one of the most important applications at the company, the executive and IT teams prioritized replication and fail-over, so the data center backups were replicated just minutes before the storm hit.
The team members responsible for restoration — many of them out of state, not working directly in Florida — follow the escalation process in their disaster recovery plan immediately and are able to meet their RPO of 15 minutes.
Are there tools to help achieve RTO and RPO?
The following backup and recovery solutions provide tools to help businesses meet their RTO and RPO requirements, safeguard their data, and optimize their application performance.
Acronis offers disaster recovery as a service, with a single-interface management console and support for both physical and virtual systems. It’s available both on premises and in the cloud. Users can automate common DR scenarios. Acronis enables users to move production workloads from on-site locations to virtual machines in a cloud data center if their on-site environment fails. Acronis also offers a recovery solution for service providers, intended to help managed service providers (MSPs) offer 15-minute RTOs.
Druva offers on-premises fail back and cloud fail over for user accounts in all AWS regions. Druva CloudCache allows teams to achieve low RTOs and RPOs in environments with limited bandwidth. Druva’s cloud disaster recovery features single-click fail over and fail back, and it automatically executes teams’ run books, which detail each step of a DR plan.
Veeam Backup & Replication offers backup services for a wide variety of environments, including virtual systems and SaaS platforms. Veeam’s Disaster Recovery Orchestrator, an add-on for the Availability Suite and the Backup & Replication solution, provides DR scenario testing. Veeam also offers backup as a service and DR as a service.
Backup and DR provider Rubrik’s mass recovery product allows enterprises with large volumes of files or virtual machines to recover them quickly. Rubrik users can manage policies for frequency of snapshots and duration of retention. Rubrik’s logically air-gapped backups use zero-trust ideology to keep the backed-up data or application segregated from other environments within the infrastructure.
Bottom line: Prioritizing your enterprise’s backup and recovery goals
For businesses that want to strategically meet data backup goals, managing RTO and RPO is critical. They give organizations specific metrics to hit when developing backup and recovery strategies. Determining KPIs for backup and recovery makes heavy and overwhelming tasks more manageable.
The challenge for all organizations is prioritizing applications and deciding which need more financial investment — in other words, how many instant recovery solutions can the business afford? Knowing each system’s RTO and RPO can help them decide.
If your organization measures RTO, don’t exclude RPO, and vice versa. Both measurements reveal the company’s most critical applications and data and what’s required to keep them functioning for successful operations.