The maximum acceptable downtime after a computer system failure is determined by a company's

Recovery objectives are the foundational metric for building your disaster recovery strategy. Quantifying how much disruption your business can tolerate helps guide your evaluation of backup and recovery solutions. Building your backup and recovery strategy on your recovery objectives can give you confidence that, when disaster hits, you are ready to recover with minimal data loss and minimal impact on business processes, and to protect your business’s brand.

Why is understanding the difference between RPO and RTO critical for disaster recovery solutions?

Understanding the difference between RPO and RTO is critical when planning for disaster. Knowing the maximum amount of time your business can tolerate being offline (RTO) and how much data loss it can tolerate (RPO) helps shape your backup and recovery strategy. It answers questions such as what types of backups you should run for certain business-critical applications and how frequently those backups should take place.

What is a recovery point objective?

A recovery point objective, or RPO, is the maximum amount of data that can be lost before it causes detrimental harm to an organization. The RPO indicates the data loss tolerance of a business process or an organization in general. This data loss is often measured in terms of time, for example, 5 hours’ or 2 days’ worth of data loss. A zero RPO means that no committed data should be lost when media failure occurs, while a 24-hour RPO can tolerate a day’s worth of data loss.

How do you calculate a recovery point objective?

There are five steps to consider when you calculate your recovery point objectives:

  1. Frequency of your file updates: Your RPO needs to, at minimum, match the frequency at which your files are updated. That keeps the delta between new data and backup data small, reducing the risk of data loss.
  2. Align RPOs and Business Continuity Plans (BCP): Different parts of your business may require different RPOs based on the criticality of data. Highly critical applications that require an “always-on” approach will require more stringent RPOs, while other applications or departments may not need the same recovery objective.
  3. Consider Industry Standards: RPOs are dependent on business-critical applications. However, as a guideline, you can consider the standards for your particular industry.
    1. Zero to one hour: You use the shortest time frame for business-critical data or workloads, typically because they’re high volume, dynamic or difficult to recreate.
    2. One to four hours: Consider this time range for applications deemed semi-critical, where some data loss is acceptable.
    3. Four to 12 hours: A time frame of this length might get used for business units that update daily or less frequently.
    4. 13 to 24 hours: Longer RPO time frames suit data and business units that are important but not critical; they rarely exceed 24 hours.
  4. Establish and approve each RPO: Once the RPOs are established, they must be approved by the IT department and stakeholders. Additionally, it is important to keep clear documentation as a baseline and for your records.
  5. Analyze your RPO settings consistently: It is wise to always evaluate and optimize your RPOs. When you test RPO and evaluate performance, you can make any adjustments as needed, providing even better protection for your data.

What is a recovery time objective?

A recovery time objective (RTO) is the maximum tolerable length of time that a computer, system, network or application can be down after a failure or disaster occurs. An RTO is measured in seconds, minutes, hours or days. It is an important consideration in a disaster recovery plan (DRP).

The maximum downtime a company can bear is directly linked to the application and its impact on the business; any outage or loss of data affects revenue-generating activities. Quantifying the impact of such losses is therefore a key factor in determining how to configure the environment to achieve the desired RTOs.

Calculation of risk

Both RPO and RTO are calculations of risk: RTO measures how long a business can tolerate being offline after a disaster, and RPO measures how much data loss it can tolerate. As previously stated, these recovery objectives are often measured in seconds, minutes, hours or days. Even after taking the appropriate steps to calculate recovery objectives, the amount of risk is complex to quantify because it is unique to each application, dataset and company. Ultimately, it is important that all the stakeholders invested in the availability of your business’s applications and data agree on the amount of risk associated with downtime. After all, there is typically a single IT organization servicing the business, and it will ultimately need to implement, manage and monitor the overall backup and recovery solution.

How to define RTO and RPO values for your applications

When defining your business’s RTO, consider:

  • The cost per minute/hour/day of an outage
  • Are there recovery SLAs in place with customers?
  • Which applications or systems are a priority for being restored?
  • What is the ideal order in which critical applications need to be recovered?
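The first consideration above, the cost of an outage per unit of time, can be turned into rough arithmetic. The following is a minimal sketch that assumes cost accrues linearly with downtime, which is a simplification; the function names and the figures in the usage note are hypothetical.

```python
def outage_cost(cost_per_hour: float, downtime_hours: float) -> float:
    """Estimate the cost of an outage, assuming cost grows linearly with downtime."""
    if cost_per_hour < 0 or downtime_hours < 0:
        raise ValueError("inputs must be non-negative")
    return cost_per_hour * downtime_hours


def max_downtime_for_budget(cost_per_hour: float, loss_budget: float) -> float:
    """Longest downtime (in hours) that keeps the estimated loss within the budget."""
    if cost_per_hour < 0 or loss_budget < 0:
        raise ValueError("inputs must be non-negative")
    if cost_per_hour == 0:
        return float("inf")  # an outage that costs nothing never exhausts the budget
    return loss_budget / cost_per_hour
```

For instance, with an assumed cost of 5,000 per hour and a tolerable loss of 20,000, `max_downtime_for_budget(5000, 20000)` suggests an RTO of no more than four hours for that application.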

When defining your business’s RPO, consider:

  • How much data, if any, can you stand to lose?
  • What are the potential financial implications?
  • What are the potential legal implications?
  • How does data loss affect your brand?

Testing RPO and RTO

How can you have the confidence to meet objectives if you don’t regularly test your plan? While there are many best practices for testing recovery objectives, the most important practice is to actually perform the testing. This does not come easy or cheap in many cases considering the amount of time and storage potentially required to complete the testing. Some things to consider when planning recovery testing are:

  • The best testing schedule to meet SLA requirements
  • The time required by your solution to recover the data or workload to an operational state
  • The storage requirements for data recovery, and the storage and compute requirements for workloads
  • Automation to ensure repetition and reduce errors
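As one way to automate a recovery test, a drill can be wrapped in a timer and the elapsed time compared with the RTO. This is only a sketch: `restore` stands in for whatever procedure your backup solution exposes, and `timed_drill` and its parameters are hypothetical names, not part of any product.

```python
import time
from typing import Callable


def timed_drill(
    restore: Callable[[], None],
    rto_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> tuple[float, bool]:
    """Run a restore procedure and report (elapsed seconds, whether the RTO was met).

    `clock` is injectable so the drill logic can be exercised without waiting.
    """
    start = clock()
    restore()  # hypothetical: call your real recovery procedure here
    elapsed = clock() - start
    return elapsed, elapsed <= rto_seconds
```

Running the same function on a schedule gives repeatable measurements and removes manual timing errors from the test.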

Ongoing monitoring and analytics

As with any IT solution, ongoing monitoring and analytics help to ensure that the infrastructure and solution are functioning as designed and without failure. Nothing is more important than ensuring you can recover your business’s data. To increase backup success, which leads to reliable recovery, consider adding the following to your process:

  • 24/7/365 monitoring to ensure that backups are completed with no errors
  • Backup infrastructure monitoring for common issues that could affect backup success
  • Analysis of usage trends to prevent future issues with backup storage capacity
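The usage-trend analysis in the last bullet can be approximated with a simple linear projection. The sketch below assumes one storage reading per day and steady growth, both of which are simplifications; `days_until_full` is a hypothetical helper.

```python
from typing import Optional


def days_until_full(daily_usage_gb: list[float], capacity_gb: float) -> Optional[float]:
    """Project how many days remain until backup storage is full.

    Assumes one reading per day and linear growth between the first and last reading.
    Returns None when usage is flat or shrinking.
    """
    if len(daily_usage_gb) < 2:
        raise ValueError("need at least two readings to estimate a trend")
    growth_per_day = (daily_usage_gb[-1] - daily_usage_gb[0]) / (len(daily_usage_gb) - 1)
    remaining = capacity_gb - daily_usage_gb[-1]
    if remaining <= 0:
        return 0.0  # already at or over capacity
    if growth_per_day <= 0:
        return None
    return remaining / growth_per_day
```

With readings of 100, 110, 120 and 130 GB against a 200 GB repository, the projection is seven days, which would be a prompt to add capacity before backups start failing.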

Recovery objectives are the foundation of your disaster recovery strategy and are critical to align to your SLAs.

Recovery point objective (RPO) is defined as the maximum amount of data, as measured by time, that can be lost after a recovery from a disaster, failure, or comparable event before data loss will exceed what is acceptable to an organization. An RPO determines the maximum age of the data or files in backup storage needed to meet the objective, should a network or computer system failure occur.

An organization’s loss tolerance, or how much data it can lose without sustaining significant harm, is related to RPO and is set forth in the organization’s business continuity plan (BCP). This also dictates procedures for disaster recovery planning, including the acceptable backup interval, because it refers to the last point when the organization’s data was preserved in a usable format. For example, an RPO of 60 minutes requires a system backup at least every 60 minutes.
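The relationship between backup interval and RPO in the 60-minute example can be checked mechanically. This is a minimal sketch that assumes backups complete instantly, so the worst-case data loss equals the interval between backups; the function names are illustrative.

```python
import math
from datetime import timedelta


def interval_meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """True when the worst-case data loss (one full interval) stays within the RPO."""
    return backup_interval <= rpo


def min_backups_per_day(rpo: timedelta) -> int:
    """Smallest number of evenly spaced daily backups that satisfies the RPO."""
    if rpo <= timedelta(0):
        raise ValueError("a zero RPO needs continuous protection, not scheduled backups")
    return math.ceil(timedelta(days=1) / rpo)
```

An RPO of 60 minutes gives `min_backups_per_day(timedelta(minutes=60)) == 24`, that is, one backup per hour.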

In short, the recovery point objective is a time-based measurement of the maximum amount of data loss that is tolerable to an organization. Also called backup recovery point objective, RPO is important in determining whether the organization’s backup schedule is sufficient to recover after a disaster.

The recovery point objective is critical because at least some data loss is likely when a disaster strikes. Even real-time backups cannot entirely prevent data loss when large-scale failures occur.

RPOs can determine:

  • How much data will be lost after a disaster or event
  • How frequently you need to back up your data for disaster recovery purposes; RPO does not address other IT needs

Often, high-priority applications demand tighter RPOs, which will require more frequent backups. In these situations, the IT department must schedule backup systems that can satisfy such RPOs, such as the combination of snapshots and replication (also known as near-continuous data protection, or near-CDP). When RPO is near-zero, the team will combine failover services and continuous replication, or a continuous data protection system (CDP) to create nearly 100 percent availability for applications and data.

Recovery point objective and recovery time objective (RTO) are among a data protection or disaster recovery plan’s most important parameters. These objectives can guide the selection of an optimal data backup plan, as well as offer bases for identifying and analyzing viable strategies which could enable the enterprise to resume business processes within a timeframe at or near the RPO and RTO.

Although these two terms are related, it is important to understand the difference between them.

Every BCP sets forth a maximum allowable tolerance or threshold for data loss during a disruption. The recovery point objective (RPO) describes the amount of time that can pass during an event before data loss exceeds that tolerance.

Example: An outage occurs. If the RPO for this business is 12 hours and the last good copy of data available is from 10 hours ago, we are still within the RPO’s parameters for this business continuity plan.
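The check in this example can be written out as follows. It is a sketch only; the timestamps in the usage note are made up to reproduce the 10-hour and 12-hour figures.

```python
from datetime import datetime, timedelta


def within_rpo(last_good_copy: datetime, outage_time: datetime, rpo: timedelta) -> bool:
    """True when the newest usable copy is no older than the RPO at the time of the outage."""
    age = outage_time - last_good_copy
    if age < timedelta(0):
        raise ValueError("the last good copy cannot be newer than the outage")
    return age <= rpo
```

With an outage at noon and a last good copy from 2 AM the same day, the copy is 10 hours old, so `within_rpo(..., rpo=timedelta(hours=12))` returns `True`; with an 8-hour RPO it would return `False`.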

In other words, the recovery point objective of a recovery plan specifies the oldest point in time from which the IT team can restore and still achieve tolerable business recovery, given how much data would be lost since that point.

The recovery time objective (RTO) is the amount of real time a business has to restore its processes at an acceptable service level after a disaster to avoid intolerable consequences associated with the disruption. The RTO answers the question: “How much time after notification about the business process disruption should it take to resume normal operations?”

Another way to think about the difference between recovery time objective and recovery point objective is that RPO represents a changing amount of data that will require re-entry or may be lost during network downtime. RTO represents how much real time can pass before the interruption unacceptably impedes the flow of normal business operations.

Recovery time actual (RTA) and recovery point actual (RPA) are always the elapsed time and lost data of an actual recovery process and are often different from these objectives. Only business disruption and disaster rehearsals can expose these actuals.
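After a rehearsal, the measured actuals can be compared with the objectives in a small record. The class and field names below are hypothetical, and the sketch simply reports which objective, if any, was missed.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class RecoveryResult:
    rto: timedelta  # objective: maximum tolerable downtime
    rpo: timedelta  # objective: maximum tolerable data loss, as time
    rta: timedelta  # actual elapsed recovery time from the rehearsal
    rpa: timedelta  # actual data loss from the rehearsal, as time

    def violations(self) -> list[str]:
        """List every objective that the rehearsal failed to meet."""
        problems = []
        if self.rta > self.rto:
            problems.append(f"RTA exceeds RTO by {self.rta - self.rto}")
        if self.rpa > self.rpo:
            problems.append(f"RPA exceeds RPO by {self.rpa - self.rpo}")
        return problems
```

An empty list means the rehearsal met both objectives; anything else points to the objective that needs a different backup schedule or recovery design.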

As mentioned above, RPOs and RTOs will differ based on application and data priority. Achieving near-zero RPO and RTO for all applications is very costly, as the only way to ensure no lost data and 100 percent uptime is by ensuring continuous data replication inside failover virtual environments.

Because near-zero objectives are so costly, prioritize data and applications so that the expense of achieving each RPO and RTO matches its purpose, risk, and cost. RTO is concerned with systems and applications, meaning its calculation deals more with time limits on application downtime than with data recovery.

This is another way to express the difference between recovery point objective and recovery time objective: RPO is focused on how much data is lost after a failure. Bad user experience and irritated users are the realm of RTO, but RPO covers catastrophic issues such as the loss of hundreds of thousands of dollars in customer transactions.

Here are several examples of recovery point objectives in action:

In the case of a business that uses traditional tape backups, consider a backup plan that schedules backups twice a day at 6 AM and 6 PM. A primary site failure at 2 PM allows the team to restore from the 6 AM backup, for an RPA of eight hours. The RTA will be driven by how long the restore takes, followed by any additional work necessary to return the system to full operation.
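The eight-hour figure in the tape example follows from the schedule and the failure time. A minimal sketch, assuming each backup completes at its scheduled time and is restorable (the dates are arbitrary):

```python
from datetime import datetime, time, timedelta


def recovery_point_actual(failure: datetime, daily_backup_times: list[time]) -> timedelta:
    """Time between the most recent scheduled backup and the failure."""
    if not daily_backup_times:
        raise ValueError("at least one daily backup time is required")
    # Consider today's and yesterday's runs so early-morning failures are covered.
    candidates = [
        datetime.combine(failure.date() - timedelta(days=offset), t)
        for offset in (0, 1)
        for t in daily_backup_times
    ]
    last_backup = max(c for c in candidates if c <= failure)
    return failure - last_backup
```

For backups at 6 AM and 6 PM and a failure at 2 PM, the function returns eight hours. A failure at 3 AM would instead fall back to the previous evening's 6 PM backup, giving nine hours.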

Continuous replication and continuous data protection (CDP) offer more secure RPO guarantees, since the target system holds a mirror image of the source. Depending on whether the replication is synchronous or asynchronous and how fast the changes are applied, the RPA values change. RPA depends on how quickly the application can access the data on the replicated site.

In some situations, a business may require granular item recovery capabilities. For example, a user may delete important company files attached to email communications, and then empty the contents of their trash folder. Email is a business-critical application for many enterprises, so this is an application that IT might back up continuously, allowing for granular backup and recovery of a deleted file with an RTA of several minutes.

As another example, an e-commerce retail site likely uses multiple databases for different purposes. It stores its product catalog in a relational database, historical order data in a document database, and connects to its payment processor’s gateway via an API.

The RPO for the document database is within 24 hours because IT can reconstruct data for it from other databases. For the relational database, RPO is not critical because the business only adds products periodically. But if that database goes down, revenue stops, so RTO is more critical for it and might actually be shorter than the RPO.
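Per-database objectives like these can be recorded as data and used to derive a restore order. In the sketch below the 24-hour RPO for the document database comes from the example; every other value, and all of the names, are illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Objectives:
    rpo: timedelta
    rto: timedelta


# Hypothetical values, except the 24-hour RPO for the order-history database.
OBJECTIVES = {
    "product_catalog": Objectives(rpo=timedelta(hours=24), rto=timedelta(minutes=15)),
    "order_history": Objectives(rpo=timedelta(hours=24), rto=timedelta(hours=8)),
}


def restore_order(objectives: dict[str, Objectives]) -> list[str]:
    """Restore the systems with the tightest RTO first."""
    return sorted(objectives, key=lambda name: objectives[name].rto)
```

Here the product catalog is restored first because its RTO is far shorter than its RPO, matching the reasoning that revenue stops while it is down.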

RPOs can be set based on the frequency at which files are updated. This helps ensure your restored operations contain the most up-to-date version of your data following a service interruption. For example, frequently updated files need a short RPO of no more than a few minutes to ensure IT can restore operations with minimal data loss following a disruptive event.

Factors that can affect RPOs include:

  • Maximum tolerable data loss for the specific organization
  • Industry-specific factors: businesses dealing with sensitive information such as financial transactions or health records must back up more often
  • Data storage options, such as physical files versus cloud storage, can affect speed of recovery
  • The cost of data loss and lost operations
  • Compliance schemes, which may include provisions for disaster recovery, data loss, and data availability that affect businesses
  • The cost of implementing disaster recovery solutions

Once defined, RPOs serve to detail the goals of the BCP, and each business unit should have distinct RPOs. For example, financial transactions and other mission critical data processes demand shorter RPOs than less frequently updated files such as personnel records.

As you calculate RPOs for your business units, consider these sample intervals:

0 to 1 hour
This is for critical operations that cannot afford to lose over an hour of data. They are dynamic, high volume, and difficult or impossible to recreate due to the number of variables involved. Patient records, banking transactions, and CRM systems all fall within this tier.

1 to 4 hours
This interval is for semi-critical business units that can afford to lose up to four hours’ worth of data, such as file servers and customer chat logs.

4 to 12 hours
Business units in this tier might include sales data and marketing.

13 to 24 hours
These business units handle semi-important data, and their RPO should go back no more than 24 hours. This tier may include purchasing and human resources, for example.
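The sample intervals above can be expressed as a lookup. This sketch treats each tier's upper bound as inclusive and places values between 12 and 13 hours in the last tier, which is an assumption because the sample intervals leave that gap unspecified.

```python
import bisect

# Upper bounds (in hours) of the sample tiers, paired with their labels.
TIER_UPPER_BOUNDS_HOURS = [1, 4, 12, 24]
TIER_LABELS = ["0 to 1 hour", "1 to 4 hours", "4 to 12 hours", "13 to 24 hours"]


def rpo_tier(rpo_hours: float) -> str:
    """Return the sample tier that a given RPO (in hours) falls into."""
    if rpo_hours < 0:
        raise ValueError("RPO cannot be negative")
    index = bisect.bisect_left(TIER_UPPER_BOUNDS_HOURS, rpo_hours)
    if index == len(TIER_LABELS):
        return "beyond the sample tiers"
    return TIER_LABELS[index]
```

For example, `rpo_tier(0.5)` returns `"0 to 1 hour"` and `rpo_tier(6)` returns `"4 to 12 hours"`.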

Druva’s cloud-native disaster recovery solutions offered as-a-service provide flexibility when it comes to enterprise RPO needs. With Druva, users also lower TCO by up to 60 percent and remove the burden of legacy architecture, unifying disaster recovery, backup, and archives in the cloud.

Druva customers can meet RPOs ranging from minutes to one hour, depending on the workload being protected. This is thanks to Druva’s use of source global deduplication, which allows backups to run quicker while consuming fewer resources than traditional backups. This allows for running backups much more frequently throughout the day, versus traditional backups that only run at night due to the resources they consume.
