1 Definitions and Applications of Business Continuity, Operational Resilience, and Organizational Resilience, and How the Concepts Relate to Each Other

Recent global events have demonstrated organizations’ vulnerability to disruptions and emphasized the need to plan beyond mere emergency response to assuring resilience and continuity of operations over time. Organizations must strategize their preparation for such occurrences. The concepts of business continuity management (BCM), operational resilience (OPR), and organizational resilience (OGR) all address strategies for organizations to proactively prepare for and respond to the shocks and stresses of perturbations and disruptions. This article examines these concepts as tools that planners can apply while responding to emerging disruptions. This section examines a sampling of literature that explores BCM, OPR, and OGR to consider the existing definitions and their implications for the relationships between the concepts. This article highlights inconsistencies that have left these concepts poorly defined in relationship to each other.

1.1 Business Continuity Management

In the early 2000s, the British Standards Institution developed the code of practice BS25999, which was released between 2006 and 2007 to support a consistent framework for BCM. It was replaced by the International Organization for Standardization (ISO) standard 22301 in 2012 (Sanchez Dominguez 2016), which was updated in 2019 (ISO 2019). This standard is broad: it refers to BCM as a management system that seeks to protect against, reduce the likelihood of the occurrence of, prepare for, respond to, and recover from disruptions when they arise.

Despite BCM’s roots in the Cold War (Folkers 2017) and numerous discussions of BCM in the early 2000s (Hecht 2002; Gallagher 2003; Herbane et al. 2004; Hiles 2004), Bajgoric (2014) argued that BCM had not yet been defined comprehensively, and that it represented an extension of existing risk management. Indeed, some definitions frame BCM within risk discourse processes where disruptions are assumed to be known and steps can be taken beforehand to ensure that the system is prepared to withstand such risks or perturbations when they arise (Mcilwee 2013; Faertes 2015; Zeng and Zio 2017; Groenendaal and Helsloot 2020). Other authors expand BCM to include recovery from events (Herbane et al. 2004; Niemimaa et al. 2019; Azadegan et al. 2020; Gartner 2021).

The range of these definitions for BCM suggests that its application should enable an organization to both avoid and overcome disruptions, whether small or large. In this context, there are no firm boundaries delineating what BCM is not.

1.2 Operational Resilience

In current use, OPR has been referred to both as a specific property of the financial sector and as a lower-level (tactical) form of short-term planning and delivery of business continuity processes. In the financial sector, definitions of OPR consistently focus on delivering critical operations during a disruption (Basel Committee on Banking Supervision 2020), where critical operations are those which, if unavailable, threaten an organization’s very survival. Essuman et al. (2020) define OPR as encompassing both the absorption of and recovery from a disruption in the operational domain. Poudel et al. (2019) refer to OPR as a system’s ability to respond and recover, where the framework for the recovery can include infrastructure, as well as service provisions. Liu et al. (2020) focus on disruptions to infrastructure and the responses necessary to restore service, and Lv et al. (2020) similarly focus on system configuration in achieving OPR. The OPR metric in Phillips et al. (2020) quantifies the magnitude and duration of a disturbance that the system can withstand, but says nothing about recovery, which is the focus of many of the other definitions.

There are ambiguities in how OPR relates to BCM. Hofbauer and Quirchmayr (2021) treat operational resilience as a precondition for BCM, and Al-Yaeeshi and Al-Ansari (2022) use resilience as part of their BCM assessment. In contrast, Gartner (2021) suggests that increasing the foresight and complexity of BCM applications produces OPR, a framework that presents BCM as the foundation upon which OPR is built. It follows that, according to Gartner, the increased complexity is in the inclusion of operations in a BCM framework. Gartner’s hierarchy, where BCM supports operational resilience, is reflected by Manning and Soon (2016); they posit BCM as a management process that enables OPR.

However, this distinction suggests logically that BCM’s methodologies constitute OPR only when they are applied to operations, and thus BCM itself cannot be applied to operations without making the concept of OPR redundant. The fact that BCM has no clear boundaries muddles its relationship to OPR. It cannot both subsume OPR and contribute to OPR.

1.3 Organizational Resilience

Interest in OGR has grown in academic research in recent years (Hepfer and Lawrence 2022). Its attributes include everything from short-term coping to long-term transformation. However, practical and academic conceptualizations of OGR continue to lack consensus (Duchek 2020; Hillmann and Guenther 2021). Carmeli and Markman (2011) and Ismail et al. (2011) define OGR with respect to an entity’s expansion strategies rather than a response to a disruption, as others have done (Sutcliffe and Vogus 2003). Similarly, Linnenluecke (2017) describes OGR as an organization’s ability to respond more quickly than other businesses or to develop different approaches to business, including market fit, in order to maintain business function. Wood et al. (2019) present OGR as subsuming everything within an organization that can plan for, absorb, recover from, and adapt to a disruption. Hillmann and Guenther’s (2021) review found 22 distinct attributes used to define OGR. The most frequently cited attributes are an ability to adapt, to cope, and to reinvent/reconfigure, and these are applied to several domains, including awareness or sensemaking, stability, and behavior, among others.

These definitions do not provide a clear idea of how OGR relates to either BCM or OPR. In Gartner (2021), the three concepts are considered as a continuous hierarchy, ordered from BCM to OPR to OGR by expanding the complexity and foresight of each, respectively. This hierarchy assumes that OPR and BCM are both necessary to achieve OGR, but does not specify what new complexity is added to constitute OGR. Furthermore, because BCM has no bounds, the definitions of OGR do not differentiate themselves from BCM in substance; the general idea is that the organization has weathered the disruption.

2 Defining the Concepts Through Dyads

This article argues that the current formulations of these concepts overlap and lack differentiation, leading to ambiguities in their relationships to each other. The existing conflicts make practical application of these concepts difficult. Increasing the consistency of these business management concepts will improve the planner’s capacity to prepare for and respond to complex events and scenarios. The following section attempts to differentiate between the three concepts and suggests a framework to operationalize methods and tools relevant to the application of each concept following a disruption. Three dyads are used to distinguish between the three concepts.

2.1 Operations and Organization (Processes and Assets)

The existence of dual concepts for OPR and OGR suggests that there is value in their differentiation. It could be that one encompasses the other, as Gartner (2021) suggests. However, this article proposes that there are distinct domains in which the concepts operate. An organization’s “processes” comprise the logistical efforts that enable the throughput needed to complete the organization’s objectives. In a conventional sense, the supply chain provides raw materials; assembly lines combine raw materials into new products; employees learn and apply established company methodologies; and marketing and vending ensure that an organization’s products and services are distributed to customers or clients. An organization’s processes thus relate to its verbs: what it does.

An organization’s “assets” relate to all the tangible or intangible resources that have value to the organization (ISO 2018)—the people, infrastructure, or capital that enable carrying out the processes. Testing for bloodborne disease requires both a skilled nurse to draw blood and a laboratory with equipment to run the test; producing electricity would serve no purpose without the infrastructure to distribute it. An organization’s assets thus relate to its nouns: who and what carry out the processes.

Here, processes are placed within OPR and assets within OGR. Processes can be interrupted, and assets can break or become unavailable; thus, each can be the subject of disruptions. Furthermore, disruptions can cascade between the two domains, perhaps underscoring Gartner’s assertion that OGR cannot exist without encompassing OPR. This article holds that they are distinct and that separate resilience plans should be tailored for each domain, especially since this might prevent or mitigate cascading risks and failures.

2.2 Risk and Resilience

Risk and resilience management each characterize a response to a disruption that ensures that the system is able to achieve a high level of performance (Linkov and Trump 2019). Perfect risk management, which we call robustness, ensures that a system is prepared for all emerging disruptions and does not suffer setbacks because of them. Galaitsi et al. (2021) contrast robustness with reliability and resilience to show that resilience accepts that not all disruptions can be anticipated and not all performance declines can be avoided. Resilience management lays the groundwork for the system’s recovery following the absorption of the disruption (Bostick et al. 2018). Figure 1 shows the high point of critical function performance (Point A) attained by both robustness and resilience, as well as their different pathways to arrive there. The first inflection point determines whether a system will show robustness or decline, and the second inflection point determines whether the decline will be routed to enable system resilience.

Fig. 1 Robustness versus resilience. Source: After Galaitsi et al. (2021).

Inflection Point 2 in Fig. 1 marks the moment where the performance begins to recover towards reattaining its prior level. When risk management is too narrow in scope, resilience planning can still ensure that the system arrives at Point A (Linkov et al. 2014; Linkov and Trump 2019). However, the timing and magnitude of the reduction in function can affect a system’s ability to recover: the length of time that a function underperforms and the extent to which it underperforms can both affect the probability that the disruption cascades into another function, or between the domains of operations and organization. For example, without sufficient OPR, a break in the supply chain for personal protective equipment (PPE) for medical staff can cascade to the organizational domain, such as medical staff health and availability. The longer that PPE is unavailable or the poorer the quality of the substitutes used in the meantime, the likelier it becomes for medical staff to contract illnesses. Depleted performance can necessitate compensations in other functions, which creates opportunities for cascades. Thus, a steeper resilience curve in Fig. 1 indicates a faster return to high or critical performance and also reduced opportunities for cascades during the period of lower performance.

This article defines OPR and OGR as each fulfilling resilience in their respective domains. They are hence applicable only after a critical function has lost capacity following a disruption (Inflection Point 1) rather than displaying robustness. In contrast, we assert that BCM can encompass both risk and resilience management, meaning robustness and resilience can each constitute BCM. Since all three concepts thus incorporate resilience, the next dyad further differentiates between levels of resilience.

2.3 Normal Operating Conditions and Crisis Conditions

We posit that, following a disruption, a critical function that declines in performance after Inflection Point 1 (Fig. 1) will fall into either normal operating conditions or crisis conditions (Fig. 2), which we differentiate below. Figure 2 envisions the conditions on the axis of critical function performance.

Fig. 2 Normal operating conditions versus crisis operating conditions

We define normal operating conditions to include the robustness pathway (Fig. 1), but also include some ability to degrade provided that the operating conditions do not have to substantially change in response. For example, if a PPE supply chain is disrupted, it would be robust to have supply redundancy that could allow staff to continue using PPE at the same rate, but normal operating conditions could also encompass using PPE for longer periods, as long as staff does not feel it is soiled or unsafe. In contrast, crisis conditions require a substantive change in operations. The crisis threshold is passed when staff begin using trash bags for gowns or soiled face masks because such behaviors would have been unacceptable under normal conditions.

According to Fig. 2, normal conditions encompass both robust and resilient responses and can also include remaining at some lower level of performance within normal operating conditions. Under crisis conditions, performance can be resilient; it can fail completely; or it can again stay at a lower level of performance within the crisis conditions, if such a steady-state performance is possible. Normal conditions include robustness, which crisis operating conditions do not, and crisis can include failure, which normal conditions do not. Both conditions can include remaining at a reduced steady state or resilience performance. But resilience performances in normal versus crisis conditions assume different magnitudes, wherein resilience in the crisis condition must be of a higher magnitude to attain the original performance level.

The crisis threshold is both somewhat subjective and moveable based on asset characteristics, as well as external conditions such as government policies. For example, when the U.S. CARES Act provided businesses with short-term loans during the early COVID-19 outbreak, it cushioned businesses and effectively increased the magnitude of disruption that could be experienced without crossing the crisis threshold. In Fig. 2, this amounts to moving the crisis threshold further from the optimum functionality level on the y-axis.
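The distinction between normal and crisis conditions, and the effect of a moveable crisis threshold, can be illustrated with a minimal sketch. It assumes that critical function performance is sampled over time and normalized so that the first sample is the pre-disruption baseline; the function name, the outcome labels, and all numeric levels are hypothetical choices made for illustration, not values taken from Fig. 2.

```python
def classify_response(performance, crisis_threshold, failure_level=0.0):
    """Label a sampled performance series with (conditions, outcome).

    performance[0] is treated as the pre-disruption baseline.
    All levels are hypothetical and chosen only for illustration.
    """
    baseline = performance[0]
    lowest = min(performance)
    final = performance[-1]

    # Crossing below the crisis threshold at any time means crisis conditions.
    conditions = "crisis" if lowest < crisis_threshold else "normal"

    if lowest >= baseline:
        outcome = "robust"                # performance never declined
    elif final >= baseline:
        outcome = "resilient"             # declined, then regained the baseline
    elif final <= failure_level:
        outcome = "failure"               # ended at or below the failure level
    else:
        outcome = "reduced steady state"  # ended below baseline, above failure
    return conditions, outcome


# A dip to half of baseline performance, followed by a full recovery.
dip = [1.0, 0.8, 0.5, 0.7, 1.0]
print(classify_response(dip, crisis_threshold=0.6))  # ('crisis', 'resilient')
# A cushion such as short-term loans is modeled here as a lower threshold.
print(classify_response(dip, crisis_threshold=0.4))  # ('normal', 'resilient')
```

In this sketch the same dip is labeled a crisis or not depending only on where the threshold sits, which mirrors the point that the threshold moves with asset characteristics and external conditions.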

We distinguish between normal operating conditions and crisis conditions to be able to differentiate between BCM and either OPR or OGR. As seen in the BCM definitions, the concept contains aspects of both robustness and resilience; however, the idea of continuity implies some level of general stability, which we characterize as normal operating conditions. The resilience (performance) it exhibits is in response to relatively minor dips in functionality in that realm. This article assumes that normal operating conditions present minimal opportunities for disruptions to cascade, in contrast with crisis conditions, and thus differentiating between domains in normal operating conditions would serve little purpose. It follows then that BCM exists for disruptions in either the operational or organizational domain, so long as the corresponding dip in performance does not cross the crisis threshold (see Fig. 2). Table 1 synthesizes definitions of each of the three concepts.

Table 1 Definition by dyads

The dyads make it possible to distinguish between the three concepts. BCM encompasses both risk and resilience, and both processes and assets, but only in normal operating conditions. OPR and OGR are specific to the resilience of processes and of assets, respectively, in crisis conditions.
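The mapping from the three dyads to the three concepts can be written as a small decision rule. The following is a minimal sketch under the definitions proposed in this article; the enumeration and function names are hypothetical, and returning None for risk management under crisis conditions is an assumption drawn from the statement that crisis operating conditions do not include robustness.

```python
from enum import Enum


class Domain(Enum):
    PROCESSES = "processes"
    ASSETS = "assets"


class Approach(Enum):
    RISK = "risk management"
    RESILIENCE = "resilience"


class Conditions(Enum):
    NORMAL = "normal operating conditions"
    CRISIS = "crisis conditions"


def classify_concept(domain, approach, conditions):
    """Return the concept implied by one choice from each dyad."""
    # Under normal conditions BCM covers both domains and both approaches.
    if conditions is Conditions.NORMAL:
        return "BCM"
    # Assumption: robustness (risk management) is not available in a crisis.
    if approach is Approach.RISK:
        return None
    # In a crisis, resilience of processes is OPR and resilience of assets is OGR.
    return "OPR" if domain is Domain.PROCESSES else "OGR"


print(classify_concept(Domain.ASSETS, Approach.RESILIENCE, Conditions.NORMAL))     # BCM
print(classify_concept(Domain.PROCESSES, Approach.RESILIENCE, Conditions.CRISIS))  # OPR
print(classify_concept(Domain.ASSETS, Approach.RESILIENCE, Conditions.CRISIS))     # OGR
```

Checking the conditions dyad first reflects the argument that the domain distinction matters only once the crisis threshold has been crossed.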

3 Illustrative Example of Dyads Applied to Case Studies

The following section explores examples of the dyads within case studies of disruptions that organizations encounter.

3.1 Disruptions in Which Performance Remains Within Normal Operating Conditions

Most organizations will eventually encounter disruptions that they are not robust to. For that matter, there is also a frontier beyond which resilience capacity may be exceeded and market failure is the ultimate consequence. Whether an organization can remain within normal operating conditions and address its challenges through BCM depends on the scale of the disruption and the magnitude to which performance is subsequently impacted during the period of absorption. Disruptions that may remain within the realm of normal operating conditions might include the departure of an employee, the failure of minor machinery, or a supply chain disruption. These challenges are met with continuity, assuming that once the initial disruption is overcome, no major changes are necessary in the organization’s assets or processes.

3.1.1 KBR Private Company

Disruption: Hurricane Florence (2018) caused USD 17.9 billion in damage in North and South Carolina, including to KBR’s office in Wilmington, NC.

Evidence that Performance Suffered: The Wilmington team had to vacate the property, and employees faced road closures and flooding that made travel in the area difficult.

Organization’s Response: Through logistical support from KBR’s Houston headquarters, the company found temporary housing for employees unable to reach their homes and assisted with other problems to support employees to quickly get back to work. Damaged equipment was promptly replaced by the Houston office, and employees were able to resume work remotely (KBR 2018).

From the company description, the company managed to quickly provide substitutions for lost assets, and this resilience provided sufficient continuity for the overall company. Although the Wilmington employees had trouble working for a period of time, the overall impact to the organization, at least as reported by the company, does not seem to have been broadscale enough in the company’s global operation to have constituted a crisis. Therefore, to determine which concept characterizes the response, no distinction is needed between assets and processes because remaining in normal operating conditions implies BCM, which applies to both assets and processes.

Dyad 1: Assets; Processes
Dyad 2: Normal operating conditions
Dyad 3: Risk management; Resilience
Outcome: Business continuity management

3.1.2 Martha Stewart Living Omnimedia

Disruption: After Martha Stewart, owner of Martha Stewart Living Omnimedia, was found guilty of insider trading in 2004, she paid a fine and served five months in jail.

Evidence that Performance Suffered: When the news of the investigation broke in 2002, the stocks for the company plummeted from USD 954 million to USD 162 million (ABC News 2004).

Response: The company continued its operations without major external changes. Stewart was released in 2005 and the company began posting profits again in 2006 (NBC News 2006). While such a dramatic change might seem to imply crisis management, the fact that the company did not display any major outward changes in its approaches suggests that it continued to operate within normal operating conditions, which implies that the response was BCM.

Dyad 1: Assets
Dyad 2: Normal operating conditions
Dyad 3: Resilience
Outcome: Business continuity management

3.2 Disruptions that Decline Performance into Crisis Operating Conditions

Next, the following section considers examples of crises, which tend to be more newsworthy and can encompass both OPR and OGR. The disruptions described in these examples constitute significant problems for their organizations, as evidenced by how their absorption is characterized: a company approaching bankruptcy, or a company unable to provide adequate customer service, to the irritation of its customers. The disruptions the organizations face create crises, and the resilience required to overcome the subsequent decrease in performance is substantial.

3.2.1 The Airline Industry

Disruption: The onset of the COVID-19 pandemic.

Evidence that Performance Suffered: During March 2020, global air traffic levels dropped 21% compared to March 2019, with a further drop during April, when global air traffic reached 66% compared to the previous year (Liu et al. 2021). Global demand reduction, especially in tourist and business travel, caused a decline in airline profitability and thus required some changes in operating conditions, which corresponds with crisis operating conditions.

Response: The U.S. federal CARES Act assistance limited the impacts of demand reduction, allowing the airlines to continue operating as though in normal conditions despite the changes. In practice, the U.S. CARES Act assistance assured BCM for the airlines. But when the assistance began to end during the autumn of 2020, the processes that the airlines had previously followed were no longer suitable to the reduced demand. This process disruption cascaded into the asset domain: United States-based airlines began to announce thousands of furloughs (Josephs 2021). Airlines also found themselves with too many planes, and started retiring planes earlier than anticipated (Dube et al. 2021). Had the demand remained low, the asset reduction might have solved the airlines’ profitability problem, but a crisis arose as demand returned and the assets that had been shed during lean times did not come back online quickly. This constituted a secondary disruption. However, offloading planes was a staggered process that represented OPR because some planes, in the most severe application, were offloaded under dry leases or otherwise liquidated, while other planes were simply removed from airworthiness classification and placed in the desert for later mobilization.

Evidence that Performance Suffered: Long lines at airports and canceled flights were seen across the United States during the summer of 2021 (Nguyen 2021) and again in 2022 (Abend 2022).

Response: The airlines made efforts to hire more employees (Kunzler 2022), trying to balance fluctuating demand with a supply in the form of staff (assets) that can be kept on the payroll to ensure the processes can again run smoothly.

Dyad 1: Assets; Processes
Dyad 2: Crisis operating conditions
Dyad 3: Resilience
Outcome: Organizational resilience (assets); Operational resilience (processes)

3.2.2 LEGO

Disruption: The LEGO Group grew throughout the twentieth century to become one of the top 10 global toy manufacturers in 1990 (Davidson 2020). This growth, however, began to stagnate and then decline throughout the latter half of the 1990s. Numerous factors contributed to the crisis faced by the LEGO Group, namely patent expiration and new electronic toys arriving on the market that shifted children’s focus (Andersen and Ross 2016). In response to these business shocks, the LEGO Group attempted to invest in other areas, but these initial efforts were largely unsuccessful.

Evidence that Performance Suffered: With profitability constrained, the disruption became a crisis that necessitated change, including layoffs (French 2006). In 2003, the company was USD 800 million in debt (Davis 2017). In terms of dyads, this appears to be a crisis.

Response: The company shifted focus to direct engagement with the consumer. It expanded its target consumers to include adults through the introduction of more advanced LEGO sets, such as the LEGO Architecture sets, and directly targeted girls through the LEGO Friends sets, which increased interest in the product (Davis 2017). In doing so, LEGO changed its output and cultivated new demand. Since this shift, LEGO has continued this model of consumer-informed growth by, in 2018, introducing LEGO pieces made with sustainable materials and LEGO DUPLO sets that blend traditional and electronic play. Changes in operations were necessary, as were personnel able to perform them. With these changes, the company showed OPR and OGR, demonstrated by its high profits during the subsequent decade (Capon 2016).

Dyad 1: Assets; Processes
Dyad 2: Crisis operating conditions
Dyad 3: Resilience
Outcome: Organizational resilience (assets); Operational resilience (processes)

4 Discussion and Recommendations

With this understanding of the concepts, planners can begin to parse meaning and hence strategies for assuring high performance despite anticipated or unanticipated disruptions. The strategies follow the same dyads articulated above and can help position organizations to respond productively to disruptions as they arise.

4.1 Tally Critical Assets and Processes

To prepare for BCM or either type of resilience, planners should first assess which assets or processes are critical for the organization’s survival. Appraisals may create a hierarchy ranging from the most important processes to processes that can be neglected for varying periods of time without impacting the organization’s viability, and the same for assets. Planners should consider both the assets and processes that are vulnerable to disruptions and those that are insulated from them.
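One way to record such a tally is as a list that can be sorted by how long each item can be neglected. The sketch below assumes that criticality can be approximated by a tolerable downtime in days; that proxy, the field names, and the example entries are all hypothetical and are not drawn from the case studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TallyItem:
    name: str
    kind: str                       # "asset" or "process"
    tolerable_downtime_days: float  # hypothetical proxy for criticality
    vulnerable: bool                # False if judged insulated from disruption


def rank_by_criticality(items):
    """Order items from least to most tolerable downtime (most critical first)."""
    return sorted(items, key=lambda item: item.tolerable_downtime_days)


# Hypothetical tally for a clinic; the values are illustrative only.
tally = [
    TallyItem("staff training", "process", 90, vulnerable=False),
    TallyItem("nursing staff", "asset", 1, vulnerable=True),
    TallyItem("PPE supply chain", "process", 7, vulnerable=True),
]

for item in rank_by_criticality(tally):
    print(item.name, item.kind, item.tolerable_downtime_days)
```

Keeping assets and processes in one list but tagging each with its kind preserves the domain distinction while still producing a single hierarchy.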

4.2 Plan for Robustness and Resilience of Assets and Processes

Once critical assets and processes are tallied, managers can begin strategizing for robustness or resilience. Robustness can take the form of easy substitutions, such as stockpiles of critical equipment components or a list of alternative suppliers from a different part of the supply chain. Robustness could also include policies, like credit hours, that encourage employees to work overtime as needed to compensate for deficits by providing them more leave at another time; this may help avoid burnout. The KBR Private Company example shows that dips in performance can be met by expending company resources to get people back to work quickly, even extending to helping them with relocating their homes. Making these types of resources available can provide BCM even when strict robustness is not possible. BCM can entail both robustness and resilience within normal operating conditions. At some point, however, preparing for only robustness becomes cost prohibitive, as not every eventuality can be anticipated and fully absorbed by contingency plans. When robustness is no longer financially justifiable, resilience planning can ask different questions about how to bring assets or processes back online after an interruption.

4.3 Understand the Potential for Cascades

Crisis conditions arise when disruptions are sufficiently significant that it is not possible to address them within normal operating procedures. In our view, this frames the division between BCM and forms of resilience, and their respective responses. A crisis places demands on assets and processes that may make them more vulnerable, such as the increased use of crematories during the spring wave of the COVID-19 pandemic in New York in 2020. If crematories do not have sufficient time to cool down, they can break and exacerbate the crisis of long wait times. Employees can also be pushed to the brink, as with the now-evident burnout of medical professionals following years of managing the COVID-19 pandemic (Kaushik 2021). The question then, in a crisis, is where to set boundaries for crisis operating conditions to preserve organizational sustainability. There may be situations where the correct answer is to push the assets until they break; but in most situations, that will cause the disruption to cascade. Assets and processes are interconnected, and pushing one to failure will likely create more problems than it will solve; the two examples of crisis conditions provided in this article affected both assets and processes. However, establishing boundaries requires understanding system interdependencies. The tallying necessary for crisis planning requires knowing not just the assets and processes, but also how they are connected. Planning can include scenario building, such as stress testing critical infrastructure resilience, and may require clearer definitions of parameters and thresholds (Linkov et al. 2022a, 2022b). For either OGR or OPR to be successful, the organization must struggle through the crisis itself and then set a path to recover and to end the crisis. This process will benefit from minimizing the cascades as much as possible.
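Knowing how assets and processes are connected can be approximated with a simple dependency map. The following is a minimal sketch, assuming that each asset or process lists the items that depend on it and that a failure may propagate along every such link; the function name and the example map, which loosely follows the PPE example, are hypothetical.

```python
from collections import deque


def cascade_reach(dependents, failed):
    """Return every item a failure of `failed` could reach through dependencies.

    dependents maps each item to the items that rely on it (hypothetical data).
    """
    seen = {failed}
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for downstream in dependents.get(current, ()):
            if downstream not in seen:
                seen.add(downstream)
                queue.append(downstream)
    seen.discard(failed)  # report only the downstream items
    return seen


# Hypothetical links: a process failure reaches an asset, then another process.
dependents = {
    "ppe_supply_chain": ["medical_staff"],  # process -> asset
    "medical_staff": ["patient_care"],      # asset -> process
    "patient_care": [],
}

print(sorted(cascade_reach(dependents, "ppe_supply_chain")))
# ['medical_staff', 'patient_care']
```

A breadth-first traversal is used here only because it is simple; it treats every link as certain to propagate, whereas a stress test would likely attach thresholds or probabilities to each link.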

In addition to minimizing cascades, the work of resilience in a crisis may require adapting to new conditions through ingenuity and accurate interpretations of the cause and impacts of the crisis. For OGR, firms should understand the external factors that affect assets or potential replacements of assets. For example, the airline companies’ staff shortages may be exacerbated by the Great Resignation or the quarantine requirements for staff members with positive tests. Links and relationships must be understood, and adaptations must be made based on the best available knowledge and projections of favorable outcomes. Similarly, OPR may require adaptations based on changing circumstances, whether that be new technology or materials, or new policies and operation processes. For LEGO, it meant capturing and transforming to new markets; the internal marketing unit was critical for OGR. Adaptations should be selected to be useable and beneficial based on existing assets and processes.

5 Conclusion

The concepts of business continuity management, operational resilience, and organizational resilience have not been clearly differentiated in the literature. This article presents three dyads that demonstrate the overlaps and distinctions for each concept, which will allow planners to strategize for continuity or resilience following a disruption. Such planning requires identifying the organization’s assets and processes, as well as the opportunities where disruption of one might create a disruption of the other in a cascade. This can support organizations in providing continuity, avoiding crises, or decreasing the time it takes for an organization to emerge from a crisis.