Beyond Business Continuity Management: Building Resilience for Extreme but Plausible Events in a Post-COVID-19 World

Douglas Wilbert, Managing Director Risk and Compliance
Dugan Krwawicz, Associate Director Technology Strategy and Operations

The concept of an extreme but plausible event is a moving and expanding target. Over time, our thinking on what can be deemed implausible or improbable will continue to evolve. The magnitude of impact from real events will reshape our view of what today is considered extreme.

It is true that organizations and their business continuity management (BCM) teams often contemplate scenarios that are reserved for incredibly creative minds. However, with every calamitous event, such as the ongoing COVID-19 pandemic and the September 11, 2001 attacks, the severity of the human, financial and property losses is a stark reminder that even the best BCM playbooks can fall short of helping organizations to fully plan and prepare for disruptions.

As the pandemic stretches on, BCM teams are expected to keep applying lessons learned and evolving their methods so they can respond effectively if and when another COVID-19-like event occurs. Based on recent events, the probabilities that organizations assign to tail-risk events are likely to move closer to the mean. This migration implies that subsequent events – or what some regulators are defining as extreme but plausible events – will be of greater consequence to firms. Below we discuss a few things that can help organizations build resilience in this changing risk environment: understanding BCM basics and the importance of regulations and standards; addressing concentration risk; and tapping into the unique benefits of cloud computing.

Why BCM Matters

BCM is the design, development, implementation and maintenance of strategies, teams, plans and actions that provide protection over, or alternative modes of operation for, those activities or business processes which, if they were to be interrupted, might bring about seriously damaging or potentially significant loss to an enterprise. The concept has existed in various forms for as long as there has been business but became more formalized in the 1970s. In the early days, the focus of BCM was on keeping mainframes operational, an effort that involved maintaining water-cooled pipes at the proper temperature. Over the years, the need has morphed from keeping mainframes cool to cooling data centers, and most mainframes have been replaced with racks of servers.

Additionally, the business continuity threat landscape has expanded considerably to include both internal and external events, as well as extreme but plausible events. (Read Protiviti’s recently published BCM Top 15 Frequently Asked Questions to learn more about BCM basics, evolving concepts and related practices.) Similarly, other aspects of BCM have evolved as the needs of businesses have increased: the volume and complexity of their reliance on third parties, the scrutiny of shareholders, the rigors of compliance, and the multi-faceted expectations of customers. Given the present challenges with COVID-19, and with operational resilience increasingly being enforced, the practice of BCM will no doubt continue to evolve.

Regulation and Standards

Regulatory compliance is a key consideration when planning and strategizing about how to develop and implement an enterprise business continuity program. Business continuity is not only good business practice but also a discipline enforced by regulatory requirements.

Financial industry-specific BCM standards and guidelines are emulated as best-of-breed approaches by businesses in other industries. For example, the BCM IT Handbook from the Federal Financial Institutions Examination Council (FFIEC) is a standard often relied on by businesses in other industries because of the high level of rigor involved in its development and its high expectations for financial institutions.

The challenge with BCM standards is that they are not intended to address, nor can they reasonably account for, the nuances of every industry or the uniqueness of the company implementing the program. Competing priorities often bend to budgetary limitations, which affects both the timeline for execution and the quality of the standards that are eventually implemented. Discipline on the part of management and a reasonable right-sized approach are critical to success for those industries not under the microscope of a governing body.

Addressing Concentration Risk

For asset portfolios of financial services institutions, diversification is a long-standing method of addressing concentration risk, especially when assets are flagged as overweight in a specific class or to a specific counterparty. Diversification strategies typically have the support of investors and industry and often are touted by regulators as a viable method to decrease risk. Additionally, stress testing can be used to generate metrics on concentration risk in portfolios and methods to alleviate it. 

Unlike investable assets, the operational risk of firms cannot be managed with modern portfolio theory, which enables optimization using tens of thousands of computer simulations to determine possible risk outcomes. Instead, redundancy, separation and resilience must be built into operations themselves. The ability to understand the operational impacts of concentration risk is limited to testing, and the scripts and iterations of tabletops, simulations and live testing used by most organizations can sometimes provide a false sense of confidence about the effectiveness of those strategies.

A more effective approach is for organizations to continuously weigh their concentration risk against cost and the ability to recover from an extreme but plausible event within an appropriate predefined time. Beyond business continuity and in the operational resilience realm, these predefined times will be measured not by the traditional recovery time objective (RTO) but by the impact tolerance of the most vulnerable stakeholder. As the focus on resilience increases, regulators are more likely to deem impact tolerance to be the proper challenge to concentration risk. If a firm cannot recover an important service or process within the defined impact tolerance, investment in enhancements will be required.
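The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the service names, the hour figures and the field names (estimated_recovery_hours, impact_tolerance_hours) are hypothetical, and a real program would derive them from testing and stakeholder analysis rather than hard-coding them.

```python
from dataclasses import dataclass


@dataclass
class Service:
    name: str
    # Hypothetical figure: how long recovery is expected to take, based on testing.
    estimated_recovery_hours: float
    # Hypothetical figure: the longest disruption the most vulnerable
    # stakeholder can tolerate.
    impact_tolerance_hours: float


def services_needing_investment(services):
    """Return names of services whose recovery time exceeds impact tolerance."""
    return [
        s.name
        for s in services
        if s.estimated_recovery_hours > s.impact_tolerance_hours
    ]


# Illustrative data only; none of these values come from the article.
services = [
    Service("payments", estimated_recovery_hours=6.0, impact_tolerance_hours=4.0),
    Service("customer portal", estimated_recovery_hours=8.0, impact_tolerance_hours=24.0),
]

flagged = services_needing_investment(services)
print(flagged)
```

In this made-up data set only the payments service is flagged, because its estimated recovery time is longer than its impact tolerance.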

When contemplating concentration risk, firms should look beyond the usual BCM functions and approach the issue through a resilience lens. Here are additional key considerations:

  • Assessing the geographic concentration of critical assets should not be based on a specific number of miles or kilometers between two locations performing the same function. It should be based on those factors that can simultaneously affect the two locations regardless of the distance between them.
  • Understanding how a workforce will respond to a geographically specific event and leveraging relevant contingency plans are critical. Technology will allow for an anywhere-anytime workforce, which is the standard all firms should strive to achieve going forward.
  • Assessing and monitoring third-party risk is important, but the effort should not end there; fourth- and fifth-party risks should also be assessed and monitored. As the use of shared services such as the cloud expands, firms must be diligent about not allowing outsourced functions to inadvertently exacerbate concentration risk.
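
One way to picture the first and third considerations is to list, for each location, the factors it depends on and then look for overlap between locations that perform the same function, regardless of distance. The sketch below is a hypothetical illustration: the site names, functions and factor labels (power grids, a telecom carrier, a cloud vendor) are invented, and the factor sets could equally include fourth- or fifth-party providers.

```python
from itertools import combinations

# Hypothetical inventory: each site lists the function it performs and the
# external factors it depends on. None of these names come from the article.
sites = {
    "primary_dc": {
        "function": "payments",
        "factors": {"grid_east", "carrier_a", "cloud_vendor_x"},
    },
    "backup_dc": {
        "function": "payments",
        "factors": {"grid_west", "carrier_a"},
    },
    "call_center": {
        "function": "support",
        "factors": {"grid_east"},
    },
}


def shared_exposures(sites):
    """Find pairs of sites with the same function that share a dependency."""
    findings = []
    for a, b in combinations(sorted(sites), 2):
        if sites[a]["function"] != sites[b]["function"]:
            continue  # only sites meant to back each other up are compared
        common = sites[a]["factors"] & sites[b]["factors"]
        if common:
            findings.append((a, b, sorted(common)))
    return findings


exposures = shared_exposures(sites)
for first, second, factors in exposures:
    print(f"{first} and {second} share: {', '.join(factors)}")
```

Here the two payments sites sit on different power grids yet rely on the same carrier, which is the kind of common factor that distance alone would not reveal.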

Focusing on the Virtual

Though its abilities and limits have yet to be tested, cloud computing presents the most promising opportunity for organizations to increase their resilience against extreme but plausible events. By creating a virtual computing environment, the cloud becomes a scalable ecosystem where damage to, or the inoperability of, any single component of the ecosystem would not have a significant effect on the overall system. Beyond cost, what makes the cloud enticing to organizations is that, within a single migration, it can achieve the highest form of resilience, accounting for any type of extreme but plausible event. 

Most services and operational processes struggle to achieve the same level of resiliency the cloud affords. This challenge should not preclude organizations from applying the benefits of the cloud in internal development or using them to guide the redesign of existing processes, so that their businesses achieve the highest form of resilience possible for all processes. Striving to achieve a process where there is no singularity of reliance, where redundancy exists, and where there is virtualization, will relieve stress on the system.

As we navigate the effects of COVID-19, the virtualization of processes becomes even more relevant. There will undoubtedly be a different “normal” environment when workers eventually return to the workplace. The new environment will assuredly increase the distribution of the workforce for many, and current remote working arrangements may remain intact for personal, firm or governmental reasons. Regardless of the motive, firms should consider moving to a virtual environment not only to accommodate the new normal working environment but also to increase the overall resilience of their operations.
