How To Write a Business Continuity Paper

I’ve recently observed people asking for research paper ideas when the subject is not strictly defined in the assignment requirements. If your professor gives you flexibility on the industry or topic to cover, I recommend writing about your target industry for employment and publishing your work on LinkedIn (if you don’t have a blog or a relationship with established editors).

This approach will come in handy when the hiring manager asks you for a writing sample or views your LinkedIn profile and discovers that you have written about how to solve one or more of their problems.

@MalwareTechBlog is the researcher who stopped the WannaCry outbreak.

It will also come in handy when prospective team members conduct their prerequisite cyber stalking.

On top of that, it is much easier to research a subject that you’re passionate or knowledgeable about.

See how knowledge sharing is mutually beneficial?

The abstract below may seem familiar because it appeared in my last post about how to write an abstract. I’m including it here with the accompanying research for a complete product.

Btw, a colleague at Splunk pointed out that abstracts are required for call for papers submissions at conferences, so don’t sleep on developing this skill set. Thanks Paul Daigle.

Abstract

The purpose of this paper is to examine critical aspects of business continuity design for the fictitious ABC Power Utility Company, including preparation and testing options.

The problem with inadequate continuity of operations planning is that ABC can suffer revenue losses, loss of shareholder value, and extended business disruptions without procedures in place to respond to a disaster.

The paper will explore specific plan development, operational recovery choices, and assessment requirements. The examination method will involve a two-year testing schedule proposal with a discussion of when it is appropriate to perform certain types of contingency reviews, including full backup and recovery.

Only five business units will be evaluated: I.T., Nuclear, Customer Service, Fossil, and Regulatory Affairs. The role of digital forensics, cost considerations, and business unit rankings will be presented to provide a holistic view of the risks and consequences of disruption. The report will conclude with a summary of key findings.

Keywords:  business continuity, business impact analysis, testing, IT contingency planning, digital forensics, backup and recovery, critical infrastructure, NERC-CIP

Introduction

The Industrial Control Systems Cyber Emergency Response Team (ICS-CERT) audited the energy industry and found that it “faced more cyber-attacks than any other industry sector from October 2012 through May 2013” (Dark Reading, 2013).

Despite the warnings, industrial control systems within power companies are not prepared to respond to a cyber-attack.

According to the 2016 ICS Threat Briefing, the U.S. power grid experienced a cyber or physical attack every 4 days and 44% of survey respondents said they were unable to identify the source of attacks (Booz Allen Hamilton, 2016).

The aforementioned ICS-CERT audit and ICS survey results demonstrate a lack of preparedness and gaps in forensics capabilities.

A crucial and complementary part of any incident response plan is a business continuity plan, which details how companies will return to normal operations in the event of a large scale attack or other IT related disaster.

In order to help one utility business prepare to respond to a disruptive cyber event, the student is conducting research for ABC Power Utility Company, a fictitious critical infrastructure organization that generates, transmits and distributes electricity service.

This paper will examine IT resource contingency planning and processes at ABC. The report will explore the preparation steps, recovery measures, and ways to assess whether the plan will be adequate to return the company to normal business operations in the event of a natural disaster or cyber-attack.

The scope of the report will be limited to ranking five critical business units: I.T., Nuclear, Customer Service, Fossil, and Regulatory Affairs. Additionally, the student will propose testing recommendations over the next twenty-four months with rankings based on the criticality of the asset or function.

Finally, cost implications for personnel, equipment, and production expenses will be presented. The report will conclude with lessons learned.  Let’s begin with the planning steps review.

Planning Steps

ABC Power should use an established methodology when developing its contingency plan. Since power utilities are governed by NERC-CIP, the student recommends following the Security Guidelines for the Electricity Sector published by the North American Electric Reliability Corporation (2011).

Steps should include a business impact analysis (BIA), business unit alignment for recovery strategies, a plan framework, training and testing exercises, and policy/procedural maintenance. The following sections will discuss each step in detail.

Business Impact Analysis

A business impact analysis consists of data gathering exercises to determine critical processes, systems, business units, and other material information that are vital to business operations and goodwill of the company (Sikdar, 2011).

ABC Power should hire an independent BIA consultant to conduct interviews, workshops, and other information gathering sessions to ensure that all relevant data is captured. Engaging a specialist will ensure that management receives an unbiased analysis of the current resource capabilities and constraints.

There are three main functions of a BIA:

  1. Prioritize business units and data based on recovery point objective, recovery time objective, dependency on parent process, and quantitative financial impact.
  2. Define qualitative impact, including effects on morale, reputation, and public relations.
  3. Define quantitative impact, including a matrix showing critical business units and the likelihood of disruption from a human, natural, or technological disaster.

Based on professional experience at a major power utility, the student recommends that ABC Power focus on five critical business units:

  • I.T.
  • Nuclear
  • Customer Service
  • Fossil
  • Regulatory Affairs

The following matrix summarizes important data points.

Matrix Definitions

  • Financial impact = low (< $1 million), medium ($1 to $5 million), high (> $5 million)
  • Recovery time objective = target time within which a service or process must be restored after a disruption
  • Recovery point objective = maximum period, measured back from the disruption, over which data loss is acceptable
  • Response time rankings = immediate, moderate (one week), or low priority (over a week)
  • Qualitative impact (morale, reputation, PR) = ranked from 1-10 with higher numbers indicating worse impact

Note: I used a table here because professors and bosses like tables. Do you table?

| Business Unit | Major Organization Activity (MOA) | Financial Impact | Recovery Point Objective (hours) | Recovery Time Objective (hours) | Response Time Ranking | Impact on Morale, Reputation, PR |
|---|---|---|---|---|---|---|
| I.T. | Provides all essential information technology services to all business units | High | 1 | 2 | Immediate | 9 |
| Nuclear | Provides nuclear energy required to produce electricity | Medium | 2 | 4 | Immediate | 8 |
| Customer Service | Establishes new accounts, takes customer payments, and handles all account inquiries | High | 1 | 2 | Immediate | 10 |
| Rating and Distribution | Sets electricity rates and sells excess supply to other providers | Medium | 24 | 48 | Moderate | 8 |
| Regulatory Affairs | Manages statutory compliance and relationships with government entities | Medium | 24 | 48 | Moderate | 7 |
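
To show how the matrix could feed a recovery ranking, here is a minimal Python sketch. The row values are copied from the matrix above. The names `BusinessUnit`, `FINANCIAL_RANK`, and `recovery_priority` are hypothetical, and the tie-breaking order (shortest recovery time objective, then shortest recovery point objective, then highest financial impact, then worst qualitative impact) is an assumption for illustration, not a method prescribed by NERC or the cited sources.

```python
from dataclasses import dataclass

# Ordering of the financial impact bands from the matrix definitions.
FINANCIAL_RANK = {"Low": 1, "Medium": 2, "High": 3}


@dataclass(frozen=True)
class BusinessUnit:
    name: str
    financial_impact: str  # "Low", "Medium", or "High"
    rpo_hours: int         # recovery point objective
    rto_hours: int         # recovery time objective
    response: str          # "Immediate", "Moderate", or "Low"
    qualitative: int       # 1-10, higher means worse impact


# Values copied from the matrix in this section.
UNITS = [
    BusinessUnit("I.T.", "High", 1, 2, "Immediate", 9),
    BusinessUnit("Nuclear", "Medium", 2, 4, "Immediate", 8),
    BusinessUnit("Customer Service", "High", 1, 2, "Immediate", 10),
    BusinessUnit("Rating and Distribution", "Medium", 24, 48, "Moderate", 8),
    BusinessUnit("Regulatory Affairs", "Medium", 24, 48, "Moderate", 7),
]


def recovery_priority(unit: BusinessUnit) -> tuple:
    """Assumed sort key: tightest objectives first, then largest impact."""
    return (
        unit.rto_hours,
        unit.rpo_hours,
        -FINANCIAL_RANK[unit.financial_impact],
        -unit.qualitative,
    )


ranked = sorted(UNITS, key=recovery_priority)
for position, unit in enumerate(ranked, start=1):
    print(position, unit.name, f"(RTO {unit.rto_hours}h, RPO {unit.rpo_hours}h)")
```

Under these assumptions, Customer Service and I.T. tie on every objective, so the qualitative score decides which one is listed first. A real BIA would weight these factors with input from the business units.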

Business Unit Alignment for Recovery Strategies

Although the matrix identified five critical business units, ABC should include the legal department in the BIA process to ensure that no statutory requirements were overlooked in the analysis.

The BIA questionnaire should also include a section to document critical business units that stakeholders deem relevant (Nicoll & Owens, 2013). If there is a pattern of consensus for departments that were not involved in the BIA process, management should consider engaging those units.

After all BIA responses have been captured and analyzed, management should carefully review the resulting BIA document. It will recommend ways to address gaps in the IT contingency planning and to mitigate the impact of losses from a disaster or attack.

Once all business units agree on the BIA content and approve the plan to improve resiliency, the organization must provide the necessary budget to implement the recommended changes.

Specifically, the business alignment process means that the business and IT should agree on the amount and configuration of hardware, services, and capabilities that are required to meet the desired recovery objectives.

Both groups will also have to agree on a time schedule of when these services and capabilities will be available and align an appropriate strategy for testing. Once these objectives are met, the recovery plan for the business units can be codified and adopted as the appropriate go-forward strategy. The next step in the plan is adopting a framework.

Contingency Framework

As an electricity provider, ABC will be best served by using the established framework provided in NERC CIP-009-6: Recovery Plans for BES Cyber Systems (2014). NERC provides guidelines for identifying, classifying, and prioritizing a number of critical systems for IT resiliency planning, including:

  • Bulk electric
  • Control centers
  • Digital access control
  • Monitoring
  • Physical access control
  • Mission critical servers and applications
  • Supervisory Control and Data Acquisition
  • Industrial Control

In addition to the systems classification and prioritization, recovery team members should be established and documented. At minimum, this should include an executive level contact for each business unit, response team lead, legal liaison, and public relations representative.

Finally, the framework should include fully documenting the business continuity and IT recovery procedures.

ABC must also plan regularly scheduled communication campaigns to ensure that all employees are aware of how to locate procedures and respond appropriately in the event of a cyber-attack or natural disaster.

After a framework is approved, ABC should coordinate recovery team training and plan testing.

Proposed 24 Month Cycle Business Contingency Testing Plan

ABC Power’s business continuity/IT recovery plan should incorporate forensics response readiness by documenting investigative tools, data sources, and internal skill sets in order to determine preparedness levels.

Additionally, forensics can aid in business continuity with use case scenarios to identify weaknesses in controls and processes by identifying whether data sources are sufficient, accessible, and able to be analyzed by staff (Majore et al., 2014).

More importantly, proper forensics readiness can ensure that companies can provide legally admissible evidence in the event of a crime.

ABC executives should remember to update their contingency plans within thirty days of a material change in systems or a change in the personnel assigned to a designated role in the recovery process.
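
The thirty-day rule is simple date arithmetic, and it can be tracked with a few lines of Python. This is a sketch only: the function names `plan_update_deadline` and `is_overdue` are hypothetical, and the choice to count calendar days rather than business days is an assumption.

```python
from datetime import date, timedelta
from typing import Optional

# Thirty calendar days, per the recommendation above (calendar days are assumed).
UPDATE_WINDOW = timedelta(days=30)


def plan_update_deadline(change_date: date) -> date:
    """Return the last day on which the plan update is still on time."""
    return change_date + UPDATE_WINDOW


def is_overdue(change_date: date, plan_updated_on: Optional[date], today: date) -> bool:
    """True when a material change has no matching plan update after the window closes."""
    if plan_updated_on is not None and plan_updated_on >= change_date:
        return False  # the plan already reflects the change
    return today > plan_update_deadline(change_date)


change = date(2024, 1, 15)
print(plan_update_deadline(change))
print(is_overdue(change, None, date(2024, 3, 1)))
```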

Finally, response teams should understand the contents of applicable service level agreements (SLAs). The SLA documents the agreed upon terms and conditions, responsibilities, qualities, availability, and other expectations between ABC and third party service providers involved in IT infrastructure and supply chain management (Blos et al., 2010).

Although SLAs are typically managed at the business level, the response team lead and IT management should be knowledgeable of the aspects of the SLA that could impact their ability to quickly restore services in the event of a natural disaster, breach, or other technological disruption.

Ranking and testing suggestions are evaluated next, which should incorporate any specific caveats and/or nuances relevant to the target business unit.

System Rankings:

  • Tier 1: mission critical, with an SLA of a two-hour recovery time objective to fully operational status
  • Tier 2: secondary, with an SLA of restoration within four hours with some degradation and full recovery within twenty-four hours

The tiered mechanism is relevant for two reasons.

First, in order to assign the appropriate level of resourcing and business/IT alignment, it is necessary to define what is considered a “critical” system, as opposed to a system that may be critical to one business unit but is not required for the continuation of business operations.

Second, by leveraging a multi-tiered model for assigning business assets, organizations can more succinctly identify the core components of their business and make better decisions on how best to use finite financial resources.
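
The tier definitions can be captured as data so that test results are checked against the same targets every time. The following Python sketch assumes a single yes/no question (is the system required for the continuation of business operations?) decides the tier. The names `TierSla`, `assign_tier`, and `meets_sla` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TierSla:
    tier: int
    degraded_restore_hours: Optional[int]  # None when no degraded stage is defined
    full_recovery_hours: int


# Targets taken from the system rankings above.
TIER_SLAS = {
    1: TierSla(tier=1, degraded_restore_hours=None, full_recovery_hours=2),
    2: TierSla(tier=2, degraded_restore_hours=4, full_recovery_hours=24),
}


def assign_tier(required_for_operations: bool) -> TierSla:
    """Assumed rule: only systems required to keep the business running are Tier 1."""
    return TIER_SLAS[1 if required_for_operations else 2]


def meets_sla(sla: TierSla, hours_to_full_recovery: float) -> bool:
    """Compare a measured full recovery time against the tier's target."""
    return hours_to_full_recovery <= sla.full_recovery_hours


scada = assign_tier(required_for_operations=True)
print_services = assign_tier(required_for_operations=False)
print(scada.tier, meets_sla(scada, 3))
print(print_services.tier, meets_sla(print_services, 20))
```

In practice the tier decision would come out of the BIA and NERC-CIP classification rather than a single flag. The point of the sketch is that the SLA targets live in one place.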

Testing and Training Schedule:

  • ABC should update snapshots of Tier 1 and Tier 2 systems at least monthly to reduce the amount of time and data needed for a full backup and restore.
  • The company should perform recovery team training through tabletop exercises quarterly for all Tier 1 assets as determined by NERC-CIP and business units.
  • ABC should perform full backup and restore tests on Tier 1 assets at least every six months. Full backup and restore testing should be performed on Tier 2 assets at least annually.
  • The company should review and update the business continuity and IT recovery plans annually based on test results, regulatory changes, and other business drivers.
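
The schedule above can be expanded into a month-by-month calendar for the twenty-four month cycle. The Python sketch below assumes each activity falls in the last month of its interval (for example, quarterly exercises in months 3, 6, 9, and so on), which is a scheduling choice made for illustration. The annual interval for Tier 2 restore tests is taken from the summary of key findings. `ACTIVITY_INTERVALS` and `build_schedule` are hypothetical names.

```python
# Interval in months for each activity in the testing and training schedule.
ACTIVITY_INTERVALS = {
    "Snapshot update (Tier 1 and Tier 2)": 1,
    "Tabletop exercise (Tier 1)": 3,
    "Full backup and restore test (Tier 1)": 6,
    "Full backup and restore test (Tier 2)": 12,  # annual, per the summary of key findings
    "Plan review and update": 12,
}


def build_schedule(total_months: int = 24) -> dict:
    """Map each month number to the activities due in that month."""
    schedule = {}
    for month in range(1, total_months + 1):
        schedule[month] = [
            name
            for name, interval in ACTIVITY_INTERVALS.items()
            if month % interval == 0  # assumed: due at the end of each interval
        ]
    return schedule


schedule = build_schedule()
for month, activities in schedule.items():
    print(f"Month {month:2d}: " + "; ".join(activities))
```

Over the full cycle this yields twenty-four snapshot updates, eight tabletop exercises, four Tier 1 restore tests, two Tier 2 restore tests, and two plan reviews.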

Systems Rankings Summary Sample:

| Tier 1 | Tier 2 |
|---|---|
| SCADA Systems | Non-essential file servers |
| Advanced Metering Infrastructure | Print services |
| Bulk Electric Systems | Instant messenger |
| Access Control Systems | Non-critical application servers |
| Payment Processing Systems | Training servers |

Possible Recovery Options

ABC Power should have hot and warm site options available for immediate use in the event of a disaster.

A hot site is a near exact replica of existing infrastructure, including data, networks, servers, employee work stations, and telecom services.

A warm site is a location with partially implemented services that would not require an extended amount of time to return the business to normal operations (Hatton et al., 2016).

Although maintaining hot and warm sites can be costly, this approach is highly recommended due to ABC’s role in providing critical services to the public.

Other recovery options include restoring images to the last known good snapshot of the endpoint or performing a physical-to-virtual migration where feasible.

If a host is destroyed or taken offline by a malicious actor, it may be possible to perform a full restore from a snapshot of the image.

Alternatively, virtualization service providers such as VMware offer platform capabilities to convert or clone operating systems on physical machines to virtual machines hosted in the cloud, with a single user interface to manage multiple guest hosts (VMware, n.d.).

The writer has presented the business impact analysis, business unit alignment for recovery strategies, framework, training/testing exercises, and recovery options. ABC should develop its policy and procedural maintenance process in the final planning step.

Maintenance considerations will be reviewed next.

Maintenance

Business continuity and IT contingency planning is not a one-time event. The program requires ongoing maintenance to be successful. Examples of critical activities include ongoing and regularly scheduled:

  • Risk assessments
  • Business impact analyses
  • Critical asset and process identification
  • Regulatory due diligence
  • Service provider audits
  • Identification of single points of failure
  • Review of plans, including crisis management
  • Updating of policies and procedures

Given these scenarios, ABC Power should ensure that these activities are incorporated as business as usual processes when performing capacity planning. Note that these enhancements are not free. The next section will cover cost considerations.

Cost Considerations

Many companies fail to include the total cost of ownership when implementing a business continuity plan.

ABC Power should be aware of the additional infrastructure costs associated with maintaining hot and warm sites. Personnel could be required to manage each site, along with security systems, technical staff, and cyber security resources to keep the systems safe.

If the hot site is implemented, the company will have to keep spare equipment available, such as computing and telecom, to enable employees to return to work at a different location within hours of a disruption.

There are also risks associated with performing full backup and restore testing. Employee time will need to be budgeted to conduct the activities and the company could suffer loss of revenue if the backups are corrupted or otherwise unusable.

If the company decides to use the physical to virtual migration with a vendor such as VMware, there could be commercial licensing and professional technical service costs incurred. ABC may also need to budget for digital forensics, incident response, public relations and legal services in the event of a breach.

Depending on the use case, any of the scenarios above could cost millions of dollars. Therefore, ABC should ensure that money is allocated to cover unexpected expenses associated with the business continuity and IT contingency plans.

Summary of Key Findings

This paper examined IT resource contingency planning and processes at ABC. The report explored the preparation steps provided in NERC-CIP guidelines for securing power utilities.

The planning steps included a business impact analysis (BIA), business unit alignment for recovery strategies, framework, training and testing exercises, and ongoing maintenance.

The scope of the report was limited to ranking 5 critical business units: I.T., Nuclear, Customer Service, Fossil, and Regulatory Affairs with additional considerations for the legal department.

The recommended recovery measures were hot and warm sites since ABC provides critical services to the public. The student also proposed virtualization as another recovery option.

Additionally, 24-month testing recommendations were provided for assets and training.

The writer proposed monthly snapshots, quarterly tabletop exercises, semi-annual full backup and restore tests, and annual plan reviews for Tier 1 assets.

Monthly snapshots and annual full restore tests were recommended for Tier 2 assets.

Finally, cost implications for personnel, equipment, and production expenses were provided to facilitate a holistic approach to plan development.

ABC Power should work with business units to align priorities and allocate budget for successful implementation of the recommendations set forth in this report.

References:

Blos, M. F., Hui-Ming, W., & Yang, J. (2010). Analysing the external supply chain risk driver competitiveness: A risk mitigation framework and business continuity plan. Journal Of Business Continuity & Emergency Planning, 4(4), 368-374.

CIP-009-6 - Cyber Security - Recovery Plans for BES Cyber Systems (2014) North American Electric Reliability Corporation. Retrieved from: http://www.nerc.com/pa/Stand/Prjct2014XXCrtclInfraPrtctnVr5Rvns/CIP-009-6_CLEAN_06022014.pdf

Hatton, T., Grimshaw, E., Vargo, J., & Seville, E. (2016). Lessons from disaster: Creating a business continuity plan that really works. Journal Of Business Continuity & Emergency Planning, 10(1), 84-92.

Majore, S., Yoo, H., & Shon, T. (2014). Secure and reliable electronic record management system using digital forensic technologies. Journal Of Supercomputing, 70(1), 149-165. doi:10.1007/s11227-014-1137-6

Nicoll, S. R., & Owens, R. W. (2013). Emergency Response & Business Continuity. Professional Safety, 58(9), 50-55.

Security Guidelines for the Electricity Sector: Business Processes and Operations Continuity (2011) North American Electric Reliability Corporation. Retrieved from: http://www.nerc.com/comm/CIPC/Security%20Guidelines%20DL/Business%20Continuity%20Guideline%20Version%202%2017%20Clean.pdf

Sikdar, P. (2011). Alternate approaches to business impact analysis. Information Security Journal, 20(3), 128-134. doi:10.1080/19393555.2010.551274

Survey: Majority Of Energy IT Professionals Do Not Understand NERC CIP Version 5 Requirements (2013, November 21) Dark Reading. Retrieved from: http://www.darkreading.com/risk/survey-majority-of-energy-it-professionals-do-not-understand-nerc-cip-version-5-requirements/d/d-id/1140939?print=yes

VCenter Converter Products (n.d.) VMware. Retrieved from: http://www.vmware.com/products/converter.html

When The Lights Went Out-A Comprehensive Review of the 2015 Attacks on Ukrainian Infrastructure (2016) Booz Allen Hamilton. Retrieved from: https://www.boozallen.com/content/dam/boozallen/documents/2016/09/ukraine-report-when-the-lights-went-out.pdf
