System Landscapes
Best Practice for Solution Management
Version Date: May 2008
The newest version of this Best Practice can always be
obtained through the SAP Solution Manager
Table of contents
1 Introduction
1.1 Goal of Document
1.2
1.2.1
1.2.2 System Recovery and Business Recovery: The two major steps of recovery
1.2.3
1.2.4
1.2.5
1.2.6
1.3
2 Stage 1: Initiation
2.1 Scoping Study
2.2
2.2.1
2.2.2 Responsibilities
3.1.1
3.1.2
3.1.3
3.1.4
3.1.5
3.2
3.2.1
3.2.2
3.2.3
3.2.4 CFIA Matrix
Risk Assessment
3.3
3.3.1
3.3.2
3.4
3.5
3.5.1
3.5.2 Change Management
3.6
3.6.1
3.6.2
3.6.3
3.6.4
3.6.5
3.7
3.8
3.9 Agree on Recommendations
4.1 Establish Organization
4.2
4.2.1 Recovery Plans
4.3 Crisis Management
4.4
4.5
4.6
4.6.1 Example: Extraction of a Recovery Plan for the Incomplete Recovery of an SAP CRM System
4.7
4.7.1
4.7.2
4.8 Recovery Testing
4.8.1
4.8.2 Initial Testing
5.1 Create Awareness
5.2
5.3
5.4
5.5
Conclusion
Appendix
7.1
7.2

2008 SAP AG
1 Introduction
1.1 Goal of Document
This Best Practice provides the SAP view on establishing a business continuity concept for an SAP
environment, in the style of the ITIL (the IT Infrastructure Library, see http://www.itil.co.uk) approach
for IT Service Continuity Management. It outlines a general procedure on how to set up a business
continuity concept for an SAP system landscape and SAP business processes, including the
identification of different risks and failure situations, which impact continuity of operations or data
consistency.
This document introduces a methodology to be used in a Business Continuity Management project,
with the different project phases ranging from the analysis of the business requirements to the
identification of adequate risk mitigation measures and recovery plans, including necessary
documentation and operating procedures.
The continuity requirements are determined working top-down from the requirements of the core
business processes, to the requirements of the underlying systems and technical components.
Following this methodology, customers will gain a deep insight into their SAP environment supporting
core business functions. Even though customers might already have technical solutions in place to
safeguard the operation of the technical system landscape, a Business Continuity Management
project will yield a sound definition of the requirements and possibly come up with gaps that were not
yet addressed.
The approach described in this document does not stop at a technical level but also addresses
possible risks on the application layer, like continuity of business processes, or consistency of data
objects. Resulting from a Business Continuity Management project, a customer will have documented
possible workarounds to sustain core functionality and will have created recovery plans for technical
failures and application-related logical errors.
The concept described in this document is intended to be followed in the early stages of a project, but,
if not available, a continuity concept can also be established later on, during productive operations.
Being familiar with the ITIL documentation and its general approach to IT continuity management is
not a prerequisite for working with this document, but may help in appreciating the different phases
described here for carrying out a business continuity project in an SAP environment.
The SAP Continuity Management Service can support customers with their concepts to safeguard
the continuity of business operations. This service analyzes a customer's continuity concept, mainly
focusing on technical options to protect the involved systems and application data. It helps to identify
gaps and discusses options to optimize protection against business disruptions due to technical or
application failures. The service can also assist a customer in the planning phase for a business
continuity concept or with reviewing different milestones during a business continuity project. For more
information on the SAP Continuity Management Service see http://service.sap.com/continuity.
interruption to, or a reduction in, the quality of that service. While minor incidents can be handled by
the Service Desk (ITIL Incident Management), a major business disruption or disaster is an
incident that needs to be reported to the business continuity crisis team, because it could seriously
impact the availability of one or more business processes. A major business disruption that stops a
critical business process from operating may require invoking a business continuity plan.
A business continuity plan (BC plan, also called disaster plan or recovery plan) elaborates on how
to operate critical business processes on a pre-determined minimum acceptable level by using an
alternative process and on how to recover the affected business process or the affected components
back to normal operation. The decision whether an incident will be escalated to a disaster and whether
a business continuity plan will be activated is up to the business continuity crisis team. This decision
will be taken depending on the time and impact of an outage and may differ depending on the
business process being affected.
The main goal of BCM is to establish procedures in an organization that allow handling such major
business disruptions by describing alternative procedures and possible recovery methods as well as
implementing risk reduction measures and recovery technologies.
However, BCM does not end with the creation of disaster recovery plans. There is a whole lifecycle to
BCM, which needs to be established with a business continuity project. Having established business
continuity procedures, BCM needs to ensure that the continuity plans will become part of change
control management. The plans need to be updated whenever changes are applied to business
processes, be it changes to IT or changes in business operation. BCM must also introduce education
and awareness for business continuity throughout the organization and has to establish regular testing
to ensure operability of the described procedures. Each stage of the business continuity lifecycle,
which is further outlined in section 1.3, will be discussed in a separate chapter of this document.
- Technical failure or disaster: This can range from crashes of individual hardware
  components to building fires or flooding of an entire computer center. Technical failures affect
  all business processes that use the affected component(s).
- Logical failure: Faulty software or incorrect use of software may corrupt data and cause
  data inconsistencies that disrupt business processes. If, for instance, a
  malicious program deployed in an ERP system corrupts master and transactional data,
  executing an order-to-cash process may become impossible. Misuse of inventory
  management software may also bring production to a halt because necessary goods were not
  reordered on time at the factory.
  A logical failure may also be the result of resolving a technical failure: Point-in-time recovery or
  data loss in one system of a federated system landscape will result in data inconsistencies
  between the systems that need to be addressed before resuming regular operations.
  As these examples demonstrate, logical errors or data inconsistencies can have two
  dimensions:
  - inconsistent data within one system (for example, order data was accidentally deleted from
    the database)
  - inconsistent data between two systems of a system landscape (for example, orders are not
    consistent between an ERP and a CRM system)
- Logistical or operational failure (not in scope of this document): Apart from IT processes,
  business operation depends on many operational or logistical aspects. Required staff need to
  be available and facilities need to be accessible. Emergency plans for logistical aspects need
  to make sure that equipment, meeting places and workspaces can be made available in case
  of a disaster.
Since this document focuses on IT Service Continuity Management, logistical or operational aspects
falling into the third category will not be covered in the remainder of this document.
1.2.2 System Recovery and Business Recovery: The two major steps of recovery
Usually, if a business process is unavailable, the availability of all involved systems needs to be
checked first. If a system is unavailable, system recovery (or technical recovery) has to reestablish
technical availability of the system as a first step. This can be done, for example, by replacing a
defective hardware component, by activating a standby system, or by restoring a database from a
backup.
In most cases, resolving the technical error immediately returns the systems and processes to
regular operation. However, if the method used to resolve the technical error causes data loss
for a system component (for example, a point-in-time (incomplete) database recovery or the
activation of an asynchronous standby solution), system recovery leaves the landscape in a state
that requires further analysis of data consistency between its systems.
Business recovery or logical recovery is always required in case of logical errors or data
inconsistencies appearing either inside one system or between systems of a system landscape. With
inconsistent or outdated data, wrong business decisions might be made, or inconsistencies in the
system may lead to unacceptable situations such as an ERP system sending invoices without
the materials having been delivered to customers.
As described above, logical errors or inconsistencies can be left over from a technical recovery
procedure but can also be a cause of disaster in their own right. In the latter case, usually only a
subset of business processes is affected because all technical components are available.
If a business process is unavailable due to a logical error (data corruption), the logical error needs to
be repaired. A BC plan should describe different ways to address data corruptions, for example by
extracting the correct data from some specially provided analysis system. Repairing logical errors
usually requires in-depth application knowledge.
Sometimes a technical measure is considered to resolve a logical error: recovering the affected
system to the point in time before the error was introduced by a user error or faulty
program (database restore followed by a point-in-time recovery). This procedure can indeed remove
the logical error from the system but, as we have seen above, the resulting data loss introduces a new
kind of logical error affecting data consistency between the systems of a federated landscape.
Resolving such inconsistencies again requires business recovery, now between multiple systems.
Note: The main challenge addressed in this document is to distinguish between the two types of
errors, technical and logical, since a business continuity plan needs to address both levels:
system recovery and business recovery.
personnel, computer system components and software, but also power supply, or other technical
facilities such as parts of the premises of the company itself.
The business continuity project has to evaluate which business processes need protection against a
business disruption. Depending on the importance of a process, it has to establish methods to recover
the process in case of a contingency. Critical core processes requiring immediate recovery can be
protected by high availability solutions for their critical system components and by alternative
implementations of the respective business process, to ensure operation after a disruption on a
minimum level acceptable to business. Less critical processes that can be unavailable for several
hours or days without a major impact on business can be sufficiently covered by recovery plans that
use fewer resources than the recovery plans for processes with immediate severe impact.
Since the list of possible disasters or disturbances is practically endless, business continuity plans
should not be based on specific scenarios such as a fire in building X. They are created on the
assumption that some key resources are lost or unavailable, which yields plans that apply to several
scenarios rather than a single one. Instead of preparing for very specific error scenarios, it is more
important to clearly understand and document all vital business functions in order to keep the business
running regardless of the particular nature of a disaster.
The following figure provides an overview of the different steps of a disaster recovery procedure to be
established by a business continuity plan:
Each of the following chapters of this document describes one stage of a BC project. At the beginning
of each chapter, a short summary table enumerates the personnel needed in the respective project
stage and lists the main deliverables of this stage.
In a project plan for a BC project, each of these stages resolves into a number of phases and
activities. The general course of a BC project is shown in Table 1. We will be following this structure
throughout this document and describe each of those different phases in more detail.
Table 1: General course of a BC project (columns: stage and phase, section, duration)

Stage 1 Initiation (sections 2.1, 2.2, 2.3)
  Scoping Study
  Define and set strategy
  Develop Project Plan
  Define project phases
  Define the project organization
  Define project control structure
  Identify initial costs

Stage 2 Requirements analysis and strategy definition (sections 3.1, 3.1.1, 3.1.2, 3.1.3, 3.1.4/5,
3.2, 3.2.1, 3.2.2, 3.2.3, 3.2.4, 3.3, 3.3.1, 3.3.2, 3.4, 3.5, 3.5.1, 3.5.2, 3.6, 3.6.1/2, 3.6.3,
3.6.4, 3.6.5, 3.7, 3.8, 3.9)

Stage 3 Implementation (sections 4.1, 4.2, 4.3, 4.4, 4.5, 4.6/7, 4.8, 4.8.1, 4.8.2)

Stage 4 (sections 5.1, 5.2, 5.3, 5.4, 5.5)
  Create awareness
  Establish education, training and exercises
  Establish a continuous review and change control process
  Ongoing risk evaluation and risk assessment
  Establish regular testing
  Establish Monitoring and Resolution of Findings
  Error prevention and error detection
  Clearing of inconsistencies
2 Stage 1: Initiation
Roles
Senior IT management
BCM Project Manager
Business Process Champions
Recovery Expert from Application Management/Business Process Operations (ability to translate business recovery requirements (application) into technical requirements and specifications)
Output
Scoping study
BC project plan
Initial costs
Project organization and control structure
The potential business impact that could result from the realization of the risks
External pressures
The best and most effective way to raise senior management awareness is to highlight potential risks
and business impact facing an organization in terms of business failure to meet key performance
indicators or corporate objectives.
As with most IT issues, business continuity crosses organizational boundaries and consumes
management time and financial resources. Sponsorship at the highest level and integration into the IT
structure is paramount to the success of a business continuity project. Without this level of
sponsorship, risks to business continuity include:
Misalignment with the business and IT strategies, thereby failing to address the true values
and business risks as perceived by senior management
Lack of extensive co-operation and input required from management at all levels
Allow responsibilities for ongoing business continuity to be clearly defined and allocated
Ensure that the organizational structure that manages business continuity during day-to-day
operation closely resembles the structure that will execute the recovery mechanisms in case
of a disaster
Ensure the business continuity strategy and requirements are integrated with the business
and IT strategies
Establishing the business continuity project team and business continuity responsibilities
Goal
During the project and after its completion, the typical organizational structure for large organizations
that supports both ongoing management and invocation of business continuity procedures is outlined
in the following figure.
Figure 3: Org chart of the business continuity project / Source: Office of Government Commerce
(OGC) www.itil.co.uk
Management sponsorship at board level is often provided by a person whose responsibilities
encompass most of the organization (for example IT). Day-to-day responsibility for business continuity
is often assigned to a senior manager, who advises the board on a business continuity strategy and
ensures that it is in line with business and IT strategies. The business continuity manager,
together with a team of business area managers (business process champions) and the IT service
continuity managers (application management), supervises change control, testing, auditing,
awareness, and training. Steering committees at senior management level co-ordinate business
continuity activities across the organization and support the business continuity manager.
The steering committee should meet regularly to confirm the business continuity strategy is still valid
and discuss changes that could affect the strategy as well as to review programs and procedures.
Management, such as application management within the IT organization, is typically given ownership
of the deliverables that relate to their area of expertise or responsibility. Ownership not only involves
responsibility for ensuring deliverables are met, but also for ensuring that they remain up to date and
fit for purpose as application management also owns the change control management standard.
Invocation of continuity mechanisms and recovery options is usually undertaken by one or several
business continuity teams, focused on specific areas of the IT organization (for example, external
communications, local area networks, servers, and so on). During periods of operational stability, the
service continuity teams play a vital role in the implementation, testing, maintenance and support of
these continuity or recovery procedures and plans.
IT may establish a working group that, typically, fills key roles in the IT recovery process and fills
operational management roles to deal with the continuity and availability management issues.
2.2.2 Responsibilities
Table 3 outlines the typical responsibilities for business continuity during times of normal operations,
as well as crisis operations. These layers of responsibility also correlate with the typical management
structure for business continuity (see org. chart in previous section).
Table 3: Responsibilities
Level
Task
Board
Senior Management
Define Scope
Management (Business process champion/Program Management Office)
Manage Contracts
Develop Requirements
Negotiate Contracts
Operate Procedures
Output
Scope of BC plan
Documentation of processes, system landscape, interfaces and data exchange
Risk analysis
Impact analysis / CFIA matrix
Service level requirements for processes and components
Recommended risk mitigation measures
Recommended recovery options
Monitoring objectives
Resource requirements
Agreed BC strategy
This stage provides the foundation to determine how well an organization will be able to handle a
business process interruption or disaster. As a result, risks to the business operation and their impact
will be well understood, which forms the basis for planning countermeasures as well as emergency
and recovery procedures.
Stage 2 consists of two main tasks:
Requirement and impact analysis, which identifies threats to continuity of services and
business processes, assesses the severity of these risks and defines the requirements as
service levels
Design of the business continuity strategy, which identifies possibilities to reduce the above
risks and determines the options to support a recovery
The following figure visualizes the main phases and the main tasks to be included in stage 2:
Figure 4: The main phases and tasks of Stage 2 - Requirements analysis and strategy definition
The staffing, skills, and services necessary to enable critical business processes to continue
operating at an acceptable service level for a limited time
The time within which the critical business processes should be recovered to fully operational
level
Different contingency cases, that is, which systems are most likely to fail and which business
objects are highly critical, for example, because their failure would affect a large number of
business processes
Workarounds that can reduce the criticality of a business process disruption by providing
substitute operations with usually reduced volume and functionality.
As a final step of stage 2, recommendations for risk reduction measures and recovery options need to
be agreed with all involved parties, especially from a management and cost perspective, before the
implementation stage and the creation of the BC plan can be started.
The next sections will describe these different phases in more detail and provide examples for their
output using some hypothetical business processes and a corresponding system landscape.
Example:
The following gives an example list of business processes that we will be using during the course of
this document:
Example:
Figure 7 shows an example production landscape of the SAP Business Suite implementation of SAP
CRM using ERP, CRM and BI. In addition to the SAP applications, a typical infrastructure often
includes several other SAP and non-SAP systems. The system landscape consists of several systems
that exchange data with each other via different interfaces. Therefore, it is important to also document
the type of interfaces. In an environment such as this, it is important that the business data that moves
between the systems is consistent and up to date.
SAP recommends using the SAP Solution Manager to document the system landscape and interfaces
for the supported business processes of the scenario in question.
Example: Business object flow between CRM and ERP for Opportunity/Sales order process
The business objects involved in the opportunity and sales order processing process are the business
partners, products, product catalog, price conditions, business partner hierarchies and the sales
orders themselves. The following graph shows the data flow of these objects between the SAP CRM
and the SAP ERP system.
Business partners are created and maintained in CRM and ERP and replicated between both systems
Business partner hierarchies, products and pricing conditions are created and maintained in
the ERP system and are replicated to CRM
Sales orders are created and maintained in CRM and are replicated to ERP
Example:
The following figure shows the aggregated business object data flow chart for the entire system
landscape resulting from section 3.1.4. The data flow to and from SAP BW is incomplete since this
was not part of the example above.
If, for instance, a severe database error forced the IT department to perform a point-in-time
recovery of the ERP system, the aggregated business object data flow chart would show for which
business objects this would cause data inconsistencies between the ERP system and the other
system components in the landscape (because the other systems would not have been set back in time).
Looking at the relationship to CRM, data consistency for business partners, business partner
hierarchies, pricing conditions, products and sales orders would need to be checked and
reestablished.
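The lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the replication flows are taken from the CRM/ERP example in this section, while the data structure `FLOWS` and the function `objects_to_check` are hypothetical names introduced here, not part of any SAP tool.

```python
# Replication flows from the CRM/ERP example: (business object, source, target).
# Business partners are maintained in both systems, hence two entries.
FLOWS = [
    ("Business partners", "CRM", "ERP"),
    ("Business partners", "ERP", "CRM"),
    ("Business partner hierarchies", "ERP", "CRM"),
    ("Products", "ERP", "CRM"),
    ("Pricing conditions", "ERP", "CRM"),
    ("Sales orders", "CRM", "ERP"),
]


def objects_to_check(recovered_system, flows=FLOWS):
    """Return, per neighbouring system, the business objects whose consistency
    must be checked after a point-in-time recovery of `recovered_system`."""
    result = {}
    for obj, source, target in flows:
        if recovered_system in (source, target):
            # The partner is the other end of the flow.
            partner = target if source == recovered_system else source
            result.setdefault(partner, set()).add(obj)
    return {partner: sorted(objs) for partner, objs in result.items()}


checks = objects_to_check("ERP")
for partner, objs in checks.items():
    print(f"Check against {partner}: {', '.join(objs)}")
```

Because every flow in the example touches the ERP system, all five business objects have to be checked against CRM, which matches the conclusion drawn in the text.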
Be of a regional or local nature (rivers with a higher risk of flooding, earthquakes, nearby
airport increasing the risk of plane crashes)
Impose a higher risk of being subject to malicious attacks, for example due to a special
company profile
Lie in accidents resulting from the company's business or nearby production facilities (for
example, explosions or fires)
For a reasonable assessment of the risks, the likelihood of occurrence should be rated for each
applicable threat. Risk mitigation in a later stage of the BC project will have to identify appropriate
countermeasures and options to reduce these risks or the impact they would have.
An additional method to identify risks is the Service Outage Analysis, which analyzes incidents which
led to service disruptions in the past and how these were handled (successfully or less successfully).
This provides an insight into threats that may still be imminent.
The potential loss that may be caused to the organization because of an interruption of critical
business processes
The form that the loss may take: lower income, higher costs, damaged reputation, immediate
and long-term loss of market share, loss of goodwill, loss of competitive advantage, and so on
The degree the loss is likely to escalate if not addressed in a timely manner
Since it is often difficult to express losses as an amount of money per day or week, it may be easier
to assess the impact on a scale from 1 (very low impact) to 10 (crucial impact that might jeopardize
your company as a whole).
Make a list of your critical core business processes and evaluate the impact if a process is not
operative for a specific period of time.
During this phase, you should also collect and document any existing service level agreements, such as
Recovery Time Objectives (RTO), for these processes. The RTO should be reviewed with the
business owners at a later stage, once all surrounding facts are understood (if, for example, a good
workaround is available for normal process operation, the RTO need not be very short, as we will see
in section 3.3.1).
Using the information obtained in this step, the business processes can be prioritized according to
their criticality and costs of outage. The criticality of a process will be an important criterion for the
preventive measures to protect against contingencies that will be planned at a later phase.
Example:
The following table provides a criticality rating for our example processes:
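Columns: Process | Impact after 4 hours | Impact after 1 day | Impact after 2 days | Impact after 1 week | Recovery Time Objective | Priority
(Only part of the cell values survive; the assignment of impact ratings to the time columns is not preserved.)
Marketing process: priority Critical
Sales order process: impact ratings 10, 10, 10; priority Highly critical
Reporting process: priority Less critical
Customer Service Process: impact ratings 10, 10; priority Highly critical
Surviving Recovery Time Objective value: 2 hours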
After evaluating your core business processes as outlined in the above example, you can distinguish
between processes that create an instant negative impact, like the sales order process above, and
processes that only yield a negative impact after a considerable amount of time, like the reporting
process.
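The distinction between instant and delayed impact can be made explicit with a small sketch. All ratings below are hypothetical placeholders on the 1 to 10 scale introduced above, and the threshold of 8 for "critical impact" is likewise an assumption chosen for illustration; neither is taken from the example table.

```python
# Time horizons used in the criticality rating.
HORIZONS = ["4 hours", "1 day", "2 days", "1 week"]

# Hypothetical impact ratings per horizon (1 = very low, 10 = crucial).
impact = {
    "Sales order process": [7, 10, 10, 10],
    "Reporting process": [1, 1, 2, 5],
}


def first_critical_horizon(scores, threshold=8):
    """Return the first horizon at which the impact reaches the threshold,
    or None if it never does within the rated horizons."""
    for horizon, score in zip(HORIZONS, scores):
        if score >= threshold:
            return horizon
    return None


# Rank processes so that those with the earliest, highest impact come first.
ranked = sorted(impact, key=lambda process: impact[process], reverse=True)

for process in ranked:
    print(process, "->", first_critical_horizon(impact[process]))
```

Sorting on the list of ratings compares the earliest horizon first, so a process that hurts the business within hours ranks ahead of one that only becomes critical after days.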
Example:
The following table provides an example of useful information that has to be provided by the business
process owners for all processes in scope of the BC project.
The task of each component should be described briefly and the impact of a component failure should
be considered. The impact would be less critical if a workaround is available, which should also be
noted in the table. The criticality depends on the overall importance of the business process, the
impact a failure has and the possibility of replacing normal procedures by a workaround.
The likelihood of failure and possible countermeasures can only be filled in by the business process
owners if they have specific experience with this process; in general, these will be completed at
component level in a later phase by the BC project team.
Columns: Component | Task of Component | Criticality (red = highly critical, yellow = critical, green = non-critical) | Likelihood of Failure (Rare, Occasional, Frequent) | Countermeasures
(The assignment of some cell values to rows is not preserved; entries are listed in their original order.)

Application Systems
CRM: RTO is 2h; Highly critical. Workaround: Orders can be entered directly in ERP; order entry volume will be 20% of normal order volume. Opportunity processing is not possible, will be critical after 4 hours, highly critical after 1 day.
ERP: Highly critical; Rare

Infrastructure
Telephone: Highly critical; Rare
Power: Highly critical
LAN: Highly critical
WAN: Highly critical
Order processing is not possible, will be critical after 4h

Business objects
Business partner
Products
Provide information for customer contact, provide information for delivery
Critical
Highly critical
Frequent
HA solution implemented
Order could be created without final prices, corrected prices could be provided later on
Critical
Sales order: consistency of old sales orders is irrelevant for creating a new one; Non-critical
Example:
Our example lists all processes as columns and all components as rows of the CFIA matrix. We
ordered the business processes with decreasing priority from left to right, according to the criticality
that was determined in section 3.2.2. Again, we include the main data objects in the matrix because
they are important for data consistency considerations and logical recovery on object level.
Each field of the matrix depicts the criticality of a component for a specific business process. The
criticality is noted according to the following schema:
Red:
A field is left blank if a failure of the component or an inconsistency of the object does not
impact the process in any way.
You could note the availability of a workaround, for example, by adding a W to the field of the matrix.
This would mean that the original criticality was reduced due to the availability of this workaround.
The criticality of the components being used by a business process cannot be higher than the
criticality of the process itself. Therefore, the maximum criticality assigned in a column can be that of
the business process.
The overall criticality of a component is determined by the maximum criticality given for this
component over all columns.
Looking at the rows of the matrix, you can identify the criticality of components and objects that are
frequently used in business processes. The matrix also shows which processes are most vulnerable to
component failures or object inconsistencies by considering the number of entries in the respective
column.
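The rules stated above (a component cannot be more critical than the process using it, the overall criticality of a component is the maximum over all columns, and the number of entries per column indicates how exposed a process is) can be sketched as follows. The matrix values are illustrative assumptions only; the class and function names are hypothetical and introduced purely for this sketch.

```python
from enum import IntEnum


class Criticality(IntEnum):
    LOW = 1               # "less critical" / "non-critical"
    CRITICAL = 2
    HIGHLY_CRITICAL = 3


def cap_by_process(matrix, process_criticality):
    """Limit each field to the criticality of the process in its column."""
    return {
        component: {
            process: min(level, process_criticality[process])
            for process, level in row.items()
        }
        for component, row in matrix.items()
    }


def overall_criticality(matrix):
    """Overall criticality of a component: maximum over all columns."""
    return {c: max(row.values()) for c, row in matrix.items() if row}


def entries_per_process(matrix, processes):
    """Number of non-blank fields per column of the matrix."""
    return {p: sum(1 for row in matrix.values() if p in row) for p in processes}


# Assumed example values; a missing key represents a blank field.
process_criticality = {
    "Sales order": Criticality.HIGHLY_CRITICAL,
    "Reporting": Criticality.LOW,
}
matrix = {
    "CRM": {"Sales order": Criticality.HIGHLY_CRITICAL,
            "Reporting": Criticality.CRITICAL},
    "Telephone": {"Sales order": Criticality.CRITICAL},
}

capped = cap_by_process(matrix, process_criticality)
overall = overall_criticality(capped)
counts = entries_per_process(capped, process_criticality)
```

In this sketch the CRM rating for the reporting process is reduced to the process criticality, while the overall criticality of CRM remains highly critical because of the sales order column.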
Columns: Component | Sales order process | Service process | Marketing process | Reporting process
Additional columns: Overall criticality of component, # of processes
Criticality of process: Highly critical (Sales order process), Highly critical (Service process), Critical (Marketing process), Less critical (Reporting process)
# of components / # of data objects (values as extracted): 6, 4, 5, 2, 4, 2, 4, 4
Rows (cell ratings not preserved):
  Application Systems: CRM, ERP
  Infrastructure: Telephone, Power, LAN, WAN
  Business objects: Business partner, Products, Pricing conditions, Sales order
In this step, roughly outline such alternative business processes. The details will be elaborated on in
the implementation stage (see chapter 4). For each alternative, describe the staffing, skills, services
and procedures that are needed to operate the workaround.
If no workarounds are available, business only has to provide the RTO after which operation of the
business process has to be fully recovered.
Example:
As an alternative process for the opportunity/sales order process, the ERP system might be used to
enter sales orders in the business process, without using the CRM system if it is unavailable due to a
system failure. The details of the implementation of the alternative process have to be elaborated in
stage 3 (see section 4.5). In this example, the availability of an alternative processing for the sales
order process relaxes the support requirement from 2 hours (as previously given in Table 4) to a full
day. Since this workaround only substitutes a failure of the CRM system, we need to distinguish
between components. The RTO may thus differ for a specific process depending on the component
that is unavailable.
Process                   Criticality       Workaround exists   RTO for minimum   RTO for full
                          (from 3.2.2)      for failure of      recovery          recovery
Sales Order Process       Highly critical   CRM                 2 hours           1 day
Sales Order Process       Highly critical   n/a                 n/a               4 hours
Customer Service Process  Highly critical   n/a                 n/a               6 hours
Marketing Process         Critical          n/a                 n/a               2 days
Reporting Process         Less critical     n/a                 n/a               1 week
This aspect can require an adaptation of the values determined solely on the RTO of the involved
business processes.
Besides RTO, the Recovery Point Objective (RPO) constitutes an important criterion for recovery of a
component. RPO defines the acceptable data loss during recovery of a component, for example, when
restoring from a backup or when switching to a standby site. In order to ensure data consistency in a
federated system landscape, an RPO of 0 is required. If the RPO is more than 0, technical recovery of
a component will leave the need to analyze and resolve remaining data inconsistencies before
business operations can continue. This means that the recovery time can be considerably longer (also
see 3.6.2).
The RPO must be determined by the business process owners, taking into account the impact of
possible data loss.
Since we again include business objects in the list of components (see table below), we need to
define what we understand as RTO and RPO of a business object. If for example business objects of
type Products became corrupted due to some software bug, the RTO for Products would define how
long it may take until objects of type Products are available again for use by the core business
processes. In addition, the RPO of a business object would define the amount of tolerable data loss in
case of a contingency. The minimum RPO of all business objects maintained by a component should
equal the overall RPO of that component.
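The rule that a component inherits the strictest RPO of the business objects it maintains can be sketched in Python as follows. The RPO values are example values in hours, and the assignment of objects to a component is an assumption made for this sketch.

```python
# RPO per business object in hours (example values; 0.0 stands for "~0").
object_rpo_hours = {
    "Business partner": 1.0,
    "Products": 1.0,
    "Pricing conditions": 1.0,
    "Sales order": 0.0,
}

def component_rpo(objects_maintained, rpo_by_object):
    """Return the strictest (smallest) RPO of all objects a component maintains."""
    return min(rpo_by_object[name] for name in objects_maintained)

# Assumed list of objects maintained by one component.
erp_objects = ["Products", "Pricing conditions", "Sales order"]
erp_rpo = component_rpo(erp_objects, object_rpo_hours)
```

A single object with an RPO of 0 is enough to force an RPO of 0 on the whole component.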
Note: When determining RTO and RPO in this section, it is important to note that the costs to achieve
them are not yet considered. So, when discussing the (technical) solutions to achieve these
goals in sections 3.4 and 3.5, the results obtained in this phase may need to be revised later
on. If for example the costs of the solutions become prohibitive, the business may decide to
review the service level agreements based on a cost/benefit analysis.
Example:
The following table lists the criticality and RTO/RPO for our example components. These result from
the RTO requirements of the business processes. The RPO of an application component can be
determined by assessing the RPO required for all business objects maintained by this component (this
information is contained, for example, in the documentation created in section 3.1.4).
For the sales order process, the required RTO for CRM relaxes to 1 day due to the workaround
described in 3.3.1. Since the workaround relies on the availability of the ERP system, the RTO for
ERP remains at 2 hours.
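The derivation of these two values can be retraced with a short Python sketch. It assumes that the required RTO of a component is the strictest RTO imposed by any process that depends on it, and that a workaround replaces the process RTO for the failure of one specific component. The dependency sets are reduced to CRM and ERP and are partly assumptions.

```python
DAY = 24  # hours

# Required RTO per business process in hours (values as used in the example text).
process_rto = {"Sales order process": 2, "Service process": 6}

# Components each process depends on (reduced, partly assumed).
uses = {
    "Sales order process": {"CRM", "ERP"},
    "Service process": {"CRM"},
}

# A workaround relaxes the RTO of a process for the failure of one component.
workaround_rto = {("Sales order process", "CRM"): 1 * DAY}

def required_component_rto(component):
    """Strictest RTO that any dependent process imposes on the component."""
    candidates = [
        workaround_rto.get((process, component), rto)
        for process, rto in process_rto.items()
        if component in uses[process]
    ]
    return min(candidates)

crm_rto = required_component_rto("CRM")
erp_rto = required_component_rto("ERP")
```

With these inputs, CRM is driven by the service process (6 hours) because the sales order process tolerates a day without CRM, while ERP stays at 2 hours.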
Since Products are required for the most important Sales Order Process, their RTO is determined by
the RTO of this business process (2 hours). This underlines the importance of the business object
Product and shows that ways of re-establishing the consistency of Products should be considered in
advance, in case they are corrupted, for example, by an incomplete recovery.
As can be seen from the data flow charts in section 3.1.4, Products are held in ERP and in CRM. So,
in case of corruptions or inconsistencies in one system, it might be possible to recover Products from
the other still operative system (although due to pending (queued) objects that were not yet
transferred between the systems, some objects might not be recovered completely this way).
Therefore, in a business continuity strategy it is important to ensure that the replication of Products
and other objects between systems is working properly to have the current data available in one of the
systems for recovery in case of a contingency in the other system.
The columns 'Current RTO' and 'Current RPO' can be used to contrast the required RTO/RPO with
the RTO/RPO currently committed by IT. If the committed values are higher than the requirements, this
indicates either that new solutions are needed to achieve the requirements or that the requirements
identified so far need to be revised.
Component             Criticality (from 3.2.4)   Likelihood of failure   Required RTO   Required RPO   Current RTO   Current RPO
Application Systems
ERP                   Very high                  Low                     2 hours        ~0
CRM                   Very high                  Low                     6 hours *      ~0
Infrastructure
Telephone             Very high                  Low                     2 hours        n/a
Power                 Very high                  Medium                  2 hours        n/a
LAN                   Very high                  Low                                    n/a
WAN                   Very high                  High                                   n/a
Business objects
Business partner      Very high                                          6 hours        1 hour
Products              Very high                                          2 hours        1 hour
Pricing conditions    High                                               1 day          1 hour
Sales order           Low **                                             1 day          ~0

* determined by service process since criticality for sales order process was relaxed to 1 day in table 7
** only resulting from this example, which does not regard the usually very critical order to cash process
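The comparison between required and currently committed values can be sketched in Python as follows. The required values follow the example for ERP and CRM (with "~0" treated as 0); the current values are hypothetical, since the example does not state them.

```python
# Values in hours. "current" is an assumption made only to show the comparison.
required = {"ERP": {"rto": 2, "rpo": 0}, "CRM": {"rto": 6, "rpo": 0}}
current = {"ERP": {"rto": 4, "rpo": 0}, "CRM": {"rto": 6, "rpo": 1}}

def find_gaps(required, current):
    """List (component, measure) pairs where the commitment exceeds the requirement."""
    gaps = []
    for component, targets in required.items():
        for measure, limit in targets.items():
            if current[component][measure] > limit:
                gaps.append((component, measure))
    return gaps

gaps = find_gaps(required, current)
```

Each reported gap calls for either a new technical solution or a revision of the requirement with the business process owners.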
Risk Mitigation
Recovery
Costs of the risk mitigation measures (hardware, cluster solutions, and so on), including
maintenance and testing of the solution
and the
Costs of unavailability (including various aspects as described in section 3.2.2) plus costs of
recovery measures to re-establish operations
If the latter costs are lower, risk mitigation may not be appropriate and recovery options might be
sufficient. However, as a general rule to avoid business disruption, risk mitigation should be preferred
over the invocation of recovery mechanisms.
Recovery options are available on various levels, which differ primarily in the achievable speed of
recovery. Determining the appropriate solution again depends mainly on costs, by comparing the:
Costs of the respective recovery solution (standby arrangement, replication, backup and
restore, tape shipping, etc.)
and the
Costs of unavailability through longer recovery time (including various aspects as described in
section 3.2.2) plus costs of recovery to re-establish operations through simpler recovery
measures
The following diagram, which applies to the costs of risk mitigation measures as well as to the costs of
recovery measures, illustrates the dilemma in identifying adequate technical solutions. Technical
measures that reduce risks or shorten recovery become more expensive the higher the level of
protection they achieve, so it is necessary to balance the costs of these measures against the costs of
disruption and recovery.
Ideally, the chosen strategy meets the intersection of the two lines; the intersection marks the
maximum allowable outage time.
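The balance between the two cost lines can be made concrete with a small Python sketch. Both cost functions and all figures are hypothetical; real cost curves have to come from the business impact analysis and from vendor quotations. The sketch locates the intersection by bisection and assumes that the two curves cross exactly once in the searched interval.

```python
def cost_of_measures(outage_hours):
    """Hypothetical cost of protection: cheaper when longer outages are tolerated."""
    return 1_000_000 / outage_hours

def cost_of_outage(outage_hours):
    """Hypothetical cost of disruption and recovery: grows with outage duration."""
    return 10_000 * outage_hours

def intersection(low, high, tolerance=1e-6):
    """Bisection on the cost difference; assumes exactly one crossing in [low, high]."""
    while high - low > tolerance:
        mid = (low + high) / 2
        if cost_of_measures(mid) > cost_of_outage(mid):
            low = mid   # measures still cost more: crossing lies at longer outages
        else:
            high = mid
    return (low + high) / 2

max_outage_hours = intersection(0.1, 1000.0)
```

With these assumed curves, the two costs are equal at an outage time of 10 hours.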
Note: If the analysis shows that the solution chosen for business continuity cannot reach the service
levels previously determined in section 3.3, these need to be revised and adapted in a new round
with the business process owners.
In addition to risk mitigation and recovery options, monitoring constitutes a further building block
for business continuity. Monitoring has to ensure that any disturbances or anomalies are detected
as early as possible, so that countermeasures can be initiated before they escalate into major business
disruptions.
Use of multiple service providers or establishing an alternate service provider for critical
external services
RAID protection
Redundant middleware components (multiple load balancers, multiple web servers, and so
on)
Failover cluster solutions for database system and SAP central instance / SAP central
services
More information on HA solutions for SAP is available in the SAP Service Marketplace at
http://service.sap.com/ha.
The adequacy of technical solutions to eliminate SPOFs must be determined based on the costs of
the solutions, the likelihood of the failure and the impact of the failure / required service levels. As
discussed above, the costs for risk mitigation should (in general) not be significantly higher than the
costs induced by the risk.
Example:
In the previous sections, the components supporting the most critical sales order process have been
identified. These components are candidates for protection through HA solutions. Even though a
restricted workaround is available in case of an outage of CRM, investing in an HA solution can be
reasonable because a noticeable impact is already perceived after 4 hours due to the inability to
process opportunities. Therefore, the CRM and ERP database and central instance / central services
will be protected by a cluster solution and multiple application servers will be provided for CRM and
ERP for redundancy. The network connections should be redundant, a stand-by power supply should
be available and telephony needs to be secured. See Table 10 for an overview.
Do nothing
If the contingency does not have a business impact, it might be possible that doing nothing is an
option and no recovery is needed.
Example:
(Technical failure): A test system fails that was scheduled to be reinstalled next week. Operation
without the test system is acceptable as it will be available again in a week and no major tests have to
be done during this week.
(Logical failure): A new program corrupted data. The analysis shows that only historical data was
affected. Since all involved processing has already taken place, it is decided that a recovery of this
corrupted data will not be needed.
Manual correction
If the contingency is of a small scale but has some impact on the business process, a manual
correction may be sufficient: data is corrected manually, or a small report corrects inconsistencies in
business objects. This measure contrasts with a major disruption of a process, which needs a more
sophisticated recovery procedure.
Example:
(Technical failure): A system becomes unavailable for a short period of time while a data upload via a
file interface is in progress. Since the status of the upload is unknown, manual intervention
and a restart of the upload are required. Application management identifies the affected objects and
resends them to the system manually.
(Logical failure): End users created sales orders with incorrect tax classification. The impact is that
certain sales orders cannot be processed. After discussion with the end users, the application
management team identifies the sales orders and corrects the tax classification manually.
Gradual recovery
Gradual recovery or 'cold standby' is applicable if no immediate recovery of the business process is
needed and the organization can operate for up to 72 hours (according to ITIL), or longer, without re-establishment of the full business process on the respective system components. When considering
cold standby, the necessary hardware is either already provided at a disaster recovery site or it must
at least be ensured that the necessary hardware can be obtained in time to rebuild the systems.
Example:
(Technical failure): The storage hardware of the reporting system faces a problem. The system must
be restored. As the reporting process is not critical, no spare hardware is available. Due to an
agreement with the storage provider, replacement hardware will be delivered and installed within 2
days. Restore and database recovery of the system will be finished within another 2 days.
Intermediate Recovery
Intermediate recovery or 'warm standby' is necessary if you want to re-establish your business process
within 24 to 72 hours (according to ITIL). This involves at least having spare hardware available at a
remote site, either company-owned or provided as a recovery service. This may also include the
creation of a daily mirror of the production data at the remote site. To become operational, this mirror
only needs database forward recovery using the logs created since the mirror was taken.
Example:
(Technical failure): The marketing process of our examples above can be unavailable for more than 1
day without a severe impact on business but the service process should not be unavailable for more
than 2 days. To ensure recovery within 24 hours, the systems and database files of the affected
systems are mirrored to a remote site. To make the systems available at the remote site, manual
activities have to be performed for database recovery and system restart.
(Logical failure): A database table with business partners is dropped. The corresponding tablespace is
restored to alternate hardware; the lost data is exported from this analysis system and then
imported back into the production system. The whole procedure takes 24 hours until the business
partner object is accessible again.
Immediate Recovery
Immediate recovery or 'hot standby' provides for immediate restoration of services/processes and is
usually provided as an extension of intermediate recovery. The immediate recovery is
supported by the recovery of critical business functions and support areas during the first 24 hours
(according to ITIL) following a service disruption. Nowadays, however, recovery demands often lie in
the range of a few hours or less. For components that require immediate recovery, a loss
of service has an immediate impact on the organization's ability to make money, as with the sales
order process in our previous examples.
Example:
(Technical failure): The sales order process in the examples of this chapter has a severe impact on
business if the ERP system is unavailable for more than 4 hours. To ensure that system operation can
be recovered in less than four hours, a standby database at a remote site is continuously recovered
with logs from the production site. A switch to the standby database would be performed for example
in case of a severe storage system failure when the implemented high-availability solution is not
applicable. The standby database would also be activated if database block corruptions made the
primary database unusable.
When you have decided which recovery option you want to use for which business process disruption
scenario or component failure scenario, you need to detail the applicable recovery method for
recovering a system component or for recovering business objects.
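As a first orientation, the time ranges quoted above can be turned into a simple classification by required RTO. The following Python sketch is a simplification: the handling of the boundary values is an assumption, and the options 'Do nothing' and 'Manual correction' are not time-based and therefore not covered.

```python
def recovery_class(rto_hours):
    """Map a required RTO to a recovery class using the ITIL time ranges quoted above."""
    if rto_hours < 24:
        return "Immediate"
    if rto_hours <= 72:
        return "Intermediate"
    return "Gradual"

# Required RTO in hours for the example processes (2 hours, 6 hours, 2 days, 1 week).
example_rto_hours = {
    "Sales order process": 2,
    "Service process": 6,
    "Marketing process": 48,
    "Reporting process": 168,
}
classes = {name: recovery_class(rto) for name, rto in example_rto_hours.items()}
```

The result is only a starting point; costs and the availability of workarounds may justify a different class for a given process.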
also be caused by a switchover to a disaster recovery site that is not synchronously replicated and
cannot be subject to a complete database forward recovery.
When the technical recovery has finished, business recovery has to follow to identify inconsistencies
between this system and the other systems of the landscape. The overall recovery time until the
system and dependent business processes are fully operational is the sum of the technical
recovery time and the logical recovery time.
Scenario 3: Logical Failure is Corrected in the Affected System
A logical failure corrupts some business objects in system 1. The corrupted objects are identified and
repaired directly in this system. The affected business process is operational after this logical recovery
has finished.
Scenario 4: Logical Failure is Corrected Through Point-in-time Recovery of the Affected
System
A logical failure corrupts some business objects in system 1. To avoid the effort of repairing the corrupt
objects directly in the system, a database point-in-time recovery (incomplete recovery) is performed to
the point before that error occurred. After this technical recovery method is finished, the affected
system itself is in a consistent state. However, due to the data loss caused by this operation, data
consistency in relation to the other systems of the landscape is no longer maintained.
The overall recovery time is increased by the time it takes to repair these inconsistencies on
application level. The affected business process is only fully operational after this logical recovery has
finished. Additionally, many other business processes will be affected since their data also became
inconsistent. Data consistency for many other systems not shown in this figure may be affected as
well.
These scenarios show that the RTO of a business process can be determined by different phases of a
recovery until the business process is fully operational. The actual recovery time for a process is given
by the RTO of the involved components plus the recovery time to re-establish data consistency, if
required.
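The arithmetic behind this statement is shown in the following minimal Python sketch. All durations are hypothetical; in practice the time needed to re-establish data consistency is difficult to predict and should be validated in recovery tests.

```python
def total_recovery_hours(technical_hours, consistency_repair_hours=0.0):
    """Overall recovery time = technical recovery + logical (business) recovery."""
    return technical_hours + consistency_repair_hours

def meets_rto(total_hours, rto_hours):
    """True if the process is fully operational again within its RTO."""
    return total_hours <= rto_hours

# Hypothetical durations in hours.
complete_recovery = total_recovery_hours(4.0)             # no data loss, nothing to repair
point_in_time_recovery = total_recovery_hours(4.0, 20.0)  # inconsistencies must be resolved
```

With an RTO of 6 hours, the complete recovery in this sketch meets the target, while the point-in-time recovery with 20 hours of consistency repair does not.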
RPO
Gradual / Intermediate
0 **
12 - 72 hours
Gradual / Intermediate
48 - 168 hours
Standby database (asynchronous log shipping)
Intermediate / Immediate
1 - 8 hours
Intermediate / Immediate
4 - 24 hours
Asynchronous replication
Immediate
30 min* - 8 hours
0 - 5 minutes
Synchronous replication
Immediate
5 min* - 8 hours
0 ***
24 - 168 hours
10 min - 24 hours
4 - 24 hours
** complete recovery
Example:
The following table gives an example of possible risk mitigation and recovery options for the example
components depicted for our business processes:
Table 10: Example Recovery Options for System Components of Example Processes

Component               Recovery Class   Technical Recovery Method
                        Immediate        Synchronous replication to remote site, standby hardware readily available
Appl. Systems
ERP Database            Immediate
ERP Appserver           Immediate
CRM Database            Immediate        Synchronous replication to remote site, standby hardware readily available;
                                         daily full backup allowing restore within 12 hours (including log recovery for 1 day)
CRM Central instance    Immediate
CRM Appserver           Immediate
Infrastructure
Telephone               Immediate
Power                   Immediate
LAN                     Immediate        Redundant connections, redundant routers and switches
WAN                     Immediate        Alternative arrangement using satellite connection
Note: A point-in-time recovery of a database should not be used as a recovery option for logical errors
inside a single system, since this induces new data inconsistencies between the
systems of a landscape and with the real world.
Discuss the recovery options per business object and identify the recovery strategies for possible
partial or complete loss of business objects.
More information on consistency checks and consistency check tools provided by SAP can be found
in the Best Practice Data Consistency Monitoring within SAP Logistics that will be available at
http://service.sap.com/solutionmanagerbp.
Section 4.6.1 will provide a detailed example of how recovery options for a business object can be
worked out.
Note: To detect inconsistencies as early as possible, and before they result in a possible business
disruption, data consistency monitoring should be established (see 3.7).
Example:
The decision is made that only business objects required for the highly critical sales order process
shall be subject to detailed recovery planning. The basic strategy is outlined here while further details
will be worked out in the implementation stage.
Table 11: Typical Recovery Options for Business Objects of the Sales Order Process
Business object
Business partner
Product
Pricing conditions
Sales order
Paper-based
Example:
For the sales order process, it might be possible to implement the following alternative business
process on a smaller scale. It is possible to replicate opportunities with a customer program as special
quotations to the ERP system (it has to be considered if custom coding is worth the effort which
depends on the criticality of the business process). The call center agents could access the ERP
system, find the opportunities as quotations of a special transaction type and negotiate orders with
customers because products, pricing, business partner hierarchies and the configurator are also
available in the ERP system. It is not possible, however, to access the product catalog and use the
customer fact sheet. This gives less value to the customer, but orders could be negotiated on a
smaller scale. A requirement to establish this workaround would be to enable the replication of
opportunities to ERP and to evaluate and develop reports that enable a stand-by processing of
opportunities and orders without product catalog and fact sheet. Training of the call center
agents for the ERP environment also needs to be planned.
The following table summarizes the requirements for additional workarounds as determined in this
step.
Process                    Recovery class                Workaround required for   Description
Sales Order Process        Immediate                     Failure of CRM
Customer Service Process   Immediate                     n/a
Marketing Process          Intermediate                  n/a
Reporting Process          Gradual / Manual correction   n/a
Batch monitoring
Performance Monitoring
Consistency monitoring is supported by the Data Consistency Cockpit in SAP Solution Manager,
which raises alerts when an anomaly is detected.
Output
Implementation plan
Organizational structure supporting BCM
BC master plan
Crisis management and escalation
procedures that may invoke BC plan
Detailed recovery plans and procedures
Documentation of risk reduction measures
and standby arrangements
Test plan
Initial test of continuity concepts
During the implementation phase of a business continuity project, the solutions and recovery options
determined and agreed in stage 2 will be implemented and elaborated in detail. Besides the
implementation, the documentation of the solutions and procedures plays an equally important role.
Following the ITIL standard, the third stage of the project comprises the following phases:
Organization:
Establish the organization that is responsible for business continuity management
Implementation planning:
Develop implementation plans that describe the structure of the BC plan and assign work
packages
Develop procedures:
Lay down general and detailed procedures for different recovery tasks
Initial testing:
Create test plans and perform initial tests of the BC concept
The business continuity manager, who ensures that continuity plans are up-to-date and tested
A crisis team of senior managers from business (business process champion) and IT
(application management) that decide if disaster recovery plans need to be executed
The recovery team with representatives from business (key business user/business process
champion) and IT (application management/SAP technical operations/business process
operations). The recovery team should also include key users and end users who ensure a
minimum operation of the business process, for example by working on paper or using other
workarounds.
SLAs recording the availability agreements for involved departments and partners have to be defined.
The results of the analysis phase conducted during the BC project will be included in the respective
plans.
Appendix 7.2 lists the pieces of information that should be part of a BC plan.
Besides focusing on system components and technical solutions to eliminate SPOFs, the change
management process should also be subject to verification or revision, since this is the only chance to
avoid logical errors or data corruptions.
For intended business process workarounds, all details and requirements of the alternative process
need to be worked out in this stage. Coming back to the example of section 3.6.5, the workaround
described there requires the implementation of a customer specific report that replicates opportunities
to ERP (as quotations with a special type).
The documentation of standby arrangements should include:
For a business process workaround: What resources are required? What data is
needed, where does it come from?
Fallback requirements and procedures: what must be done to return to normal operation
after the original systems are available again?
For a technical solution like switchover to a DR site, describe how to switch back
operations to the primary site
For a business process workaround, describe how the data created by the
workaround will be incorporated back into the standard process and systems and
what follow-up activities are required to complete the missing parts that were not
covered by the workaround
Procedure:
(1) Stop business operation in CRM
(2) Perform point-in-time recovery of the database of the CRM system (do not start CRM SAP system
before next step)
(3) Isolate CRM system (disable communication with BW and ERP)
Stop outbound queue processing from ERP to CRM: Deregister outbound queues in ERP
(transaction SMQS) *
Disable outbound queue processing from CRM to ERP: Lock respective RFC-user in ERP
(transaction SU01)
Disable data requests from BW to ERP (for example, from transaction RSA1 in BW or from
process flow control, also from SM37)
Disable synchronous data transfers from CRM to BW by locking the corresponding RFC-user
in BW
Stop productive operation in CRM by locking all regular users (Transaction SU10)
For CRM Field scenarios only: Stop CRM replication & realignment queues (transaction
SMOHQUEUE)
Disable outbound queue processing from CRM to ERP: Deregister outbound queues in CRM
(transaction SMQS) *
Disable outbound queue processing from ERP to CRM: Lock respective RFC-user in CRM
(transaction SU01)
Unschedule the BDOC reorganization job MW_REORG in CRM to avoid deletion of BDOC
message store
* Deregistering outbound queues may not be sufficient in all cases. Some applications may register queues
automatically even if queues are deregistered. To be safe, RFC destinations can be disabled instead, for
example by pointing them to an invalid server (transaction SM59).
If the point to which the CRM system is set back lies only a very short period ahead of the
point of the crash
In such cases, the contents of these RFCs could be cross-checked against the ERP data to find
out whether the RFCs were processed or not.
Queue entries in ERP:
Pending RFC queue entries in the ERP system must not be deleted because they contain the most
current data objects, which need to be processed. They should be processed later, after business
recovery between R/3 and CRM is completed, because they may rely on CRM data that was
lost due to the incomplete recovery. Because the CRM system was isolated above, unintentional
processing of these queues is prevented.
New RFC queue entries in ERP, which are created during the period of business recovery,
must not be deleted.
CRM data that is created, for example, by re-entering lost data may cause double postings in
other systems. This has to be considered during business recovery.
In this example we will only show the case of business partners, but in a real plan recovery
procedures must be described for all business objects.
a. Business Partners:
Leading systems can be ERP and CRM, so we have to consider objects transferred in both
directions. The following figure displays the types of inconsistencies that may appear for business
partners after an incomplete recovery of CRM. As both systems are leading systems, our goal will
be to transfer all inconsistent objects from ERP to CRM because ERP has the newer version. This
will also recover objects that were created or changed in CRM.
The CRM Data Integrity Manager (DIMa) can be used to compare business partners at header
level (check for existence) or on detail level (check for different field content). Based on the
comparison result, you can either request a business partner from ERP into CRM again, or you
can send a business partner from CRM to ERP. Please see SAP note 647664 for more details on
the DIMa features for comparing ERP customer masters with CRM business partners.
This procedure will resolve all business partner inconsistencies between CRM and ERP except for
deletions (cases 3 and 6, see below). It will re-transfer to CRM all business partners created or
changed in ERP. In CRM, this will also recreate all business partners that were created or
changed in CRM and already replicated to ERP. Only business partners that were not yet
transferred to ERP cannot be recovered this way.
Note: Please pay attention to the fact that the data models of the ERP customer master and the
CRM business partner are different. This includes attributes that are available in ERP or
CRM exclusively. Even if one system is considered as master system, where a new partner
is preferably created, there can be subsequent updates in the other system to add further
attributes.
Example: A new customer master is created in ERP and automatically loaded to CRM.
Afterwards, CRM is used to add special marketing attributes to this customer, which cannot
be maintained in ERP.
Consequently, such exclusive attributes cannot be recovered from
another system.
For identifying inconsistencies due to deletions (Cases 3 and 6) the DIMa tool can be used as
well. We assume that there are no direct physical deletions of business partners. Instead, there is
first a logical deletion (setting the deletion flag) and in periodic intervals archiving runs do the
physical deletion. The deletion flag is replicated between ERP and CRM. Now, if the business
partner was already physically deleted in one system, DIMa would report it as missing in the other
system. This unwanted effect can be solved by using the deletion flag (field LOEVM) as additional
filter for the DIMa comparison.
DIMa does not allow restricting the comparison to a specific period of time; a restriction is only
possible on the creation date of a business partner.
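The logic of such a comparison, including the effect of the deletion flag as a filter, can be illustrated with a Python sketch. This is not the DIMa interface: the data structures, the field name NAME and all values are hypothetical, and the deletion flag is modeled as a field LOEVM with value "X" in both extracts for simplicity.

```python
# Hypothetical extracts: business partner number -> field values.
erp = {
    "1001": {"NAME": "Alpha Ltd", "LOEVM": ""},
    "1002": {"NAME": "Beta Inc", "LOEVM": ""},
    "1003": {"NAME": "Gamma AG", "LOEVM": "X"},  # flagged for deletion in ERP
}
crm = {
    "1001": {"NAME": "Alpha Ltd", "LOEVM": ""},
    "1002": {"NAME": "Beta Incorporated", "LOEVM": ""},
    "1004": {"NAME": "Delta SE", "LOEVM": ""},
    # "1003" is already physically deleted (archived) in CRM
}

def compare(erp, crm, ignore_deleted=True):
    """Header-level check (existence) followed by detail-level check (field content)."""
    def relevant(records):
        if not ignore_deleted:
            return records
        return {k: v for k, v in records.items() if v.get("LOEVM") != "X"}

    erp_rel, crm_rel = relevant(erp), relevant(crm)
    return {
        "missing_in_crm": sorted(erp_rel.keys() - crm_rel.keys()),
        "missing_in_erp": sorted(crm_rel.keys() - erp_rel.keys()),
        "different": sorted(
            k for k in erp_rel.keys() & crm_rel.keys() if erp_rel[k] != crm_rel[k]
        ),
    }

result = compare(erp, crm)
unfiltered = compare(erp, crm, ignore_deleted=False)
```

Without the filter, partner 1003 is reported as missing in CRM although its deletion was intended; with the filter it no longer appears in the result.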
Note: There can be cases where pending queue entries contain updates that were missing in
CRM. When performing a DIMa comparison, such temporary differences would be reported
as inconsistencies as well. It can make sense to have multiple iterations of DIMa
comparisons to consider data that was stored in queues but was processed later on
successfully.
Note: The time for a DIMa comparison can be quite long, comparable to a full initial download from
ERP to CRM. If you fear that there is a large number of missing or inconsistent objects, it
should be considered whether it makes sense to perform a real initial download instead. In
such a case, the pending ERP outbound queue entries can be omitted.
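The idea of running several comparison iterations to separate temporary differences from real inconsistencies can be sketched in Python as follows. The business partner numbers are hypothetical, and the inputs stand for the lists of objects reported by two consecutive comparison runs.

```python
def split_differences(first_run, later_run):
    """Separate differences that disappeared between two comparison runs."""
    first, later = set(first_run), set(later_run)
    return {
        "temporary": sorted(first - later),   # resolved once queues were processed
        "persistent": sorted(first & later),  # still inconsistent, need correction
        "new": sorted(later - first),         # appeared after the first run
    }

# Hypothetical business partner numbers reported by two comparison runs.
run_1 = ["1002", "1005", "1007"]
run_2 = ["1002", "1008"]
outcome = split_differences(run_1, run_2)
```

Only the persistent entries are candidates for correction; newly reported entries should be checked again in a further run.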
Possible alternative to using DIMa for business partners:
By means of change documents for business partner records in ERP, all business partners
that were modified since the recovery point in time can be selected in ERP and then
transferred to CRM. Business partner records that are created or modified in ERP after the
crash point in time must not be included because they will be transferred to CRM using regular
replication mechanisms (objects still in the ERP queues should also be excluded). Change
documents are activated in ERP for all business partner fields that are relevant for CRM.
1. In ERP, select all business partners changed between the recovery point in time and the
crash point in time.
VD03 Environment, Field changes Environment, Multiple display can be used to
display all business partners changed since that day (selection not possible by time)
Note: A special report may be needed for that purpose if this function is not sufficient.
2. Download all these business partners from ERP to CRM. To avoid duplicates, only
missing objects should be transferred.
Note: A special report is needed to automate the business partner download for the objects
identified above. This report can call the standard functionality (transaction R3AS4) to
synchronize missing changes and/or business partners. The report must pay attention to
objects still waiting in the queues; these must not be re-transferred.
In CRM, you can use the report BUSCHDOC to perform a mass evaluation of change
documents for several business partners. Therefore, you can identify all business partners
that have been changed within a given timeframe.
Specifically for business partners, there is also a special transaction called
CRMM_BUPA_SEND, to manually trigger the sending of CRM business partners to the
receiving systems like a connected ERP backend. With transaction CRMM_BUPA_MAP you
can even trigger a new request load from ERP to CRM for a single business partner.
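The selection logic of the change document approach can be sketched in Python as follows. The sketch does not read real change documents; timestamps, partner numbers and the set of queued objects are hypothetical. It selects partners changed between the recovery point in time and the crash point in time and excludes objects still waiting in the queues.

```python
from datetime import datetime

recovery_point = datetime(2008, 5, 5, 8, 0)   # CRM was set back to this time
crash_point = datetime(2008, 5, 5, 14, 30)    # time of the failure

# Hypothetical change documents: (business partner number, change timestamp).
change_documents = [
    ("1001", datetime(2008, 5, 5, 7, 45)),   # before the recovery point
    ("1002", datetime(2008, 5, 5, 9, 10)),
    ("1003", datetime(2008, 5, 5, 13, 55)),
    ("1004", datetime(2008, 5, 5, 15, 5)),   # after the crash: regular replication
    ("1002", datetime(2008, 5, 5, 11, 20)),  # second change to the same partner
]
still_in_queue = {"1003"}  # pending in ERP outbound queues, must not be resent

def partners_to_resend(documents, start, end, queued):
    """Partners changed in [start, end] that are not already waiting in a queue."""
    changed = {bp for bp, changed_at in documents if start <= changed_at <= end}
    return sorted(changed - queued)

to_resend = partners_to_resend(
    change_documents, recovery_point, crash_point, still_in_queue
)
```

Collecting the partner numbers in a set ensures that a partner with several changes in the period is transferred only once.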
b. Other Business Objects
For all other business objects that are exchanged between CRM and ERP, procedures to check
and re-establish data integrity are required as well. For the common exchange objects between
ERP and CRM, corresponding DIMa objects are available. Please see transaction SDIMA_BASIC
for a list of DIMa objects with their offered repair options.
Check if transports were imported into the CRM system during the period that was lost due to
the incomplete recovery
(8) Checks
Execute functional checks of CRM business processes
(9) Restart productive operation in CRM
Now that data consistency has been re-established and all functional checks were successful,
the CRM system can return to productive operation. Communication with other systems
can be enabled and users can be released to work with the usual business processes in CRM.
Note: Temporary differences can arise, for example, if a queue used for data exchange is
stopped. Since the data is not transferred, the end user perceives this as a data inconsistency.
However, it is not a real inconsistency requiring corrective tools, because the situation can be
easily resolved by processing the pending messages.
4. Analyze if it is a technical or logical inconsistency
Note: A technical inconsistency is anything that can be found at database level and
needs appropriate correction in the underlying database, while a logical inconsistency is a
persistent mismatch that is due to a misunderstanding of the process or a misinterpretation of
the data. While technical inconsistencies can be identified by technical means such as check reports,
logical inconsistencies need to be identified by mapping the intended business process to the
underlying data structures. They cannot be identified by technical means because the underlying
data is consistent on a one-to-one level.
5. Decide if productive use can continue or needs to be interrupted
6. Identify root cause (programming error, non-transactional interface, incorrect error handling,
incorrect data entry, no clear leading system, ...)
7. Correct root cause
8. Analyze if dependent data is affected (in the same system, in other systems or in the real
world)
9. Identify inconsistencies, filter out temporary differences
10. Correct inconsistencies
11. Correct dependent data
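Filtering out temporary differences can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function `split_differences` and the ID lists are hypothetical, and the sketch assumes that the IDs of objects with pending queue entries are already known. A difference is treated as temporary when the object still has unprocessed messages in a queue; all remaining differences are candidates for real inconsistencies that need further analysis.

```python
def split_differences(differing_ids, pending_queue_ids):
    """Split differing object IDs into temporary differences and candidates.

    temporary:  the object still has pending queue entries, so the
                difference should disappear once the queue is processed.
    candidates: no pending entries; these may be real inconsistencies.
    """
    pending = set(pending_queue_ids)
    temporary = sorted(i for i in set(differing_ids) if i in pending)
    candidates = sorted(i for i in set(differing_ids) if i not in pending)
    return temporary, candidates


# Illustrative data only; the object IDs are invented.
temporary, candidates = split_differences(
    differing_ids=["ORD-7", "BP-2", "BP-5"],
    pending_queue_ids=["BP-5", "BP-8"],
)
print("temporary:", temporary)
print("to analyze:", candidates)
```

Only the second list needs to enter the correction steps; the first list should be re-checked after the pending messages have been processed.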
A tool is needed to compare business objects between ERP and CRM. The comparison must
identify completely missing objects and objects that have different content due to missing
updates. For common business objects exchanged between ERP and CRM, the Data Integrity
Manager (DIMa) can be used.
The IT support departments must implement any special reports that are needed during disaster recovery.
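The two kinds of findings named above, completely missing objects and objects with different content, can be illustrated with a small comparison sketch. It is written in Python for illustration only and does not represent how DIMa works: the function `compare_objects`, the dictionary layout (object ID mapped to field values) and the sample data are assumptions.

```python
def compare_objects(erp, crm):
    """Compare two extracts given as {object_id: {field: value}}.

    Returns (missing_in_crm, missing_in_erp, different), where
    `different` maps each object ID present in both systems to the
    sorted list of field names whose values do not match. A field that
    is absent on one side is compared as None.
    """
    missing_in_crm = sorted(erp.keys() - crm.keys())
    missing_in_erp = sorted(crm.keys() - erp.keys())
    different = {}
    for obj_id in erp.keys() & crm.keys():
        erp_fields, crm_fields = erp[obj_id], crm[obj_id]
        mismatched = sorted(
            field
            for field in erp_fields.keys() | crm_fields.keys()
            if erp_fields.get(field) != crm_fields.get(field)
        )
        if mismatched:
            different[obj_id] = mismatched
    return missing_in_crm, missing_in_erp, different


# Illustrative data only; IDs and field values are invented.
erp_extract = {
    "BP1": {"name": "Alpha", "city": "Walldorf"},
    "BP2": {"name": "Beta", "city": "Berlin"},
    "BP3": {"name": "Gamma", "city": "Hamburg"},
}
crm_extract = {
    "BP1": {"name": "Alpha", "city": "Walldorf"},
    "BP2": {"name": "Beta", "city": "Munich"},
    "BP9": {"name": "Omega", "city": "Bonn"},
}
missing_in_crm, missing_in_erp, different = compare_objects(erp_extract, crm_extract)
print(missing_in_crm, missing_in_erp, different)
```

Reporting the mismatching field names, rather than only the object IDs, makes it easier to decide which system holds the correct version of the data.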
Output
Change control procedure for BC plan
Training plan
Test schedule
Monitoring schedule
After the previous phases have been carried out, it is important to embed the continuity plan in
day-to-day business operations. It is very important to verify that the continuity plan is kept
up-to-date after changes to business processes or IT infrastructure. This section describes the
main tasks that are important from an operational point of view.
Business processes
IT infrastructure
For maintenance of the continuity plan, it is essential to establish a change control procedure and
clear responsibilities.
It is good practice to include business continuity as a topic in every implementation project that
changes business processes or IT processes. This way, possible changes to the continuity plan can
be identified and implemented more easily and in a more timely manner.
6 Conclusion
Some people say that, for Business Continuity Management,
"The planning process is more important than the plan itself."
The truth behind this statement lies in the fact that the lessons learned and the experiences gained
during the planning process are an important side-effect of a business continuity project. The process
will often reveal a multitude of findings and insights into weaknesses of current concepts and
procedures. Addressing previously unknown issues and risks can already yield a higher availability.
But of course, the business continuity plan itself is vital for establishing continuity procedures and
spreading awareness throughout an organization.
To ensure that a business continuity concept is successfully implemented and practiced by an
organization, awareness and commitment must be created at the senior management
level. The continuity project needs adequate priority and sufficient sponsorship in order to
gain the acceptance and commitment of managers and staff.
This document outlined the initiation, the business impact analysis, the determination of recovery
options and the creation of a detailed recovery plan, including initial testing and the tasks necessary in
operational management to make the business continuity effort a part of daily operation.
A last word on operational management to ensure the effectiveness of disaster recovery plans:
A business continuity plan is only effective if it is kept up-to-date, which requires staff and
management to be continually committed to the continuity effort within the organization.
Everybody has to be aware of their responsibility for business continuity, so that triggering the
required changes to the continuity plan becomes a matter of course whenever a change to processes or IT
infrastructure is implemented. To maintain the quality of the continuity plans, management has to
control and monitor activities in the area of continuity management.
7 Appendix
7.1 Template for scoping presentation
Presenter: Business management/IT management
Audience: Senior management
Structure:
Introduction
Case studies:
Disaster Case 1
Impact of disaster case 1
Resulting loss without recovery plan
Resulting loss with recovery plan
Disaster Case 2
Impact of disaster case 2
Resulting loss without recovery plan
Resulting loss with recovery plan
Costs and Resources involved in a business continuity effort
Costs of loss without recovery plans against
Costs with recovery plan + recovery project costs
Conclusion
Business processes
Required SLAs
DR team,
decision process,
damage assessment
Prerequisites
Checks
Testing
Microsoft, WINDOWS, NT, EXCEL, Word, PowerPoint and SQL Server are registered trademarks of
Microsoft Corporation.
IBM, DB2, OS/2, DB2/6000, Parallel Sysplex, MVS/ESA, RS/6000, AIX, S/390, AS/400, and OS/390 are
registered trademarks of IBM Corporation.
INFORMIX-OnLine for SAP and Informix Dynamic Server are registered trademarks of Informix Software Incorporated.
UNIX, X/Open, OSF/1, and Motif are registered trademarks of the Open Group.
HTML, DHTML, XML, XHTML are trademarks or registered trademarks of W3C, World Wide Web Consortium, Massachusetts
Institute of Technology.
JAVA is a registered trademark of Sun Microsystems, Inc. JAVASCRIPT is a registered trademark of Sun Microsystems, Inc., used
under license for technology invented and implemented by Netscape.
SAP, SAP Logo, R/2, RIVA, R/3, ABAP, SAP ArchiveLink, SAP Business Workflow, WebFlow, SAP EarlyWatch, BAPI, SAPPHIRE,
Management Cockpit, mySAP.com Logo and mySAP.com are trademarks or registered trademarks of SAP AG in Germany and in
several other countries all over the world. All other products mentioned are trademarks or registered trademarks of their respective
companies.
Disclaimer: SAP AG assumes no responsibility for errors or omissions in these materials. These materials are provided as is
without a warranty of any kind, either express or implied, including but not limited to, the implied warranties of merchantability,
fitness for a particular purpose, or non-infringement.
SAP shall not be liable for damages of any kind including without limitation direct, special, indirect, or consequential damages that
may result from the use of these materials. SAP does not warrant the accuracy or completeness of the information, text, graphics,
links or other items contained within these materials. SAP has no control over the information that you may access through the use
of hot links contained in these materials and does not endorse your use of third party Web pages nor provide any warranty
whatsoever relating to third party Web pages.