
SLA, Disaster Recovery, BCP and Risk Assessment
Session 14
Service Level Agreement (SLA)
• Describes the level of service expected by a customer from a
supplier

• Contract between a service provider and its internal or
external customers that documents what services the provider
will furnish

• Vendor and the customer can use the SLA to agree on the
methods of rendering a guaranteed service

• Lays out the metrics by which that service is measured, and
the remedies or penalties, if any, should the agreed-upon
service levels not be achieved
• The SLAs define the attributes (such as content, scope, and means) for the
service products (such as maintenance or hotline) agreed on with the
customer in the service contract

• SLAs are between companies and external suppliers, but they may also be
between two departments within a company

• May include a plan for addressing downtime and documentation for how
the service provider will compensate customers in the event of a contract
breach.

• SLAs, once established, should be periodically reviewed and updated to
reflect changes in technology and the impact of any new regulatory
directives

• Product support: bug fixing, new versions, releases


Metrics of the SLA
• Response Time
– How long it takes to respond to the customer's need, e.g. call back within a
specified time, or technician on site within a specified time

• Service Window or Availability Time
– Working hours of service or support center

• Defect rates
– Counts or percentages of errors in major deliverables

• Downtime
– Maximum number of allowed breakdowns per year
• Availability
– Assured percentage of system availability

• Solution Time
– Maximum period of time allowed for the solution of the problem

• Technical quality
– Measurement of technical quality by commercial analysis tools that examine
factors such as program size and coding defects.

• Security
– Measuring controllable security measures such as anti-virus updates and patching
is key in proving all reasonable preventive measures were taken, in the event of an
incident.
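The availability and downtime metrics above are two views of the same quantity. A minimal sketch of the conversion follows, assuming a 24x7 service window over a non-leap year; the function names and the 99.9% figure are illustrative and not taken from the slides.

```python
# Assumption: the service window is 24x7 and the year is not a leap year.
HOURS_PER_YEAR = 365 * 24


def allowed_downtime_hours(availability_pct: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    """Hours of downtime permitted in the period for an assured availability."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    return period_hours * (1 - availability_pct / 100)


def measured_availability_pct(downtime_hours: float,
                              period_hours: float = HOURS_PER_YEAR) -> float:
    """Availability actually achieved, given the observed downtime."""
    if not 0 <= downtime_hours <= period_hours:
        raise ValueError("downtime_hours must lie within the period")
    return 100 * (period_hours - downtime_hours) / period_hours


if __name__ == "__main__":
    # Illustrative target only: 99.9% assured availability.
    print(f"{allowed_downtime_hours(99.9):.2f} hours of downtime allowed per year")
```

If the SLA defines a narrower service window (for example office hours only), `period_hours` would be the number of hours in that window rather than the full year.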
Definition | Reaction Time | Resolution Time | Standard Service Level (Yearly Target)
Any business, technical, or facility outage where service must be restored with urgency or deadlines will be missed | 1 hour (irrespective of office hours) | 8 hours | 97%
Complete non-availability or non-usability of an application | 1 hour (irrespective of office hours) | 8 hours | 97%
Employees can continue to perform workaround or delay work with some impact on internal and external customers | 2 hours | 24 hours | 97%
Employees can continue to perform workaround or delay work with no immediate impact on internal and external customers | 4 hours desk support time | within 3 working days or on agreed due date | 95%
No significant impact on routine operations | 8 hours desk support times | within 1 month | 95%
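The targets in the table above can be checked mechanically against ticket data. The sketch below is one possible encoding, not a prescribed method: the priority labels P1 to P5 are hypothetical names for the five rows, "3 working days" and "1 month" are simplified to 72 and 720 elapsed hours, and the yearly target is read as the share of tickets that must meet both time limits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlaTarget:
    reaction_hours: float
    resolution_hours: float
    yearly_target_pct: float


# Hypothetical labels P1..P5, one per table row, in the table's order.
# Assumption: 3 working days ~ 72 h and 1 month ~ 720 h of elapsed time.
TARGETS = {
    "P1": SlaTarget(1, 8, 97),
    "P2": SlaTarget(1, 8, 97),
    "P3": SlaTarget(2, 24, 97),
    "P4": SlaTarget(4, 3 * 24, 95),
    "P5": SlaTarget(8, 30 * 24, 95),
}


def ticket_within_sla(priority: str, reaction_hours: float,
                      resolution_hours: float) -> bool:
    """True if one ticket met both its reaction and resolution targets."""
    target = TARGETS[priority]
    return (reaction_hours <= target.reaction_hours
            and resolution_hours <= target.resolution_hours)


def yearly_target_met(priority: str,
                      tickets: list[tuple[float, float]]) -> bool:
    """tickets holds (reaction_hours, resolution_hours) pairs for one year."""
    if not tickets:
        return True  # nothing was raised, so nothing was breached
    met = sum(ticket_within_sla(priority, r, s) for r, s in tickets)
    return 100 * met / len(tickets) >= TARGETS[priority].yearly_target_pct
```

A real contract would also need to state how "desk support time" is counted; this sketch treats every duration as plain elapsed hours.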
Disaster Recovery Plan
• A disaster is a sudden, unplanned, calamitous event that leaves an organisation
unable to provide its critical business functions for some predetermined
period of time, and which results in great damage or loss.
• A disaster recovery plan (DRP) is a documented process or set of procedures to
recover and protect a business's IT infrastructure in the event of a disaster.
• Disaster could be natural or man-made

Natural Disaster
• Difficult to prevent
• Losses can be avoided with proper precautions
• E.g. earthquakes, fire, flood, hurricane, etc.

Man-made Disaster
• Major reasons for failure
• Intentional / Unintentional
• Cause loss of communication and utility
• E.g. accidents, virus, burglars, intrusion

• The objective of a disaster recovery plan is to minimize downtime and data loss
• Minimize the disruption of operations and ensure that some level of organizational
stability and an orderly recovery will prevail after the disaster
Impact of Disaster
The primary objective is to protect the organization in the event that all or part of
its operations and/or computer services are rendered unusable.
• Tangible costs
– Lost data, Lost assets, Lost revenue, Lost Wages, Lost Inventory, Marketing
Costs, Bank Fees/Penalties, Legal Costs, Recovery costs
• Intangible costs
– Lost Opportunity, Employee Retention, Loss in Share Value, Goodwill, Brand
Image, Diminished service reputation, Loss in confidence from partners &
customers
Recovery Point Objective (RPO):
Point in time to which application data must be recovered to resume business
transactions
Recovery Time Objective (RTO):
Maximum elapsed time required to complete recovery of application data
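RPO and RTO can both be expressed as time windows around the moment of the disaster. The sketch below assumes the common reading in which RPO bounds the age of the last recoverable backup and RTO bounds the time until service is restored; the function names, timestamps, and objectives are illustrative only.

```python
from datetime import datetime, timedelta


def rpo_met(last_backup: datetime, disaster_time: datetime,
            rpo: timedelta) -> bool:
    """Data written after last_backup is lost; that window must not exceed the RPO."""
    return disaster_time - last_backup <= rpo


def rto_met(disaster_time: datetime, service_restored: datetime,
            rto: timedelta) -> bool:
    """The outage lasts from the disaster until recovery completes."""
    return service_restored - disaster_time <= rto


if __name__ == "__main__":
    # Illustrative values: nightly backup at 02:00, disaster at 09:30,
    # service restored at 15:00 the same day.
    backup = datetime(2024, 3, 1, 2, 0)
    disaster = datetime(2024, 3, 1, 9, 30)
    restored = datetime(2024, 3, 1, 15, 0)
    print("RPO of 4 h met:", rpo_met(backup, disaster, timedelta(hours=4)))
    print("RTO of 8 h met:", rto_met(disaster, restored, timedelta(hours=8)))
```

In this example 7.5 hours of data would be lost, which breaches a 4-hour RPO, while the 5.5-hour outage stays within an 8-hour RTO. Tightening the RPO generally means backing up or replicating more often.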
Disaster Recovery
1. Set up an emergency response plan
2. Write out each step of the plan
3. Compile a list of important phone numbers
4. Decide on a communication strategy
5. Consider the things you may need
6. Human resources
7. Physical resources
Disaster Recovery (DRP) & Business Continuity Planning (BCP)
• Processes and policies related to preparing for continuation after
a disaster
• DRP is a subset of a larger process known as BCP
• DRP: resumption of applications, data, hardware, communications, and
other IT infrastructure
• BCP: also plans for non-IT aspects such as key personnel,
facilities, crisis communication & reputation protection
Benefits of a DRP
1. Providing a sense of security
2. Minimizing risk of delays
3. Guaranteeing the reliability of standby systems
4. Providing a standard for testing the plan
5. Minimizing decision-making during a disaster
6. Reducing potential legal liabilities
7. Reducing unnecessary stress in the work environment


Planning a DRP
• Obtaining top management commitment
• Establishing a planning committee
• Performing a risk assessment
• Establishing priorities for processing and operations
• Determining recovery strategies
• Collecting data
• Organizing and documenting a written plan
• Developing testing criteria and procedures
• Testing the plan
• Obtaining plan approval
Business Continuity Plan
• Business continuity planning involves developing a practical plan for how your
business can prepare for, and continue to operate after, an incident or crisis
• A business continuity plan will help you to:
– Identify and prevent risks where possible
– Prepare for risks that you can't control
– Respond and recover if an incident or crisis occurs
• The size and complexity of your business continuity plan will depend on your
business
• It will typically include the following sections:
– an introduction, with distribution list, executive summary, objectives and
glossary
– a risk management plan with business impact analysis
– an incident response plan, with plan activation, incident response team,
communications and contact list
– a recovery plan
– a test, evaluate and update schedule
Enterprise Risk Management (ERM)
• Risk management is the identification, assessment, and prioritization
of risks followed by coordinated and economical application of resources
to minimize, monitor, and control the probability and/or impact of
unfortunate events or to maximize the realization of opportunities

• Risk management's objective is to ensure that uncertainty does not deflect the
endeavor from the business goals

• Risk management is a defined set of coordinated activities to direct and
control an organization with regard to risk.

• Risk management allows an organization to identify risk mitigation
strategies so the organization can achieve its goals
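The identify, assess, prioritize sequence described above is often recorded in a risk register. The sketch below is a minimal version, assuming the widely used convention of scoring each risk as likelihood times impact on 1 to 5 scales; that scoring rule, the class and function names, and the example ratings are illustrative assumptions and not part of the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    name: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int      # assumed scale: 1 (negligible) to 5 (severe)

    def __post_init__(self) -> None:
        for value in (self.likelihood, self.impact):
            if not 1 <= value <= 5:
                raise ValueError("likelihood and impact must be in 1..5")

    @property
    def score(self) -> int:
        """Assumed scoring rule: likelihood multiplied by impact."""
        return self.likelihood * self.impact


def prioritize(risks: list[Risk]) -> list[Risk]:
    """Highest score first, so mitigation resources go to the top entries."""
    return sorted(risks, key=lambda risk: risk.score, reverse=True)


if __name__ == "__main__":
    # Illustrative ratings only.
    register = [
        Risk("Office flooding", likelihood=2, impact=5),
        Risk("Virus infection", likelihood=4, impact=3),
        Risk("Burglary", likelihood=1, impact=3),
    ]
    for risk in prioritize(register):
        print(f"{risk.score:>2}  {risk.name}")
```

Organizations that weigh impact more heavily than likelihood, or that use monetary loss estimates, would replace the `score` property with their own rule.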
Business Continuity Plan
General Steps to Follow while Creating a BCP/DRP
Step 1
• Identify the scope boundaries of the BCP
• Limitations & boundaries
• Audit and risk analysis reports for assets
Step 2
• Conduct a Business Impact Analysis (BIA)
• Study financial losses arising from the unavailability of
important business resources
Step 3
• Sell the concept of BCP to upper management
• Important, as each person must understand their role
in order to participate
Step 4
• Each department has to be ready in case of disaster
• To recover & protect the data; understand the plan
Step 5
• BCP project team must implement the plan
• Implementation team must follow guidelines and procedures
Step 6
• NIST tools can be used with BCP
• NIST: National Institute of Standards & Technology
Class Assignment
• Online Taxes Ltd. is a company that helps people file their taxes online
for a small fee. Much of the data for filling in the online forms
comes from individual records, previous tax documents and other
personal details. They also use the government's online tax filing
portal to import the previous year's data for customers. They use an
enterprise system to manage customer records. They have also
enabled a mobile app for their representatives to access this data
on the go so that they can consult with their clients at their workplace or
home. Recently, unseasonal rain caused flooding. This brought their
operations to a halt as the offices and systems shut down, which
impacted the organization greatly. In addition, systems often go down
during the tax season due to heavy load on the servers. To make sure
this does not happen again, you have been put in charge of the team
responsible for disaster recovery and business continuity planning.
Questions
• Q1. What are some of the most
important risks that you would identify, and
how would you mitigate them?
• Q2. What would be the key points of the disaster
recovery plan you formulate?
• Q3. Outline the key elements of your BCP.
Class Assignment 2
• Neptune Ltd. owns many cinema halls across the
country. The cinema halls are also used to host
events such as prominent cricket matches, movie
festivals, etc. The company recently introduced a
venue booking and ticket reservation system. The
system allows individuals as well as corporates to
book venues/seats for events. Tickets cannot be
booked offline and all payments are processed
online. It is a paperless system. It also integrates
with the CRM system so that Neptune can track its loyal
customers. Neptune wants a vendor to maintain
and update the system for them.
Questions
• Explain the steps of vendor selection for
Neptune.

• What would be some of the key metrics that
would be specified in Neptune's SLA with the
technology vendor, given the case context?
