Software-as-a-Service and
Cloud Services SLAs
2013
Contents
Introduction
What is an SLA?
What is Availability?
Understanding S-a-a-S and Cloud Services SLAs
Appendix A: Contractual Walkthrough
Appendix B: SLA Chart
  Uptime SLA Detailed Chart
  Uptime SLA Summary Chart
About Intreis
Introduction
This document covers common definitions related to Service Level Agreements, the top
ten questions to ask yourself when reviewing S-a-a-S and Cloud SLAs, and an actual
contract and SLA walk-through using the Amazon Web Services contract as an example.
Also included in this document are two SLA reference charts for your convenience,
located in Appendix B.
What is an SLA?
A Service Level Agreement, abbreviated SLA, is a contract between a provider and the end
user which stipulates and commits the provider to a required level of service.
An SLA should contain:
What is Availability?
Availability is the time during which a device, such as a computer, or a service, such as
a web server, is functioning or available for use. Availability depends on many factors,
including software stability, hardware load, and infrastructure reliability. The table in
Appendix B illustrates some common availability metrics. For example, a "five-9s"
service is available 99.999% of the time, which is to say that the system or service is
down for only about 5.26 minutes in a year. By comparison, a service rated at 99.000%
may be down for a total of 87.6 hours in a year, which is an average of about 1 hour and
40 minutes per week.
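The arithmetic behind these figures is easy to reproduce. The following is a minimal sketch, assuming a 365-day year; the function name is my own and is not taken from any provider's tooling.

```python
# Illustrative sketch: convert an availability percentage into permitted downtime.
# Assumption: a 365-day year (525,600 minutes). The function name is hypothetical.

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600


def annual_downtime_minutes(availability_percent: float) -> float:
    """Return the minutes per year a service may be down at this availability."""
    return (100.0 - availability_percent) / 100.0 * MINUTES_PER_YEAR


if __name__ == "__main__":
    for pct in (99.999, 99.0):
        minutes = annual_downtime_minutes(pct)
        print(f"{pct}% -> {minutes:.2f} min/year, "
              f"{minutes / 60:.2f} h/year, {minutes / 60 / 52:.2f} h/week")
```

Running it for 99.999% and 99% gives roughly 5.26 minutes and 87.6 hours per year respectively, which matches the figures quoted above.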
should contain assigned severity and priority levels for each specific type of service
request.
8. Force Majeure
Force majeure: a common clause in contracts that essentially frees both parties
from liability or obligation when an extraordinary event or circumstance beyond
the control of the parties, such as a war, strike, riot, crime, or an event described
by the legal term "act of God" (such as flooding, earthquake, or volcanic eruption),
prevents one or both parties from fulfilling their obligations under the
contract. However, force majeure is not intended to excuse negligence or
other malfeasance of a party, as where non-performance is caused by the usual
and natural consequences of external forces (for example, predicted rain stops an
outdoor event), or where the intervening circumstances are specifically
contemplated. For example, a widespread power outage would not be a force
majeure excuse if the contract requires the provision of backup power or other
contingency plans for continuity. (Source: http://en.wikipedia.org/wiki/Force_majeure)
Read every word in the Force Majeure clause. Then think about what it means to
you as a consumer of the SaaS/Cloud product. Force Majeure is where providers
will make a last-ditch effort to abdicate all responsibility. I have actually
seen human error and hardware failure listed under force majeure.
In most cases the answer will be no. Many providers will tell you they don't
negotiate SLAs, and in the case of click-through agreements there seems to be no
opportunity to negotiate at all. But negotiation is always an option! The contract is
your first and best chance to put the relationship with your provider on the right
footing.
10.
For the purposes of the calculation I will be using Amazon's Large Standard On-Demand
Instance, used for 750 hours on average per month. Total monthly charge = $360.
Service Commitment
AWS will use commercially reasonable efforts to make Amazon EC2 available with
an Annual Uptime Percentage (defined below) of at least 99.95% during the Service
Year. In the event Amazon EC2 does not meet the Annual Uptime Percentage
commitment, you will be eligible to receive a Service Credit as described below.
1. Personally, I hate the phrase "commercially reasonable efforts"; it's like the
ultimate "get out of jail free card." If the vendor can show they did the best they
could under the circumstances, you may not get a credit.
2. "Annual": this is the word that will really haunt business users. Most SLAs are
calculated monthly, but here it is 99.95% over a year, which allows an outage
measured in hours as opposed to minutes.
Definitions:
"Service Year" is the preceding 365 days from the date of an SLA claim.
"Annual Uptime Percentage" is calculated by subtracting from 100% the
percentage of 5 minute periods during the Service Year in which Amazon EC2 was
in the state of "Region Unavailable." If you have been using Amazon EC2 for less
than 365 days, your Service Year is still the preceding 365 days but any days prior
to your use of the service will be deemed to have had 100% Region
Availability. Any downtime occurring prior to a successful Service Credit claim
cannot be used for future claims. Annual Uptime Percentage measurements
exclude downtime resulting directly or indirectly from any Amazon EC2 SLA
Exclusion (defined below).
"Region Unavailable" and "Region Unavailability" means that more than one
Availability Zone in which you are running an instance, within the same Region, is
"Unavailable" to you.
"Unavailable" means that all of your running instances have no external
connectivity during a five minute period and you are unable to launch replacement
instances.
The "Eligible Credit Period" is a single month, and refers to the monthly billing
cycle in which the most recent Region Unavailable event included in the SLA claim
occurred.
A "Service Credit" is a dollar credit, calculated as set forth below, that we may
credit back to an eligible Amazon EC2 account.
1. "Percentage of 5 minute periods." Interesting wording. So how do you calculate
the percentage of five-minute periods? Let's do some math:
1 year = 525,948.766 minutes
3. "Region Unavailable" and "Region Unavailability" means that more than one
Availability Zone in which you are running an instance, within the same Region, is
"Unavailable" to you. Simply put, two or more zones must be down at the same
time within a region. And to get a credit, you must be running instances in the
affected zones (plural). If you have a single instance in a single zone, there would
be no credit for the outage.
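The "more than one Availability Zone" condition can be written as a simple check. This is only a sketch of my reading of the definition; the function name and the zone labels are hypothetical and do not come from the contract or from any AWS API.

```python
# Illustrative sketch of the "Region Unavailable" test as I read the definition:
# more than one zone in which you run an instance must be unavailable at once.
# Function name and zone labels are hypothetical.

def region_unavailable(zones_with_instances, unavailable_zones):
    """Return True if two or more of the zones you run instances in are down."""
    affected = set(zones_with_instances) & set(unavailable_zones)
    return len(affected) > 1


if __name__ == "__main__":
    # A single instance in a single zone: no credit, even if that zone is down.
    print(region_unavailable({"zone-a"}, {"zone-a", "zone-b"}))            # False
    # Instances in two zones, both down at the same time: condition is met.
    print(region_unavailable({"zone-a", "zone-b"}, {"zone-a", "zone-b"}))  # True
```

Note that zones which are down but in which you run nothing do not count toward the condition.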
What if the outages were not in one big chunk but rather the instance was down
for five minutes every hour for 60 hours? Is that more or less disruptive? How
would that affect your business?
What if your instance was down every hour for 4 minutes and 59 seconds at
random? You would never be able to make an SLA claim under the current
contract. How would that affect your business? Is that an acceptable risk for you?
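These what-if scenarios can be tested with a little code. The sketch below rests on two assumptions of mine: the Service Year is exactly 365 days, as the contract definition states (105,120 five-minute periods), and an outage only counts for the whole five-minute periods it spans, so anything shorter than five minutes counts for nothing. The function names are hypothetical. At 99.95%, the commitment tolerates roughly 52 unavailable periods, or about 4.4 hours, per Service Year.

```python
# Sketch of the "percentage of 5 minute periods" arithmetic.
# Assumptions (mine, not the contract's): a 365-day Service Year, and each
# outage contributes only the whole five-minute periods it lasts. This is
# generous, because an outage that straddles two period boundaries may in
# practice count for even less.

PERIOD_SECONDS = 5 * 60
PERIODS_PER_SERVICE_YEAR = 365 * 24 * 60 * 60 // PERIOD_SECONDS  # 105,120
COMMITMENT_PERCENT = 99.95


def qualifying_periods(outage_seconds):
    """Count whole five-minute periods across a list of outage durations."""
    return sum(int(d) // PERIOD_SECONDS for d in outage_seconds)


def annual_uptime_percent(outage_seconds):
    """100% minus the percentage of periods counted as unavailable."""
    return 100.0 - qualifying_periods(outage_seconds) / PERIODS_PER_SERVICE_YEAR * 100.0


def claim_possible(outage_seconds):
    """True if the Annual Uptime Percentage falls below the commitment."""
    return annual_uptime_percent(outage_seconds) < COMMITMENT_PERCENT


if __name__ == "__main__":
    scenarios = {
        "5 min every hour for 60 hours": [300] * 60,
        "4 min 59 s every hour, all year": [299] * (365 * 24),
    }
    for name, outages in scenarios.items():
        print(f"{name}: {sum(outages) / 3600:.1f} h actually down, "
              f"{qualifying_periods(outages)} qualifying periods, "
              f"claim possible: {claim_possible(outages)}")
```

Under these assumptions, the second scenario adds up to more than 700 hours of real downtime in a year and still yields zero qualifying periods.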
errors and corroborate your claimed outage (any confidential or sensitive information in
these logs should be removed or replaced with asterisks); and (iv) be received by us
within thirty (30) business days of the last reported incident in the SLA claim. If the
Annual Uptime Percentage of such request is confirmed by us and is less than 99.95%
for the Service Year, then we will issue the Service Credit to you within one billing cycle
following the month in which the request occurred. Your failure to provide the request
and other information as required above will disqualify you from receiving a Service
Credit.
1. You are required to monitor your instances and provide that data back to Amazon
in the event of an outage. Does your IT department know about this requirement?
Do you test your ability to monitor your instance? Without this data you cannot
make your SLA claim.
2. Interestingly, even though Amazon requires you to provide detailed evidence of an
outage, it is still left to Amazon's discretion as to whether the credit is given.
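If you have no monitoring in place at all, even a very simple probe that writes timestamped results to a log is better than nothing. The following is a minimal sketch, assuming that a TCP connection check is a fair proxy for "external connectivity"; the function names, host, port, and log file are all hypothetical, and the contract does not say what form of evidence Amazon will accept.

```python
# Minimal availability probe (illustrative only).
# Assumption: a successful TCP connection stands in for "external connectivity".
# Host, port, and log path are placeholders; function names are hypothetical.
import socket
import time
from datetime import datetime, timezone


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def log_probe(host, port, log_path, timeout=3.0):
    """Append one timestamped UP/DOWN line to the log and return that line."""
    status = "UP" if is_reachable(host, port, timeout) else "DOWN"
    line = f"{datetime.now(timezone.utc).isoformat()} {host}:{port} {status}"
    with open(log_path, "a", encoding="utf-8") as log_file:
        log_file.write(line + "\n")
    return line


def monitor(host, port, log_path, probes, interval_seconds=60):
    """Run a fixed number of probes, pausing between them."""
    for i in range(probes):
        log_probe(host, port, log_path)
        if i < probes - 1:
            time.sleep(interval_seconds)

# Example (placeholder values): one probe per minute for a day.
# monitor("instance.example.com", 443, "availability.log", probes=1440)
```

Run the probe from outside the provider's network, and test it before you need it; a monitor hosted on the instance it watches will go down with it.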
Amazon EC2 SLA Exclusions
The Service Commitment does not apply to any unavailability, suspension or termination
of Amazon EC2, or any other Amazon EC2 performance issues: (i) that result from a
suspension described in Section 6.1 of the AWS Agreement; (ii) caused by
factors outside of our reasonable control, including any force majeure event or Internet
access or related problems beyond the demarcation point of Amazon EC2; (iii) that
result from any actions or inactions of you or any third party; (iv) that result from your
equipment, software or other technology and/or third party equipment, software or
other technology (other than third party equipment within our direct control); (v) that
result from failures of individual instances not attributable to Region Unavailability; or
(vi) arising from our suspension and termination of your right to use Amazon EC2 in
accordance with the AWS Agreement (collectively, the Amazon EC2 SLA Exclusions). If
availability is impacted by factors other than those explicitly listed in this agreement, we
may issue a Service Credit considering such factors in our sole discretion.
1. There are several other documents referenced above. Always check all
referenced documents and read the language. I have included the referenced
language section below.
2. There is a lot of vague language in this section, such as "(v) that result from failures
of individual instances not attributable to Region Unavailability." I have no idea
what "instances not attributable to Region Unavailability" means, and I'm sure
most business users don't either. If you don't know what something means, ask, and
have it included as a definition in the contract.
3. Another phrase I hate: "in our sole discretion." This is another "get out of jail free
card." Translation: other things may cause outages that we haven't anticipated,
and we may or may not give you an SLA credit when they occur.
1. Make sure you're up to date on your payments, or you will not get your SLA credit.
2. Ahh, good old Force Majeure. Electrical, telecommunications, and other utility
services are, in my opinion, key components of providing a Cloud service, so systemic
failures of them should not be included in Force Majeure.
SLA%      Daily outage  Monthly outage  Monthly outage  Annual outage  Annual outage
          (minutes)     (minutes)       (hours)         (hours)        (days)
99.999%   0.0144        0.432           0.0072          0.0864         0.0036
99.99%    0.144         4.32            0.072           0.864          0.036
99.95%    0.72          21.6            0.36            4.32           0.18
99.90%    1.44          43.2            0.72            8.64           0.36
99.50%    7.2           216             3.6             43.2           1.8
SLA%      Daily outage  Monthly outage  Monthly outage  Annual outage  Annual outage
          (minutes)     (minutes)       (hours)         (hours)        (days)
99%       14.4          432             7.2             86.4           3.6
98.99%    14.544        436.32          7.272           87.264         3.636
98.95%    15.12         453.6           7.56            90.72          3.78
98.90%    15.84         475.2           7.92            95.04          3.96
98.50%    21.6          648             10.8            129.6          5.4
98%       28.8          864             14.4            172.8          7.2
SLA%      Daily outage  Monthly outage  Monthly outage  Annual outage  Annual outage
          (minutes)     (minutes)       (hours)         (hours)        (days)
97.99%    28.944        868.32          14.472          173.664        7.236
97.95%    29.52         885.6           14.76           177.12         7.38
97.90%    30.24         907.2           15.12           181.44         7.56
97.50%    36            1080            18              216            9
97%       43.2          1296            21.6            259.2          10.8
96.99%    43.344        1300.32         21.672          260.064        10.836
96.95%    43.92         1317.6          21.96           263.52         10.98
96.90%    44.64         1339.2          22.32           267.84         11.16
96.50%    50.4          1512            25.2            302.4          12.6
*Providers with SLAs that fall into this section should not be considered if
mission-critical applications are in question.
SLA%      Daily outage  Monthly outage  Monthly outage  Annual outage  Annual outage
          (minutes)     (minutes)       (hours)         (hours)        (days)
96%       57.6          1728            28.8            345.6          14.4
95.99%    57.744        1732.32         28.872          346.464        14.436
95.95%    58.32         1749.6          29.16           349.92         14.58
95.90%    59.04         1771.2          29.52           354.24         14.76
95.50%    64.8          1944            32.4            388.8          16.2
95%       72            2160            36              432            18
94.99%    72.144        2164.32         36.072          432.864        18.036
94.95%    72.72         2181.6          36.36           436.32         18.18
94.90%    73.44         2203.2          36.72           440.64         18.36
94.50%    79.2          2376            39.6            475.2          19.8
*If you are looking at a provider that falls into this category, look elsewhere. If you have an existing
provider whose uptime is falling into this section, immediate and definitive action should be taken.
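The chart values can be regenerated for any SLA percentage. The sketch below assumes a 30-day month and a 360-day year, which is what the chart figures appear to be based on (for example, 99.95% gives 4.32 hours per year, not the 4.38 hours a 365-day year would give). The function and key names are hypothetical.

```python
# Sketch for reproducing the chart rows.
# Assumption inferred from the chart values: 30-day month, 360-day year.
# Function and key names are hypothetical.

HOURS_PER_DAY = 24
MINUTES_PER_DAY = HOURS_PER_DAY * 60
DAYS_PER_MONTH = 30
DAYS_PER_YEAR = 360


def outage_row(sla_percent):
    """Return the permitted outage for one SLA percentage, per period."""
    down = (100.0 - sla_percent) / 100.0  # fraction of time allowed to be down
    return {
        "daily_minutes": down * MINUTES_PER_DAY,
        "monthly_minutes": down * MINUTES_PER_DAY * DAYS_PER_MONTH,
        "monthly_hours": down * HOURS_PER_DAY * DAYS_PER_MONTH,
        "annual_hours": down * HOURS_PER_DAY * DAYS_PER_YEAR,
        "annual_days": down * DAYS_PER_YEAR,
    }


if __name__ == "__main__":
    for pct in (99.999, 99.95, 99.0, 95.0):
        row = outage_row(pct)
        print(pct, {key: round(value, 4) for key, value in row.items()})
```

If your provider calculates against calendar months or a 365-day year, adjust the two day constants accordingly; the differences are small but can matter at the margin of a claim.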
About Intreis
Intreis is a Chicago-based consulting firm specializing in IT Governance, Risk &
Compliance (ITGRC) and IT Service Management (ITSM) integrations. Intreis also offers a
wide range of services which support ITGRC and ITSM integrations, including Assessments,
Controls Definition, Process Design, Remediation Work, Risk and Compliance Strategy, and
Training and Education. For more information about Intreis services, please visit us at
www.intreis.com.
Created By:
Morgan Hunter, VP Professional Services
Email: Morgan.Hunter@Intreis.com
Twitter: @Intreis
Web: www.Intreis.com
Copyright Intreis, Inc. 2013. All rights reserved. No part of this publication may be
reproduced, or distributed without the prior written permission of Intreis.