
Contents lists available at ScienceDirect

Future Generation Computer Systems


journal homepage: www.elsevier.com/locate/fgcs

Capacity-driven utility model for service level agreement negotiation of cloud services
Nadia Ranaldo, Eugenio Zimeo
Department of Engineering, University of Sannio, Italy

highlights
We propose a capacity-aware utility model to support negotiation of cloud services.
The utility function takes into consideration the available resources dynamically.
The approach improves the provider utility and reduces SLA violations.

Article info

Article history:
Received 15 May 2014
Received in revised form 18 January 2015
Accepted 9 March 2015
Available online xxxx
Keywords:
Cloud computing
QoS management
SLA
Negotiation
Capacity planning

abstract
Dynamic customer requirements and provider resource availability in the Cloud market make static approaches inadequate for guaranteeing Quality of Service (QoS) levels and defining prices. In this context, negotiation guided by dynamic information is a viable way to achieve high satisfaction levels for both contract parties. We propose to exploit capacity planning to support bilateral negotiation processes with the aim of optimizing the utility for service providers, by avoiding contracts that could incur Service Level Agreement (SLA) violations while keeping prices competitive. The proposed technique exploits a non-additive utility function defined in the region of acceptable SLA proposals, taking into account the desired QoS and the expected resource availability, costs and penalties. The experimental analysis shows the benefit of the proposed dynamic approach with respect to static ones in a scenario characterized by a set of customers and differentiated classes of applications provided by a cloud environment.
© 2015 Elsevier B.V. All rights reserved.

1. Introduction
Service Level Agreements (SLAs) [1] are key elements for the success of Cloud computing, since they express the guarantees agreed between service providers and customers. SLAs formally describe the offered functions, the QoS levels the provider promises to meet, the responsibilities [2] of both contract parties, and the penalties applied when QoS levels are not satisfied.
Platform as a Service (PaaS) providers (e.g. Google App Engine and Force.com) often offer a pool of differentiated services with pre-fixed prices, related to the complexity of the deployed applications, measured through metrics such as the number of applications and database objects. For these services, SLAs are currently used to define only the granted service availability (uptime) level and a credit-based penalty system in case of violation. They do not yet offer the possibility to define custom agreements that could better satisfy both customers and providers.

Correspondence to: Department of Engineering, University of Sannio, via Traiano, Benevento, 82100, Italy. Tel.: +39 0824 305538; fax: +39 0824 30552.
E-mail addresses: ranaldo@unisannio.it (N. Ranaldo), zimeo@unisannio.it (E. Zimeo).
http://dx.doi.org/10.1016/j.future.2015.03.007
0167-739X/© 2015 Elsevier B.V. All rights reserved.

Coarse-grained and static QoS guarantees are no longer satisfactory in a market characterized by continuously changing conditions, such as the emergence of new, highly competitive providers, customer demand for cloud services in new business fields, fluctuations in electricity prices, and the need to exploit data center resources optimally. Such conditions require providers to react quickly in order to maintain high levels of competitiveness and customer satisfaction.
In this dynamic scenario, negotiation of fine-grained SLAs could
be a viable approach for service providers to be competitive and
to reach more profitable agreements for both customers and
providers [3].
The level of flexibility of the negotiation process depends on the underlying protocol. It can be (1) unilateral, if one party (typically the provider) proposes an SLA and the other can only accept or reject it, or (2) bilateral, if both parties have an active role in proposing and defining SLAs. The latter makes it possible to resolve, through dialog, conflicts arising from the different and continuously changing goals, policies and preferences of customers and providers.
In many bilateral negotiation strategies, each negotiation actor adopts a decision model based on a utility function, which represents the (perceived) satisfaction level associated with an SLA proposal. In particular, given n negotiable SLA parameters, the utility function assigns a utility value to each point in the corresponding n-dimensional space of such parameters. The region of this space in which the utility value is considered acceptable during the negotiation process is called the acceptable region. Each point in this region represents an SLA proposal and has a utility value between a minimum and a maximum.
Since customers and providers adopt different utility functions
that are not known to the counter-parts, an agreement is possible
only if the intersection between the two acceptable regions, called
negotiation space, is not empty. In this case, an agreement is a point
in the negotiation space where the utility assumes a satisfactory
value for both customer and provider.
An agreement is reached through a process that is typically time-based. For example, time-based decision functions [4] make time-dependent concessions with respect to an initial utility value (e.g. the maximum one) with the aim of reaching an agreement within a prefixed negotiation time. In particular, when an SLA proposal is received from a negotiation party, the parameter values are verified against the acceptable region and, if they are admissible, the related utility value is computed. On the basis of these evaluations and of the elapsed time, the strategy decides whether to accept or reject the proposal, generate a counter-proposal, or terminate the negotiation.
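A minimal sketch of a time-dependent concession rule of the kind described above. The polynomial decay, the exponent `beta`, the function names and the decision labels are illustrative assumptions, not the specific decision functions of [4].

```python
def target_utility(t, deadline, u_min, u_max, beta=1.0):
    """Utility demanded at time t: u_max at t = 0, u_min at the deadline."""
    frac = min(max(t / deadline, 0.0), 1.0)
    # beta > 1 concedes quickly, beta < 1 holds out until close to the deadline
    return u_max - (u_max - u_min) * frac ** (1.0 / beta)


def decide(proposal_utility, in_region, t, deadline, u_min, u_max, beta=1.0):
    """Return 'terminate', 'accept' or 'counter' for a received proposal."""
    if t >= deadline:
        return "terminate"          # negotiation time has expired
    if in_region and proposal_utility >= target_utility(t, deadline, u_min, u_max, beta):
        return "accept"             # admissible and good enough at this time
    return "counter"                # otherwise generate a counter-proposal
```

With `beta = 1` the demanded utility falls linearly, so a proposal rejected early in the negotiation may be accepted later without any change in its parameters.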
In the literature, decision models are typically based on multi- and independent-attribute utility functions that are additive with respect to the negotiation parameters: the utility can be evaluated considering one parameter at a time, and the total utility is computed as a linear combination of the utility contributions derived from the value of each negotiable parameter [5]. With this approach, an SLA proposal is acceptable if each negotiable parameter value is within the corresponding interval of acceptable values.
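The additive model can be sketched as follows, assuming linear per-parameter utilities normalized over static intervals. The parameter names, weights and interval bounds are hypothetical and only illustrate the structure.

```python
def linear_utility(value, worst, best):
    """Normalized single-parameter utility: 0 at `worst`, 1 at `best`."""
    return (value - worst) / (best - worst)


def additive_utility(proposal, intervals, weights):
    """Weighted sum of per-parameter utilities; None if any value is out of range."""
    total = 0.0
    for name, value in proposal.items():
        worst, best = intervals[name]
        lo, hi = min(worst, best), max(worst, best)
        if not lo <= value <= hi:
            return None  # outside the static acceptable interval: not acceptable
        total += weights[name] * linear_utility(value, worst, best)
    return total


# Provider view (hypothetical numbers): a higher price and a looser response
# time bound are both better for the provider.
intervals = {"price": (100.0, 200.0), "response_time": (1.0, 3.0)}
weights = {"price": 0.5, "response_time": 0.5}
mid_offer = additive_utility({"price": 150.0, "response_time": 2.0}, intervals, weights)
```

Each parameter is checked and scored in isolation, which is exactly what prevents this model from capturing the dependence of price on the agreed QoS level.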
In a more realistic Cloud market, some negotiable parameters, such as price and QoS levels, cannot be considered additively independent: the service price depends on the resource cost, which, in turn, depends on the agreed QoS terms. Moreover, utility should take into account strategic business policies and dynamic information about the negotiation context, such as market trends, actual customer requirements, and the availability and performance of provider resources. In fact, before an SLA is signed, the provider has to check whether the requested set of resources will be available when desired, to avoid future SLA violations. Moreover, an offer with the same QoS level and price could be accepted (refused) depending on the conditions: sustainable (unsustainable) service usage conditions (e.g. the forecasted daily load peak) and a highly (weakly) competitive market phase, even in case of potential economic loss.
In this paper, we focus on bilateral SLA negotiation of PaaS
services for hosting multi-tier Web applications in a scenario
where the number of users is variable and the workload is not
stationary but, typically, exhibits peaks and dips with daily, weekly
or also seasonal cycles [6,7].
In order to meet QoS terms, the provider allocates appropriate
resources to each tier of the application architecture. Currently,
we adopt replication only at the application server tier, while
a Web server is used as a load balancer and a single database
server is shared. Thanks to virtualization technologies, replication
is dynamically managed by a resource management system which
handles a set of independent and homogeneous virtual machines
(the overall Cloud provider capacity).

The virtual machines, allocated on a set of hardware resources of the provider data center, are exploited to host various instances of the application server. The number of virtual machines allocated to the application server tier of each signed SLA changes dynamically during the day by means of a predictive resource allocation mechanism. This mechanism aims to define the resource allocation plan that maximizes the profit and avoids QoS violations under a daily fluctuating workload.
The proposed utility model, which dynamically defines the acceptable regions on the basis of available capacity, is used on the provider side to guide negotiation strategies. To this aim, the adopted utility function is non-additive and represents the overall economic profit that the provider derives from a new contract, net of the cost of the assigned capacity, the penalties paid in case of violations of QoS guarantees, and any variation in the profits of already signed SLAs. Utility is a function of two negotiable parameters, namely the contract price (a one-off payment) and the maximum response time that the end-user can perceive without the provider incurring a penalty, and of other non-negotiable parameters (constraints and pre-conditions). These constraints and pre-conditions are defined by both the customer (contract duration and starting day, application component size, forecasted daily workload plan) and the provider (service availability and penalty).
Price is a function of capacity cost and market conditions. Capacity cost depends on the product of the number of virtual machines and the number of daily time slots assigned to an SLA, whereas market conditions (monopoly vs. competition) are captured by two factors that express (a) the probability that the provider is chosen and (b) the possible profit.
From the considerations above, the proposed utility function is based on actual customer requirements, specified in the initial negotiation phase (as pre-conditions), and on a capacity planning technique, which suggests the most profitable resource allocation plan for every new SLA while avoiding (or reducing) violations.
To validate the proposal and to show the benefit in predicting
utility of new potential contracts, an in-depth experimental
analysis has been conducted.
1.1. Main contribution
To the best of our knowledge, this is the first proposal that adopts capacity planning in the first phase of a contract life-cycle to guide bilateral negotiation strategies through the definition of the provider's acceptable region and utility value. By adopting this approach, the provider reduces the risk of incurring SLA violations, since the technique finds the slots that are actually free in the global resource allocation plan, which is defined by considering the resources needed to satisfy all the signed SLAs. Our proposal, unlike traditional ones based on additive and static utility functions, allows the provider to propose, during negotiation processes, offers with competitive prices and feasible performance. Moreover, it keeps potential violations of QoS terms below fixed tolerable levels and avoids the stipulation of new contracts that would lead to unprofitable revenues or customer dissatisfaction.
Our proposal was first presented in [8], where a preliminary experimental validation, based on a simple linear application performance model, was used to investigate the proposed utility function and the capacity-driven evaluation technique. A more realistic experimental scenario and a comparison of the proposed approach with the traditional one based on additive utility functions were presented in [9].
This paper extends both previous ones, giving further details about the formalization of the utility function and the heuristic adopted for its evaluation, and presenting an in-depth experimental analysis to validate the approach. The analysis shows that high satisfaction levels are achieved for both providers and customers: providers gain advantages both in the short term (economic profit) and in the long term (customer loyalty), whereas customers can stipulate contracts with more competitive prices and with greater guarantees on the fulfillment of QoS terms.
The rest of the paper is organized as follows. Section 2 discusses related work. Section 3 introduces the SLA model for the PaaS service and the capacity model. Sections 4 and 5 present, respectively, the proposed utility model and a dynamic evaluation technique based on capacity planning. Section 6 shows the benefit of the proposed technique with respect to static approaches through experimental analysis. Finally, Section 7 presents conclusions and future work.
2. Related work
Many bilateral negotiation strategies proposed in the literature are based on a multi-attribute additive utility function, assuming that the negotiable parameters are additively independent of each other. The most popular form is a linear combination of linear utility functions defined for each negotiation parameter and normalized within the corresponding acceptable interval of values [4]. Moreover, the utility functions and related acceptable regions are defined statically and require human intervention [10-12], limiting the applicability of those approaches in highly dynamic environments, such as the Cloud.
The paper [13] proposes the adoption of non-additive multi-objective utility functions for satisfying both business and performance goals in unilateral negotiation of Cloud services for Grid. The utility function takes into account various objectives (economic revenue maximization and reputation, priority to tasks or services executed in off-peak hours). When a provider receives a proposal, the utility function is maximized, taking into account economic factors and resource availability information, to propose an offer to the customer. We propose a similar approach (based on the maximization of a non-additive utility function), but we define dynamic acceptable regions based on short-term capacity planning for more flexible bilateral negotiation processes.
Spillner and Schill [14] propose the semi-automatic adjustment of SLA templates, published by providers in an advertisement service registry and adopted as the starting point of negotiation processes. Their approach is based on a performance prediction model that exploits both run-time and historical monitoring data to define the sustainable QoS level before reaching the resource limit and possibly incurring violations of SLA terms. Unlike this proposal, which requires providers to adjust SLA templates manually, we propose a capacity planning technique to drive a negotiation process that follows a business policy automatically.
Some papers address bilateral negotiation mechanisms based on non-additive utility functions, which are more realistic than additive ones. As an example, the paper [15] proposes two negotiation models able to handle non-monotonic and discrete non-linear utility functions, based respectively on a multiple offer and an approximating mechanism. The paper [16] proposes a bilateral negotiation protocol based on an alternative projection strategy for generating offers, with the aim of reaching, in finite time, a satisfying agreement among agents characterized by a generic nonlinear utility function. However, these papers adopt pre-defined nonlinear utility functions and do not address their realistic modeling, which should take into account capacity availability and cost, customer requirements and market conditions.
Capacity planning of IT infrastructures, both for optimized
short-term resource management and long-term investment
plans, can be employed by service providers to manage SLAs and
promised QoS levels in the most profitable way [17].

The problem of self-adaptive capacity planning for optimizing the economic profit related to SLAs of Internet services has been investigated in the literature. Some approaches leverage queuing theory to solve a resource allocation optimization problem under constraints on the service rate. In particular, the paper [18] takes into account the profit with respect to the penalty and [19] the reward in case a surge workload is supported. The paper [20] proposes a capacity planning method based on a queuing network approach and an analytical performance model of a Cloud service to predict the optimal configuration of a Cloud application, considering both the provider profit and the customer satisfaction level. As in our proposal, resource virtualization for performance isolation and dynamic resource allocation are exploited. However, these papers apply capacity planning techniques to optimally manage data center capacity, by considering a set of signed SLAs, but not to define the offer space and the potential utility for the provider of new SLAs under negotiation.
The paper [21] deals with the dynamic capacity allocation
of a data center to a hosted application on the basis of the
workload demand, in order to reduce SLA violations and power
consumption. To this aim, historical workload traces are analyzed
to identify a base daily pattern represented by an aggregation
(discretization) of the original workload time-series. The pattern
is defined through a dynamic programming problem aiming to
minimize the discretization error and the number of intervals in
the discretization. Overloading is handled with capacity increase
at short-term scale.
In our paper, we consider the customer workload as a pre-condition for SLA negotiation, and capacity planning is adopted in this initial phase of the contract life-cycle. The technique proposed in the paper cited above could be used in our solution during the resource utilization phase (after a contract has been established) and to define a new customer profile (workload definition) for subsequent SLA negotiations.
3. A PaaS for web hosting
Resource allocation strategies of a PaaS can be influenced by application requirements. We classify applications according to two dimensions: the functional and the technological one. The former affects the number of users simultaneously accessing the application, the way they interact with it and, as a consequence, the aggregated request rate. This dimension is defined by the following sub-dimensions: the application class, the interactivity level (interactive-intensive vs. batch processing), and the geographic extension (national vs. continental or worldwide reach). The technological dimension characterizes the way the data center resources are mainly exploited by the application. It affects the performance pattern exhibited when executing the application on a pool of resources. The related sub-dimensions are: architecture (monolithic vs. multi-tier) and memory management (volatile vs. persistence-intensive data management).
We focus on a PaaS for hosting Web applications, called
Virtual Web Platform (VWP) service. A VWP service offers a virtual
platform used to host interactive-intensive Web applications with
national geographic extension, composed of multiple components,
deployed on the provider resources according to a multi-tier
architecture and with a volatile-intensive data management.
3.1. SLA model
The SLA model is structured in four sections: (1) service description, (2) QoS target, (3) measurement and (4) penalty model.
The service description defines the contract validity period, expressed as a number of days D, the starting day B, and the total service price P, expressed in euros (€).
The QoS target section defines the QoS guarantee terms and the service usage conditions under which the provider is responsible for such guarantee terms (pre-conditions). Pre-conditions are defined through the workload plan W expected for the Web application with a daily pattern, modeling the typical request rate fluctuation of Web applications [6,7]. A day is partitioned into K disjoint time intervals, each characterized by a representative workload value $w_k$, $k = 1, \ldots, K$. $w_k$ is expressed as a number of requests per second (r/s) and is based on the number of requests received and completed during the k-th time slot.
We consider two QoS guarantee terms: (1) the response time must not exceed the maximum response time T, expressed in seconds (s); (2) the service availability, expressing the percentage of time during which the service is capable of being used, must be equal to or greater than the value MinAvail defined by the provider. The service is considered available when the difference between the measured response time and T is less than the maximum allowed value, indicated with $t_{max}$.
The measurement section defines the measurement processes
for response time, workload and service availability in order to
monitor service usage conditions and QoS guarantee terms.
The response time is influenced by various delay components, some of which are not under the provider's responsibility (such as the data transfer delay outside the data center, on public networks used without contract regulation). For this reason, we define the response time as the time interval from the receipt of the HTTP request on the provider infrastructure to the completion of the HTTP response and the beginning of its transfer. Moreover, since the response time depends on the processing time of the specific invoked Web component, in order to obtain comparable measures we introduce a customized Web component, called Benchmarking Web Component (BWC), defined by the customer to characterize the Web application in terms of typical operation load. Finally, we define a measurement sample as the average of a set of single measurements retrieved during a short time interval, called the monitoring time unit. This approach avoids detecting as QoS violations isolated performance degradations during transitory situations, typical of adaptation actions on resource allocation.
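The averaging step can be illustrated with a short sketch. The input format (timestamped response times in seconds) and the function name are assumptions made for the example; the paper does not prescribe a data layout.

```python
def measurement_samples(measurements, unit):
    """Average single response-time measurements per monitoring time unit.

    measurements: iterable of (timestamp_s, response_time_s) pairs.
    unit: length of the monitoring time unit in seconds.
    Returns {unit_index: mean response time in that unit}.
    """
    buckets = {}
    for ts, rt in measurements:
        buckets.setdefault(int(ts // unit), []).append(rt)
    return {j: sum(v) / len(v) for j, v in sorted(buckets.items())}


# Three single measurements, 10-second monitoring time unit (made-up values).
samples = measurement_samples([(0, 0.2), (5, 0.4), (12, 1.0)], unit=10)
```

A single slow request inside a unit is diluted by the other measurements of that unit, so only sustained degradations show up as violations.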
The penalty model defines the penalty that the provider must pay when the QoS guarantee terms are not satisfied. We consider a penalty directly proportional to the price P, to the degree of QoS violation and to its duration. In particular, it is expressed as the sum of the elementary penalties possibly arising in each monitoring time unit. Denoting by $Pen_{dkj}$, $d = 1, \ldots, D$, $k = 1, \ldots, K$, $j = 1, \ldots, J$, the penalty in the j-th monitoring time unit of the k-th time slot of the d-th contract day, the total penalty, $Pen$, is given by:

$$Pen = \sum_{d=1}^{D} \sum_{k=1}^{K} \sum_{j=1}^{J} Pen_{dkj}. \tag{1}$$


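$Pen_{dkj}$ depends on the difference between the measured response time and T, denoted as $t_{dkj}$, and on the price related to a monitoring time unit, denoted as p, as follows:

$$Pen_{dkj} = \begin{cases} 0 & \text{if } t_{dkj} \le 0 \\ \alpha\, p\, \dfrac{t_{dkj}}{t_{max}} & \text{if } 0 < t_{dkj} < t_{max} \\ \alpha\, p & \text{if } t_{dkj} \ge t_{max} \end{cases} \qquad \text{with } \alpha > 0,\; p = \frac{P}{D K J}. \tag{2}$$

Only the middle branch is legible in the available text. The coefficient symbol (written $\alpha$ here) and the values of the first and third branches are assumptions, chosen so that the penalty is zero without a violation and continuous at $t_{dkj} = t_{max}$.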
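A sketch of Eqs. (1) and (2) under explicit assumptions: the penalty is taken to be zero when there is no violation, to grow linearly with the violation degree, and to saturate once the difference reaches `t_max`. The coefficient name `alpha` and the two outer branches are assumptions, since only the linear branch is legible in the source.

```python
def elementary_penalty(dt, t_max, price, D, K, J, alpha=1.0):
    """Penalty of one monitoring time unit; dt = measured response time - T."""
    p = price / (D * K * J)          # price share of one monitoring time unit
    if dt <= 0:
        return 0.0                   # assumed: no violation, no penalty
    if dt < t_max:
        return alpha * p * dt / t_max
    return alpha * p                 # assumed: penalty saturates at t_max


def total_penalty(deltas, t_max, price, D, K, J, alpha=1.0):
    """Eq. (1): sum over all monitoring units; deltas[d][k][j] holds dt."""
    return sum(
        elementary_penalty(dt, t_max, price, D, K, J, alpha)
        for day in deltas for slot in day for dt in slot
    )


# One day, two slots, two monitoring units per slot, price 400 (made-up values).
example = total_penalty([[[-0.1, 0.25], [0.5, 1.0]]], t_max=0.5,
                        price=400.0, D=1, K=2, J=2)
```

In this example each monitoring unit carries a price share of 100, so the four units contribute 0, 50, 100 and 100.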

A summary of the parameters characterizing an SLA signed and managed by a provider, and the manner in which they are defined, is reported in Table 1.

Fig. 1. Capacity model.

3.2. Capacity model


To meet QoS guarantee terms, the Cloud provider of VWP services must allocate appropriate resources to each tier of the hosting platform of multi-tier Web applications. Currently, we apply a replication scheme to the application server tier, while the Web server is used as a load balancer and a single database server is shared.
The mapping of an application server replica onto hardware resources of the provider data center is managed by a resource virtualization layer so that each virtual machine hosts an application
server (see Fig. 1). Under this assumption, we model the overall
Cloud capacity as the set of M independent and homogeneous virtual machines that can be simultaneously launched on the data
center hardware resources. We assume that such virtual machines
have the same provisioned performance, measured through an application benchmarking technique.
The number of virtual machines allocated to the application server tier of each signed SLA changes during the day, and for each contract day, according to a dynamic resource provisioning technique. This technique is currently guided by the solution of a utility optimization problem, which defines the capacity allocation plan that maximizes the profit and avoids QoS violations under the different workloads forecast in the daily time slots. The optimization problem, which is the same one adopted for the definition of the acceptable regions and utility values of SLA proposals during the negotiation phase, is formulated in Section 4 and a heuristic to find its solution is discussed in Section 5.
The capacity performance model adopted by the utility optimization problem is based on a benchmarking technique that exploits the measurement process described in Section 3.1. It defines the performance function t(n, w), which returns the response time that can be reached by assigning n virtual machines to the application server tier under workload w. A more accurate predictive model, based on monitoring data collected during service operation and/or on Web workload modeling techniques, could be adopted in the future, with the aim of improving the estimation and reducing service costs (e.g. by avoiding the energy consumption of assigned but unexploited resources).
Given $R = \{SLA_i\}$, $i = 1, \ldots, N$, the set of already signed SLAs, the capacity assigned (also called the resource allocation plan) to the VWP service related to each $SLA_i$ is denoted by $N_i^R = \{n^R_{idk}\}$, $i = 1, \ldots, N$, $d = 1, \ldots, D$, $k = 1, \ldots, K$. It defines the number of virtual machines assigned in each time slot k of each contract day d. In the following, we consider the same time slot partitioning (K) and contract validity period (D and B) for all SLAs.

Table 1
SLA parameters.

Parameter | Description | How it is defined
D | Contract validity duration | Fixed by the customer
B | Starting day of the contract validity | Fixed by the customer
w_k, k = 1, ..., K | Workload rate that must be supported during the k-th daily time slot | Fixed by the customer
T | Maximum response time | Negotiated
P | Contract price (all-inclusive) | Negotiated
MinAvail | Minimum service availability (percentage) | Fixed by the provider
t_max | Maximum difference between the measured response time and T to consider a service available | Fixed by the provider

The fraction of capacity allocated to a VWP service has an economic cost for the provider, which we model through a cost function linearly proportional to the resource allocation plan. In particular, the capacity cost, $C_i$, related to a signed SLA, $SLA_i$, is given by:

$$C_i = c \sum_{d=1}^{D} \sum_{k=1}^{K} n_{idk}, \tag{3}$$

where c represents the elementary cost of virtual machine usage per time slot.
4. Utility model
The proposed utility model evaluates the profit that the provider achieves by accepting a new contract with the related SLA, indicated as $SLA_{N+1}$, taking into account QoS guarantee terms, service usage conditions, capacity availability and the utility deriving from each already signed contract in R.
The utility deriving from the new contract, denoted by $U(T_{N+1}, P_{N+1})$, is defined as the difference between the overall profit obtained by accommodating the new contract and the profit gained from the already signed contracts without accepting the new one. Adopting an additive model with respect to the profit deriving from each single SLA, utility is given by:

$$U(T_{N+1}, P_{N+1}) = \sum_{i=1}^{N+1} U_i^Q(SLA_i, N_i^Q) - \sum_{i=1}^{N} U_i^R(SLA_i, N_i^R), \tag{4}$$


where Q denotes the union of R (already signed SLAs) with $SLA_{N+1}$, and $N_i^Q = \{n^Q_{idk}\}$ the resource allocation plans of the SLAs considering the new set Q of signed SLAs. $U_i^R$ and $U_i^Q$ are the contract utilities deriving from each SLA considering, respectively, R and Q as the set of signed SLAs. In general, the contract utility $U_i$ is given by:

$$U_i(SLA_i, N_i) = P_i - C_i - Pen_i, \tag{5}$$

where $P_i$ is the contract price, $C_i$ is the cost of the assigned capacity and $Pen_i$ is the expected penalty, all depending on $SLA_i$ and $N_i$.
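Eqs. (3) to (5) compose as in the following sketch. Representing a contract as a `(price, plan, penalty)` tuple, with `plan[d][k]` the virtual machines assigned in slot k of day d, is an assumption made for the example, as are the numbers.

```python
def capacity_cost(plan, c):
    """Eq. (3): c times the total number of VM-slots in plan[d][k]."""
    return c * sum(sum(day) for day in plan)


def contract_utility(price, plan, penalty, c):
    """Eq. (5): price minus capacity cost minus expected penalty."""
    return price - capacity_cost(plan, c) - penalty


def new_contract_utility(signed_before, signed_after, new_contract, c):
    """Eq. (4): total utility over Q minus total utility over R.

    signed_before: already signed contracts with their current plans (set R).
    signed_after: the same contracts re-evaluated once the new one is admitted.
    new_contract: the contract under negotiation (SLA N+1).
    """
    u_q = sum(contract_utility(*s, c) for s in signed_after + [new_contract])
    u_r = sum(contract_utility(*s, c) for s in signed_before)
    return u_q - u_r


# One signed contract loses a VM in the first slot and gains a penalty of 5
# when the new contract is admitted (made-up values, c = 1 per VM-slot).
before = [(100.0, [[2, 2]], 0.0)]
after = [(100.0, [[1, 2]], 5.0)]
u_new = new_contract_utility(before, after, (80.0, [[1, 1]], 0.0), c=1.0)
```

Here the new contract alone is worth 78, but it lowers the utility of the signed contract from 96 to 92, so its net utility for the provider is 74.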
In order to vary the price in response to changing market supply and demand, we adopt a dynamic price function, whose minimum value is the cost of the resources identified by the allocation plan (capacity cost $C_i$). In particular, $P_i$ is defined as

$$P_i = C_i \bigl(\varphi\, des + (1 - \varphi\, des)\, g\bigr), \qquad \varphi > 1,\; 0 < des \le 1,\; 0 \le g \le 1, \tag{6}$$

where the profit factor $\varphi$, evaluated on the basis of historical market data about accepted/average prices, expresses the possible profit deriving from a new contract, given its cost. (The symbol of the profit factor, and any operator joining it to des, are not legible in the available text; $\varphi$ and plain juxtaposition are used here as placeholders.) The interest level of the provider in signing a new contract, which captures the market model (monopoly vs. competition), is modeled through the parameter des, calculated as the probability for a provider to be chosen by a customer among the available ones. Price is inversely proportional to des: the higher des is, the higher the competitiveness among providers and, as a consequence, the lower the price needed to raise customer interest. Both $\varphi$ and des are time-dependent factors, since information about market conditions varies in time.
Finally, g is a factor, useful during the negotiation process, to vary the price between the minimum value, obtained with g = 1 (the capacity cost, $C_i$), and the maximum allowed one, given the current market conditions, obtained with g = 0 (the value $C_i\, \varphi\, des$).
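The role of g can be sketched independently of how the market term is computed. In the code below, `max_markup` is a hypothetical parameter standing for the market-dependent factor that multiplies the capacity cost at g = 0; how it is derived from the profit factor and des is deliberately left out.

```python
def price(capacity_cost, max_markup, g):
    """Price between capacity_cost * max_markup (g = 0) and capacity_cost (g = 1).

    max_markup is an assumed stand-in for the market-dependent term of the
    price function; g is the concession factor used during negotiation.
    """
    if not 0.0 <= g <= 1.0:
        raise ValueError("g must lie in [0, 1]")
    return capacity_cost * (max_markup + (1.0 - max_markup) * g)


highest = price(100.0, 1.5, 0.0)   # maximum price under current market conditions
halfway = price(100.0, 1.5, 0.5)   # halfway concession
lowest = price(100.0, 1.5, 1.0)    # price equal to the capacity cost
```

Raising g during negotiation moves the offer linearly from the maximum price toward the capacity cost, which is the lowest price that does not produce a loss.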
$Pen_i$ is evaluated on the basis of the predicted application performance captured by the performance function t(n, w), already introduced in Section 3.2, given the assigned capacity and the workload plan declared by the customer.
The acceptable region, indicated with AR, is the region in the two-dimensional space of negotiable parameters $(T_{N+1}, P_{N+1})$ containing the SLA proposals for the new contract whose utility U is acceptable. For the multi- and independent-attribute utility model proposed in [4], the acceptable region is composed of all proposals whose parameter values are within their respective acceptable intervals, defined by static minimum and maximum values. In the proposed utility model, by contrast, negotiation parameters are not independent: in particular, the interval of acceptable prices depends on the cost of the resource allocation plan, which, in turn, depends on the desired maximum response time.
We formalize AR as follows. Denoting by $Int_T = [T_{min}, T_{max}]$ the interval of acceptable response times, called the acceptable performance interval, a proposal $(T_{N+1}, P_{N+1})$ belongs to AR if $T_{N+1}$ is contained in the acceptable performance interval and $P_{N+1}$ belongs to the interval of acceptable prices related to $T_{N+1}$, called the acceptable price interval and denoted by $Int_P(T_{N+1}) = [P_{min}(T_{N+1}), P_{max}(T_{N+1})]$:

$$AR = \{(T_{N+1}, P_{N+1}) : T_{N+1} \in Int_T \text{ and } P_{N+1} \in Int_P(T_{N+1})\}. \tag{7}$$

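Membership in the acceptable region of Eq. (7) can be tested as below. The price interval is passed as a function of the response time, which is the point of the definition; the concrete interval used in the example (cost inversely proportional to T, maximum price 1.5 times the cost) is purely hypothetical.

```python
def in_acceptable_region(T, P, perf_interval, price_interval):
    """Eq. (7): T must lie in Int_T and P in Int_P(T).

    perf_interval: (T_lo, T_hi), the acceptable performance interval.
    price_interval: function mapping T to (P_min(T), P_max(T)).
    """
    T_lo, T_hi = perf_interval
    if not T_lo <= T <= T_hi:
        return False
    p_min, p_max = price_interval(T)
    return p_min <= P <= p_max


def example_price_interval(T):
    """Hypothetical Int_P(T): tighter response times cost more."""
    cost = 100.0 / T
    return cost, 1.5 * cost
```

For instance, `in_acceptable_region(2.0, 60.0, (1.0, 4.0), example_price_interval)` holds because the price interval at T = 2.0 is [50, 75], whereas the same price at T = 1.0 would fall below the interval [100, 150].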

5. A heuristic for utility evaluation based on capacity planning


The proposed heuristic uses the SLA parameters of a negotiation request and dynamic information about capacity availability to decide whether the request can be handled and, if so, to evaluate the acceptable region and the utility value of the proposals in that region. In particular, the hosting of a new service is guided by the principle of optimizing utility (4): given a proposal with certain values for price and response time, a capacity planning problem is solved in order to find the optimal resource allocation plan, that is, the one yielding the best utility value, taking into account the resources available in the various time slots of the contract validity period and the utility gained from the already signed SLAs. Moreover, it is necessary to define the conditions under which such utility
(and the related proposal) is considered acceptable. The problem of
Q
finding the best resource allocation plan NN +1 , related to SLAN +1
is formulated as follows:
$$\max\big(U(T_{N+1}, P_{N+1})\big), \tag{8}$$

constrained by

$$\sum_{i=1}^{N+1} n^Q_{idk} \le M, \quad \forall d, k. \tag{9}$$

$$n^Q_{idk} \le n^{opt}_{idk}(T_i, W_i), \quad \forall i, d, k. \tag{10}$$

$$n^Q_{idk} \ge 0, \quad n^Q_{idk} \le n_{max}, \quad n^Q_{idk} \in \mathbb{N}, \quad \forall i, d, k \tag{11}$$

where $N^{opt}_i(T_i, W_i) = \{n^{opt}_{idk}(T_i, w_{ik})\}$, $i = 1, \ldots, N+1$, is the optimal resource allocation plan of $SLA_i$ given a certain maximum response time $T_i$ and workload plan $W_i$. It is the allocation plan that, at minimum cost, yields a response time not exceeding $T_i$ in each time slot $k$ under the related workload $w_{ik}$:

$$n^{opt}_{idk}(T_i, W_i) = \min\big(\min(n : t(n, w_{ik}) \le T_i),\; n_{max}\big), \tag{12}$$

where $n_{max}$ is the maximum number of virtual machines assignable to a VWP service in a time slot.
The optimization problem (8) finds the resource allocation plans for all SLAs that maximize utility (4), taking into account the effectively available capacity (constraint (9)). Constraint (10) keeps the cost of the new contract at a minimum, which avoids wasting resources and makes it possible to offer competitive prices. It states that, for each contract, the number of virtual machines assigned in each time slot must be less than or equal to the optimal one.
We consider two resource allocation policies, the conservative one and the progressive one. With the former, the allocation plan for the new service is spread over the effectively available resources and does not affect the allocation plans of the already signed services. With the progressive allocation policy, the hosting of a new service may change the allocation plans of the already stipulated contracts, potentially causing a variation in their cost, penalty and utility. In the latter case, in addition to constraints (9)-(11), other constraints can be formulated to limit re-allocation actions that could cause an uncontrolled reduction of the utility deriving from each contract and of the related customer satisfaction level. Examples of such constraints are limits on the maximum number of virtual machines that can be taken from the already signed SLAs and on the maximum performance degradation.
The utility optimization problem (8) can result in negative or positive values of utility, depending on whether the proposal leads to a loss of profit for the provider or to an effective gain, respectively. The overall business policy is responsible for dynamically guiding the decision on whether a proposal is satisfactory, with a certain competitiveness level. In particular, we adopt the following parametric conditions under which a proposal $(T_{N+1}, P_{N+1})$ is defined acceptable:

• Response time acceptability condition: a response time $T_{N+1}$ is acceptable if the utility of the proposal $(T_{N+1}, P_{max}(T_{N+1}))$ is equal to or greater than a percentage $PU_{max}$ of the utility $U^{opt}_{max}$ that can be gained with $N^{opt}_{N+1}(T_{N+1}, W_{N+1})$ and the related maximum price:

$$U(T_{N+1}, P_{max}(T_{N+1})) \ge PU_{max} \cdot U^{opt}_{max}. \tag{13}$$

A high value of $PU_{max}$ reduces the risk tied to penalty payment when the optimal allocation plan cannot be assigned at all. This condition influences the acceptable performance interval and, in particular, $T_{min}$.

• Price acceptability condition: a price $P_{N+1}$ is acceptable, in relation to $T_{N+1}$, if the corresponding utility lies between the minimum and the maximum one. In particular, the maximum utility, $U^Q_{max}(T_{N+1})$, is the one corresponding to the maximum allowed price ($g = 0$ in (6)), and the minimum utility, $U^Q_{min}(T_{N+1})$, is a percentage, $PU_{min}$, of $U^Q_{max}(T_{N+1})$:

$$PU_{min} \cdot U^Q_{max}(T_{N+1}) \le U(T_{N+1}, P_{N+1}) \le U^Q_{max}(T_{N+1}). \tag{14}$$

• Service availability condition: the percentage of $n^Q_{N+1,dk}$ whose related response time exceeds $T_{N+1}$ by more than $t_{max}$ must be less than $MinAvail$. Considering the whole contract validity duration $D$, it can be expressed as follows:

$$\big|\{n^Q_{N+1,dk} : (t(n^Q_{N+1,dk}, w_{N+1,k}) - T_{N+1}) > t_{max}\}\big| < MinAvail \cdot D \cdot K. \tag{15}$$
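As an illustration of the optimal allocation plan defined in (12), the following Python sketch searches, slot by slot, for the smallest number of virtual machines that meets a target response time and caps it at the maximum assignable number. The function names and the toy performance function `toy_perf` are hypothetical and serve only to make the example runnable; any performance function t(n, w) that decreases with n could be passed instead.

```python
from typing import Callable, List

PerfFunction = Callable[[int, float], float]


def optimal_vms(t_target: float, w: float, perf: PerfFunction,
                n_max: int) -> int:
    """Eq. (12): smallest n with perf(n, w) <= t_target, capped at n_max."""
    for n in range(1, n_max + 1):
        if perf(n, w) <= t_target:
            return n
    # The target cannot be met within the cap: assign the maximum number.
    return n_max


def optimal_plan(t_target: float, workload: List[float],
                 perf: PerfFunction, n_max: int) -> List[int]:
    """Optimal number of virtual machines for each time slot of a workload plan."""
    return [optimal_vms(t_target, w, perf, n_max) for w in workload]


def toy_perf(n: int, w: float) -> float:
    """Hypothetical performance function: response time proportional to per-VM load."""
    return w / n
```

With the toy function, a workload of 10 requests per second and a target of 2 s require 5 virtual machines, while an unreachable target simply returns the cap `n_max`.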

5.1. A heuristic
In order to evaluate the utility in a computationally feasible manner, we propose a heuristic that finds an approximation of the utility function and of the related acceptable region by solving (8) for a limited number of cases and exploiting an interpolation technique. The algorithm consists of the following main steps:

• Step 1: evaluation of $T_{max}$, $Int_P(T_{max})$, and the utility at $T_{max}$ and the boundaries of $Int_P(T_{max})$;
• Step 2: evaluation of $T_{min}$, $Int_P(T_{min})$, and the utility at $T_{min}$ and the boundaries of $Int_P(T_{min})$;
• Step 3: evaluation of a set of $Z$ response times within $Int_T$, called $T_z$, of each $Int_P(T_z)$, and of the utility at each $T_z$ and the boundaries of the related $Int_P(T_z)$.
Step 1
$T_{max}$ is the maximum response time provisioned in the $K$ time slots adopting the minimum number $n_{min}$ of assignable virtual machines:

$$T_{max} = \max_{k = 1, \ldots, K} t(n_{min}, w_{N+1,k}). \tag{16}$$

Since we consider interactive-intensive applications, we limit the value of $T_{max}$ to an upper bound, called TBOUND, beyond which we consider the response time not acceptable for the customer. $Int_P(T_{max})$ is evaluated through an iterative approach. The first step is to solve (8) adopting the conservative allocation policy. In this case, (4) becomes:

$$U(T_{N+1}, P_{N+1}) = P_{N+1} - C_{N+1} - Pen_{N+1}. \tag{17}$$
In order to maximize (17), we find the allocation plan $N^Q_{N+1}$ adopting a best-effort capacity planning approach that takes into account constraints (9)-(11):

$$N^Q_{N+1} = \{n^Q_{N+1,dk}\} : \quad \max n^Q_{N+1,dk} \ \text{ with } \ \sum_{i=1}^{N+1} n^Q_{idk} \le M, \quad n^Q_{N+1,dk} \le n^{opt}_{N+1,dk}(T_{max}, W_{N+1}). \tag{18}$$

Then, if the service availability condition for $N^Q_{N+1}$ is satisfied, $C_{N+1}$ is evaluated adopting (3). Further, the response time acceptability condition (13) is checked by evaluating $P_{max}(T_{max})$ and the related penalty, adopting respectively (6) with $g = 0$ and (1):
Pmax (Tmax ) = CN +1 des.

(19)

If this condition is satisfied, the price acceptability condition (14) is adopted to evaluate $P_{min}(T_{max})$. Under the conservative resource allocation policy, it can be evaluated directly starting from $P_{max}(T_{max})$, since different prices do not influence the best assignable allocation plan for problem (8).
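The conservative best-effort allocation used in this step can be sketched as follows. The sketch assumes that the per-slot capacity and the number of virtual machines already assigned to the signed SLAs are known as plain lists; the function and parameter names are illustrative only.

```python
from typing import List


def conservative_allocation(n_opt: List[int], allocated: List[int],
                            capacity: int) -> List[int]:
    """Best-effort plan for the new SLA under the conservative policy.

    n_opt     -- optimal number of VMs for the new SLA in each time slot
    allocated -- VMs already assigned to the signed SLAs in each time slot
    capacity  -- total number of VMs available in each time slot (M)
    """
    plan = []
    for opt, used in zip(n_opt, allocated):
        free = max(0, capacity - used)
        # Never exceed the optimal number and never touch signed SLAs.
        plan.append(min(opt, free))
    return plan
```

For example, with a capacity of 20 virtual machines per slot, optimal demands `[5, 8, 3]` and existing allocations `[10, 15, 20]`, the new service receives `[5, 5, 0]`: the second slot is only partially served and the third not at all, which is the situation in which the availability and acceptability conditions have to be checked.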
If the service availability and the response time acceptability conditions are not satisfied, the progressive allocation policy is taken into account. The basic idea is the following: for each time slot in which the number of allocated virtual machines, $n^Q_{N+1,dk}$, is less than both $n_{max}$ and the optimal number $n^{opt}_{N+1,dk}$, we find a re-allocation plan, involving the already signed SLAs, that leads to the greatest utility increase. The process stops when the acceptability conditions are satisfied. If the re-allocation actions do not allow the acceptability conditions to be satisfied, the new SLA does not lead to an acceptable utility for any value of price and response time. In this case the acceptable region cannot be defined and the current negotiation request is refused.

Fig. 2. Normalized workload plans for various application classes: (a) Office, (b) Business, (c) Private.
Step 2
$T_{min}$ is evaluated adopting an iterative approach that aims to find the minimum response time satisfying the acceptability conditions. Initially, $T_{min}$ is defined as the minimum response time obtained over all time slots:

$$T_{min} = \min_{d, k} t(n^Q_{N+1,dk}, w_{N+1,k}). \tag{20}$$

Assuming the conservative allocation policy and a monotonically increasing trend of the application performance function, (20) can be expressed as follows:

$$T_{min} = \min_{k} t(n_{N+1,k}, w_{N+1,k}), \quad \text{with } n_{N+1,k} = \min\Big(M - \sum_{i=1}^{N} n^Q_{idk},\; n_{max}\Big) \quad \forall k. \tag{21}$$

If the acceptability conditions are not satisfied, the progressive allocation policy is exploited. If, also in this case, the acceptability conditions are not satisfied, greater values of $T_{min}$ are attempted. In particular, each new attempt value is obtained by adding a small positive increment to the previous one, until the acceptability conditions are satisfied or $T_{max}$ is reached. If a $T_{min}$ that stays below $T_{max}$ by a positive margin is found, the heuristic proceeds to the next step. Otherwise, it stops and the negotiation request is refused.
Step 3
Because of the non-linearity of the model, the utility function is evaluated for a set $T_Z$ of response times internal to $Int_T$. A simple technique to define $T_Z$ is based on the partition of the interval $Int_T$ into equal-length parts (but other solutions could be adopted):

$$T_z = T_{min} + z \, \frac{T_{max} - T_{min}}{Z + 1}, \quad z = 1, \ldots, Z. \tag{22}$$

For each $T_z$, $Int_P(T_z)$ and the utilities at the boundaries of this interval are evaluated adopting the acceptability conditions in a manner similar to Step 1.
Given a generic proposal $(T_{N+1}, P_{N+1})$, it is considered within the acceptable region under the following conditions:

$$T_{min} \le T_{N+1} \le T_{max},$$
$$P_{min}(T_{N+1}) \le P_{N+1} \le P_{max}(T_{N+1}).$$

$P_{min}(T_{N+1})$ and $P_{max}(T_{N+1})$ are evaluated adopting an interpolation technique that takes into account the distance between the response time $T_{N+1}$ and the nearest response times in the set $\{T_z\}$.
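A possible reading of Step 3 and of the final interpolation is sketched below in Python. The grid follows the equal-length partition of $Int_T$; the piecewise-linear interpolation is an assumption made for the example, since the text does not fix the interpolation technique, and all names are illustrative.

```python
from bisect import bisect_right
from typing import List


def response_time_grid(t_min: float, t_max: float, z_count: int) -> List[float]:
    """Z interior points splitting [t_min, t_max] into Z + 1 equal parts."""
    step = (t_max - t_min) / (z_count + 1)
    return [t_min + z * step for z in range(1, z_count + 1)]


def interpolate(t: float, knots: List[float], values: List[float]) -> float:
    """Piecewise-linear interpolation of values sampled at sorted knots.

    knots must contain at least two increasing response times, typically
    T_min, the grid points T_z and T_max; values holds, e.g., P_min or P_max
    evaluated exactly at each knot. Linear interpolation is an assumption.
    """
    if not knots[0] <= t <= knots[-1]:
        raise ValueError("response time outside the acceptable interval")
    # Index of the right end of the segment that contains t.
    i = min(bisect_right(knots, t), len(knots) - 1)
    t0, t1 = knots[i - 1], knots[i]
    v0, v1 = values[i - 1], values[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
```

The same routine can be applied twice, once to the exact lower price bounds and once to the exact upper price bounds, to obtain the acceptable price interval of an arbitrary response time in $Int_T$.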
6. Experimental results and discussion
In this section, we analyze the non-linear behavior of the proposed non-additive utility model with respect to price and response time, its dependency on the workload plan and available capacity, and the accuracy of the heuristic. Moreover, we discuss the benefit that the dynamic evaluation approach introduces with respect to the static one in terms of satisfaction level for both providers and customers. To this end, we compare the dynamic approach with two static ones: (1) the additive multi-parameter (AMP) utility model, statically defined, typically adopted by negotiation strategies driven by time-based decision functions [4]; (2) the proposed utility model, statically evaluated before processing any negotiation request.
6.1. Experimental setup
The experimental results are related to the conservative resource allocation policy and the following parameters:

• the contract validity period $D$ (the same for all the contracts) has a duration of 180 days;
• static competitive (balanced demand/supply) market conditions, characterized by a provider interest level $des = 0.5$; a price-evaluation parameter equal to 4 and $c = 4.17 \times 10^{-3}$ € for cost evaluation;
• regarding the service availability condition, $MinAvail = 100\%$ and $t_{max} = T$: the service has to be always available and a proposal is not acceptable for the provider if there is a time slot in which the number of assignable resources leads to a response time that exceeds $2T$. In this case, the service availability condition can be expressed as follows:

$$n^Q_{idk} : t(n^Q_{idk}) \le 2T, \quad \forall i, d, k. \tag{23}$$

• TBOUND = 2 s;
• $PU_{max} = 30\%$ and $PU_{min} = 10\%$;
• an increment of 5 ms and a margin of 5 ms for the definition of $T_{min}$;
• $n_{max} = 2000$, $n_{min} = 1$;
• the number of virtual machines initially available in each daily time slot is $M = 20\,000$;
• $Z = 28$.
We define the daily workload pattern for three application classes (see Section 3.1): (1) office: support of office work and tertiary industry; (2) business: support of the production cycle of manufacturing industry; and (3) private: services for e-commerce, online banking, games, news and entertainment portals, services for connecting people, social networks, etc. Fig. 2 shows the normalized workload plans $W = \{w_1, \ldots, w_{12}\}$ for the three application classes, considering the partition of a day into 12 time slots of two hours each. The patterns are based on the analysis of daily traces of real applications reported in [21] and their discretization with respect to time slots. The absolute workload plans adopted in our experimentation are obtained by scaling the normalized ones so as to fix the peak workload value, called $V$.
The application performance function of each application class is based on a queuing performance model, frequently adopted to abstract a multi-tier application hosted in a data center [22,23].

Fig. 3. Acceptable region and maximum utility varying workload plan: (a) acceptable region, (b) maximum utility.

Fig. 4. Acceptable region and maximum utility varying available capacity: (a) acceptable region, (b) maximum utility.

We use an M/M/n/PS queue to capture the mean response time reachable adopting $n$ homogeneous virtual machines at the application server tier, which receive incoming requests from a unique queue managed by the HTTP load balancer. In particular, denoting by $t_s$ the mean service time of the BWC component and by $w$ the mean request rate, with both service time and request rate described by a Poisson distribution, the application performance function is given by:

$$t(n, w) = \frac{1}{\dfrac{1}{t_s} - \dfrac{w}{n}}. \tag{24}$$

The minimum number of virtual machines required to reach a given response time is obtained from (24), approximating the result to the nearest integer. For queue stability, the incoming workload $w$ has to satisfy the following condition:

$$w < \frac{n}{t_s}. \tag{25}$$

If this condition is not satisfied, we assume the service is unavailable. In the following, $t_s = 0.049$ s for each negotiation request.
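The performance function (24) and the stability condition (25) can be illustrated with a short Python sketch. The function names are illustrative; moreover, the sketch rounds the required number of virtual machines up with a ceiling so that the target response time is always met, which is a deliberate simplification with respect to the nearest-integer approximation mentioned above.

```python
import math


def response_time(n: int, w: float, ts: float) -> float:
    """Eq. (24): mean response time with n VMs, request rate w, service time ts."""
    if n <= 0 or w >= n / ts:
        # Eq. (25) violated: the queue is unstable, the service is unavailable.
        raise ValueError("unstable queue: the workload must satisfy w < n / ts")
    return 1.0 / (1.0 / ts - w / n)


def min_vms(t_target: float, w: float, ts: float) -> int:
    """Smallest n such that response_time(n, w, ts) <= t_target.

    Obtained by inverting (24): n >= w / (1/ts - 1/t_target).
    The ceiling is an assumption of this sketch (the text rounds to nearest).
    """
    if t_target <= ts:
        raise ValueError("the target must exceed the mean service time ts")
    return math.ceil(w / (1.0 / ts - 1.0 / t_target))
```

With $t_s = 0.049$ s, `response_time(2000, 25000, 0.049)` is roughly 0.126 s, and `min_vms(0.07, 35000, 0.049)` shows how quickly the resource demand grows when the target approaches $t_s$.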
6.2. Utility model analysis
Fig. 3 shows the results of the utility model evaluation considering various workload plans within the related acceptable performance interval. In particular, Fig. 3(a) shows the two-dimensional acceptable region, while Fig. 3(b) shows the maximum utility for the three workload patterns and for two peak values, 15 000 and 35 000 r/s. In Fig. 3(a) we note that for proposals characterized by workloads with $V = 15\,000$ r/s it is possible to reach lower response times than for workloads with $V = 35\,000$ r/s. Moreover, the maximum acceptable prices decrease as response times increase, with a trend that is mainly influenced by the performance model adopted in the experimentation. In particular, for response times close to the best reachable response time ($t_s$), the optimal allocation plans are highly expensive (high resource demand, especially in time slots with workload near the peak) and, consequently, prices reach high values. On the contrary, for high response times (close to TBOUND) the maximum acceptable prices slowly decrease towards a value that is influenced by the allocation plan made of the minimum number of resources required in each time slot to satisfy the queue stability condition.
As can be noted in Fig. 3(b), for workloads with high peak values, the maximum utility that can potentially be reached is higher than the one reachable with lower peak values. This is tied to the need to assign more expensive allocation plans, which leads to higher costs and consequently higher prices and potential gains. Moreover, the maximum utility trend is increasing in the first part, reaches a maximum and then decreases. This is due to the impact of costs and penalty (performance degradation), which are larger for lower response times and consequently limit the profits. Fig. 4 shows the influence of available capacity on the utility model. In particular, Fig. 4(a) shows the acceptable region and Fig. 4(b) shows the maximum utility considering the office workload pattern with $V = 35\,000$ r/s and a number of available virtual machines in each time slot varying from 1716 to 2000, which are respectively the minimum number of virtual machines required to accept the negotiation request and the maximum number assignable in each time slot. Increasing the number of available virtual machines in each time slot leads to lower response times within the acceptable performance intervals and to a higher maximum utility value, because of more expensive allocation plans and, consequently, higher prices and lower penalty (lower performance degradation).
The proposed utility model allows evaluating the provisioned utility for a certain price $P$ and response time $T$ within the acceptable region, adopting an interpolation technique on a finite number ($Z + 2$) of exact utility evaluations. The accuracy error introduced by this approach is mainly influenced by the non-linearity of the utility function and by the number $Z$ of exact evaluations. In order to evaluate the accuracy error, we compare the provisioned utility $U(T, P)$ with the exact one. The exact utility $U_{exact}(T, P)$ is defined as the utility provisioned adopting the actual best allocation plan assignable to a new contract. It is obtained by solving optimization problem (8) taking into account the conservative allocation policy, SLA parameters, current capacity availability and already signed SLAs. Given $U_{exact}$, we define the relative error, $E$, as:

$$E = \frac{|U_{exact} - U|}{U_{exact}} \cdot 100. \tag{26}$$

Fig. 5. Percentiles of relative error in logarithmic scale varying number of evaluations $Z$.

Fig. 5 shows the percentiles, from the 1st to the 100th, of the relative error varying $Z$ from 0 to 50. The samples are obtained considering 200 values of response time uniformly distributed within the acceptable performance interval and, for each response time, 200 price values uniformly distributed within the respective acceptable price interval. The reported results refer to the office workload pattern with $V = 35\,000$ r/s. We observe that with $Z = 30$, 80% of the samples report a relative error lower than 0.6%, while 90% of the samples report a relative error of about 2%. As $Z$ increases further, the relative error remains significant for a progressively smaller number of samples. We can state that such results are satisfactory for supporting negotiation strategies, because the utility provision of an SLA for a new contract can tolerate a limited degree of inaccuracy for the benefit of a feasible computational complexity.
6.3. Dynamic versus static approaches
In this section, we evaluate the effectiveness of the proposed dynamic approach in increasing customer satisfaction level and provider profit and reputation with respect to static approaches. To this aim, we define two parameters: the price-based indicator, $U_C(P)$, as the customer satisfaction level with respect to the price, and the performance-based indicator, $U_C(T)$, as the customer satisfaction level with respect to the agreed maximum response time $T$ and the actual response time during the contract validity period. Denoting by $U_{PC}(P)$ the provider profit perceived by the customer who is aware of the cost of the optimal allocation plan ($Cost_{opt}$), given by:

$$U_{PC}(P) = P - Cost_{opt}, \tag{27}$$

the customer satisfaction level is maximum when $P$ is equal to $Cost_{opt}$, while it is minimum when the maximum price, $P_{max}$, is applied.
$U_C(P)$ is defined as the normalized form of (27) as follows:

$$U_C(P) = \frac{U_{PC}(P) - U_{PC}(P)_{min}}{U_{PC}(P)_{max} - U_{PC}(P)_{min}}. \tag{28}$$

(28) yields values within the interval [0, 1] for prices within the interval $[Cost_{opt}, P_{max}]$. For prices less than $Cost_{opt}$, $U_C(P)$ is greater than 1; for prices greater than $P_{max}$ it becomes negative. As a consequence, $U_C(P)$ is adopted as an indicator that a proposal is in the customer acceptable region and represents a potential negotiation point if its value is within the interval [0, 1]. Since in our experimentation scenario $P_{max} = 2\,Cost_{opt}$, (28) becomes:

$$U_C(P) = 2 - \frac{P}{Cost_{opt}}. \tag{29}$$
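The price-based indicator can be checked numerically with the following Python sketch. It assumes, consistently with the behavior described for (28), that the indicator equals 1 at $P = Cost_{opt}$ and 0 at $P = P_{max}$; the function name is illustrative.

```python
def price_indicator(price: float, cost_opt: float, p_max: float) -> float:
    """Normalized price-based indicator U_C(P).

    Equals 1 when the price matches the optimal cost, 0 at the maximum
    price, exceeds 1 below the cost and turns negative above p_max.
    """
    if p_max <= cost_opt:
        raise ValueError("p_max must be greater than cost_opt")
    return (p_max - price) / (p_max - cost_opt)
```

When `p_max` is twice `cost_opt`, the expression reduces to `2 - price / cost_opt`, which is the form used in the experimentation.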

Denoting by $T_{dk}$, $d = 1, \ldots, D$, $k = 1, \ldots, K$, the actual response time in the $k$-th time slot of the $d$-th day with the assigned allocation plan, we introduce the following parameter $U_{PC}(T)$, useful to represent the performance degradation perceived by the customer:

$$U_{PC}(T) = \sum_{d=1}^{D} \sum_{k=1}^{K} (T - \Delta T_{dk}), \qquad \Delta T_{dk} = \begin{cases} T_{dk} - T, & \text{if } (T_{dk} - T) > 0 \\ 0, & \text{if } (T_{dk} - T) \le 0 \end{cases} \quad \forall d, k. \tag{30}$$

When $\Delta T_{dk} = 0$, $U_{PC}(T)$ has the best value; on the contrary, it is at its minimum level, but still acceptable, when a maximum degradation level is reached. Defining such level proportionally to $T$ by means of the factor $deg$, $U_C(T)$, expressed as the normalized form of $U_{PC}(T)$, assumes values within the interval [0, 1] between the two above-mentioned boundary cases. In particular, it is given by:

$$U_C(T) = \frac{U_{PC}(T) - U_{PC}(T)_{min}}{U_{PC}(T)_{max} - U_{PC}(T)_{min}} = 1 - \frac{\sum_{d=1}^{D} \sum_{k=1}^{K} \Delta T_{dk}}{deg \cdot T}. \tag{31}$$

For performance equal to or better than the agreed one, (31) is 1; for a degradation level greater than the maximum allowed, it becomes negative and indicates a certain level of provider reputation reduction.
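A minimal Python sketch of the performance-based indicator follows. It assumes that the actual response times are available as a list of days, each holding the response times of its time slots; the function and parameter names are illustrative.

```python
from typing import List


def performance_indicator(actual: List[List[float]], t_agreed: float,
                          deg: float) -> float:
    """Normalized performance-based indicator U_C(T) of (31).

    actual   -- actual response times, one inner list of K slots per day
    t_agreed -- agreed maximum response time T
    deg      -- degradation factor (e.g. 0.1 * K * D for a 10% average)
    """
    # Only response times above the agreed level count as degradation.
    total_excess = sum(max(0.0, t - t_agreed) for day in actual for t in day)
    return 1.0 - total_excess / (deg * t_agreed)
```

With two days of two slots each, an agreed response time of 1 s and `deg = 0.1 * 2 * 2`, a single slot at 1.2 s consumes half of the tolerated degradation and the indicator drops to 0.5.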
6.3.1. Non-additive versus additive utility model
The additive utility function, called $U_{AMP}(T, P)$, adopted for comparison with our approach, is given by a linear combination of linear utility functions defined for each negotiation parameter and normalized within the corresponding static acceptable interval of values [4]. In particular, it is given by:

$$U_{AMP}(T, P) = w_T V_T(T) + w_P V_P(P), \qquad V_T(T) = \frac{T - T_{min}}{T_{max} - T_{min}}, \qquad V_P(P) = \frac{P - P_{min}}{P_{max} - P_{min}}, \tag{32}$$
$$w_T + w_P = 1, \quad w_T > 0, \quad w_P > 0,$$

where $V_P(P)$ and $V_T(T)$ are, respectively, the linear utility functions with respect to price $P$ and maximum response time $T$, normalized in the related intervals of acceptable values $[P_{min}, P_{max}]$ and $[T_{min}, T_{max}]$. For such a model, the acceptable region is the rectangle in the two-dimensional space $(T, P)$ whose projections on the axes are given by the intervals of acceptable values of $T$ and $P$. In this region, the utility varies linearly between the value 0 (for the point $(T_{min}, P_{min})$) and the value 1, gained with the most profitable value of each negotiable parameter, corresponding to the point $(T_{max}, P_{max})$.

Fig. 6. Acceptable region (AR) evaluated for the static utility model (considering the office workload pattern and peak value $V = 25\,000$ r/s), and for the Inner AMP and Outer AMP models, two additive multi-parameter utility models evaluated considering as acceptable region, respectively, the rectangle inscribed in and circumscribed about AR.

Table 2
Parameters of the Inner AMP and Outer AMP models.

            Inner AMP   Outer AMP
T_min [s]   0.27        0.07
T_max [s]   2.0         2.0
P_min [€]   6 780       5 672
P_max [€]   10 313      24 504
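The additive model can be sketched in Python as follows, assuming normalized linear utilities that are 0 at the minimum and 1 at the maximum of each interval, as stated above. The function name and the default weights are illustrative; the numeric values in the usage note are the Outer AMP parameters.

```python
def amp_utility(t: float, p: float, t_min: float, t_max: float,
                p_min: float, p_max: float,
                w_t: float = 0.5, w_p: float = 0.5) -> float:
    """Additive multi-parameter utility: weighted sum of linear utilities."""
    v_t = (t - t_min) / (t_max - t_min)  # 0 at t_min, 1 at t_max
    v_p = (p - p_min) / (p_max - p_min)  # 0 at p_min, 1 at p_max
    return w_t * v_t + w_p * v_p
```

Evaluated at the four vertices of the Outer AMP rectangle (0.07 s to 2.0 s, 5672 € to 24 504 €) with equal weights, the function returns 0, 0.5, 0.5 and 1, regardless of whether the corresponding proposals are actually feasible for the provider.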
The comparison was conducted on the static evaluation of the proposed model for a well-defined negotiation request. In particular, this model, called the static utility model, is evaluated using the parameters defined in Section 6.1, considering a capacity availability of $M = 20\,000$ and a negotiation request characterized by the office workload pattern and $V = 25\,000$ r/s.
In Fig. 6 we plot the acceptable region AR of the static utility model and of two AMP models characterized by different acceptable regions correlated with the acceptable region (AR) of the static utility model. For the first, called the Inner AMP model, the acceptable region corresponds to the rectangle roughly inscribed in AR. For the second, called the Outer AMP model, the acceptable region corresponds to the rectangle that roughly circumscribes AR. Table 2 summarizes the parameters that characterize the acceptable region and utility function (32) of the two AMP models, for which we consider the weights $w_P$ and $w_T$ both equal to 0.5.
While with the additive approach utility has an increasing linear trend varying response time and price, with the non-additive approach utility has a non-linear trend, as can be noted in Fig. 7, in which we compare the utility evaluated with the static model and with the Inner AMP and Outer AMP models, considering two fixed values of price (24 504 € and 5672 €) and varying response time.

Fig. 7. Utility evaluation for the proposed non-additive model and the Inner AMP and Outer AMP models fixing price and varying response time.
A comparison of utility estimations for both approaches, considering the vertices of the acceptable regions of the Inner AMP and Outer AMP models, is reported in Table 3. In particular, the table reports: the utility $U$ provisioned by the static utility model, normalized within the acceptable region AR, and the utility $U_{AMP}$ provisioned by the AMP model for the vertices under study; whether a point $(T, P)$, corresponding to an SLA proposal, is feasible for negotiation adopting the static utility model. This condition, called Potential Negotiation Point Feasibility (FEANP), is verified if the SLA proposal is included in the acceptable region AR. When an SLA proposal is external to AR, it is not included in the negotiation process for reaching an agreement with the customer. Moreover, the table reports $U_{exact}$, that is the effective utility evaluated adopting the best allocation plan resulting from the optimization problem (8), the provisioned penalty and the value of the parameter $U_C(P)$.
Analyzing the results, we can state that the Inner AMP model produces utility estimates more similar to those of the proposed approach than the Outer AMP model. However, the Inner AMP model corresponds to a low-risk approach that rejects all points characterized by high performance, with the disadvantage of precluding the high profits that can be gained from highly demanding customers, who are ready to pay much for high-performance services. The Outer AMP model, in turn, corresponds on the one hand to a high-risk approach, since high performance is offered for low prices, and on the other hand to an out-of-market behavior, because of the excessively high prices required for low performance. The main estimation errors of the static AMP approaches arise at the points (0.07 s, 5672 €) and (2 s, 24 504 €), which are instead not feasible for the proposed approach because of, respectively, a too low price (negative utility and $U_{exact}$) and a too high price (utility greater than 1 and negative $U_C(P)$). Finally, we can conclude that, with respect to the AMP utility model, the proposed approach has the advantage of reducing the acceptable region to the proposals with feasible performance and competitive prices, so it can be effectively adopted by an integrative negotiation strategy to quickly reach an agreement that is as good as possible for both customer and provider.
Table 3
Non-additive versus AMP utility model.

Model       T [s]   P [€]     U       U_AMP   FEANP / !FEANP   U_exact [€]   Pen [€]   U_C(P)
Outer AMP   0.07    5 672     -1.19   0       !FEANP           -8 965        2 376     1.54
Outer AMP   2.0     5 672     0.04    0.5     FEANP            516           0         0.90
Outer AMP   0.07    24 504    0.23    0.5     FEANP            1 968         10 263    0.00
Outer AMP   2.0     24 504    2.48    1       !FEANP           19 348        0         -2.75
Inner AMP   0.27    6 780     0.05    0       FEANP            616           0         0.90
Inner AMP   2.0     6 780     0.18    0.5     FEANP            1 623         0         0.69
Inner AMP   0.27    10 312    0.51    0.5     FEANP            4 150         0         0.33
Inner AMP   2.0     10 312    0.64    1       FEANP            5 156         0         0.00

6.3.2. Proposed dynamic versus static approaches
We conducted a comparative analysis between the static and dynamic evaluation of the proposed utility model to show its capability in leading towards negotiation agreements profitable for both provider and customer. In particular, for this experimentation we consider the static utility model, as defined in Section 6.3.1, which remains the same during a temporal sequence of negotiation processes, each launched to respond to a customer request for a VWP service negotiation. The dynamic utility model, on the contrary, corresponds to the proposed model evaluated every time a negotiation request is received, taking into account the effective customer requirements and the current capacity availability. We consider a sequence of negotiation requests (called NRs), each characterized by the same contract validity period, starting day and application performance features, but differentiated on the basis of distinct workload plans (different application classes and peak values). Since our experimentation does not focus on the evaluation of a specific negotiation strategy, we simulate the negotiation process of an NR in the following manner: the negotiation strategy leads to a final result, called potential negotiation point (PNP), corresponding to a point $(T, P)$ randomly chosen in the acceptable region, and the utility model guides the decision on whether the NR and the related PNP could lead to the actual signing of a new contract.
In particular, given a certain NR and the related PNP, the utility model guides the following decisions:

• Negotiation Request Acceptance (NRACC): an NR is accepted and leads to a negotiation process when it is possible to define the acceptable region. Typically, an NR is refused when the provider's available capacity makes it not convenient to negotiate a new contract with certain application performance and workload plan features;
• Potential Negotiation Point Feasibility (FEANP): as already defined in the previous section, a PNP is feasible if it is included in the current acceptable region. If the NR is accepted and the related PNP is feasible, the PNP can be proposed by the provider to the customer for contract stipulation, but it becomes an effective agreement only if both the following conditions are met:
  • Positive Price-based Indicator (POSUCP): $U_C(P) \ge 0$, considering the actual capacity availability;
  • Satisfied Service Availability (SSA): in each time slot the number of available resources is enough to satisfy the service availability condition. If this condition is not satisfied, the PNP is accepted for contract stipulation and the provider realizes that it is not able to satisfy the QoS terms only later, when the resource management system tries to allocate resources to host the new service.
  If one of these two conditions is not satisfied, we suppose that the contract is not stipulated at all and that the provider reputation is negatively affected, since the customer is not satisfied with the provider behavior: in case of violation of the POSUCP condition, the provider is proposing too high prices, while in case of violation of the SSA condition, the provider initially proposes a PNP and then refuses to agree on it.
  Finally, if a contract is stipulated, the customer is satisfied with respect to QoS guarantee terms if the following condition is verified:
  • Positive Time-based Indicator (POSUCT): $U_C(T) \ge 0$, considering the actual capacity availability and $deg = 0.1\,KD$, which corresponds to an average tolerable performance degradation of 10% with respect to the agreed level.

We conducted a comparative analysis mainly evaluating the total provider utility (supposed to be zero at the beginning), estimated by the static and the dynamic utility model at the end of a negotiation sequence, and comparing it with $U_{exact}$. Moreover, we evaluated the negative impact on the provider reputation in case the utility model is not able to prevent situations leading to customer unsatisfaction. In particular, we evaluate the percentage of NRs, called customer unsatisfaction percentage (CUP), for which the customer is not satisfied, corresponding to the cases in which the conditions NRACC and FEANP are satisfied and at least one of the conditions POSUCP, SSA or POSUCT is not satisfied. A high value of the customer unsatisfaction percentage means that there are many cases in which the customer is unsatisfied and so indicates a high negative impact of the adopted utility model on the provider reputation. If this percentage is very low, most of the negotiation requests and related negotiation points considered acceptable and feasible lead to customer satisfaction. This, as a consequence, indicates a very low negative impact of the utility model on the provider reputation and an increase in customer loyalty (the provider earns a higher probability of being selected for future contract stipulations).
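The definition of CUP translates directly into a small Python sketch. The `Outcome` record and the function names are hypothetical and only encode the five conditions introduced above as boolean flags, one record per negotiation request.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Outcome:
    """Flags recorded for one negotiation request (hypothetical record)."""
    nracc: bool   # negotiation request accepted
    feanp: bool   # potential negotiation point feasible
    posucp: bool  # price-based indicator non-negative
    ssa: bool     # service availability satisfied
    posuct: bool  # time-based indicator non-negative


def customer_unsatisfied(o: Outcome) -> bool:
    """Accepted and feasible, yet at least one of the three conditions fails."""
    return o.nracc and o.feanp and not (o.posucp and o.ssa and o.posuct)


def cup(outcomes: List[Outcome]) -> float:
    """Customer unsatisfaction percentage over a sequence of requests."""
    if not outcomes:
        return 0.0
    unsatisfied = sum(customer_unsatisfied(o) for o in outcomes)
    return 100.0 * unsatisfied / len(outcomes)
```

Requests that are refused up front (NRACC false) do not count as unsatisfied, which is why a model that refuses requests it cannot serve keeps the percentage low.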
Experimental results refer to a sequence of 50 NRs, whose associated PNPs
are spread over the acceptable region, AR, of the static utility model.
Different experiments were conducted using different techniques to choose,
randomly, the parameters T and P of each PNP. In this paper we report the
results obtained by drawing parameter T from a uniform distribution within
the acceptable performance interval IntT. Such a sequence makes it possible
to simulate the stipulation of contracts with various QoS levels that, in
order to be satisfied, require different capacity allocations. Each response
time in the sequence of PNPs is associated with a price P drawn from a
normal distribution within the acceptable price interval related to that
response time. The normal distribution was chosen for prices since it more
realistically models potential negotiation agreements between a provider and
a customer around medium prices. Table 4 reports comparison results for four
sequences of NRs, characterized by the following workload plans:
1. The workload plan of each NR is the same as the one adopted for the static
utility model (office workload pattern, V = 25 000 r/s);
2. The workload patterns of the various workload plans derive from
a discrete uniform distribution of application classes (office,
business and private) and a uniform distribution of V within the
interval [10 000 r/s, 40 000 r/s] (with avg V = 25 000 r/s);
3. Like (2), but considering the interval [10 000 r/s, 30 000 r/s]
(with avg V = 20 000 r/s);
4. Like (2), but considering the interval [20 000 r/s, 40 000 r/s]
(with avg V = 30 000 r/s).
In each scenario, the initial number of available virtual machines in each time slot is M = 20 000.
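The sampling procedure described above can be sketched as follows. This is only an illustration under stated assumptions: the bounds of IntT, the `price_interval` function, and the choice of centering the normal distribution on the interval midpoint with a standard deviation of one sixth of its width (clamping the result to the interval) are hypothetical, since the text does not fix them here.

```python
import random

CLASSES = ("office", "business", "private")


def price_interval(t):
    """Hypothetical acceptable price interval for response time t (placeholder)."""
    low = 1000.0 / t          # assumed: faster responses cost more
    return low, 2.0 * low


def sample_price(rng, low, high):
    """Draw a price from a normal distribution, clamped to [low, high]."""
    mean = (low + high) / 2.0
    std = (high - low) / 6.0  # assumed spread
    return min(high, max(low, rng.gauss(mean, std)))


def generate_nrs(n, int_t, v_range, seed=0, fixed_class=None, fixed_v=None):
    """Generate n negotiation requests as dicts with T, P, pattern and V."""
    rng = random.Random(seed)
    nrs = []
    for _ in range(n):
        t = rng.uniform(*int_t)
        low, high = price_interval(t)
        nrs.append({
            "T": t,
            "P": sample_price(rng, low, high),
            "pattern": fixed_class or rng.choice(CLASSES),
            "V": fixed_v if fixed_v is not None else rng.uniform(*v_range),
        })
    return nrs


# Second scenario: 50 NRs, V uniform in [10 000, 40 000] r/s; IntT bounds are assumed.
nrs = generate_nrs(50, int_t=(0.5, 2.0), v_range=(10_000, 40_000))
```

The first scenario corresponds to calling `generate_nrs` with `fixed_class="office"` and `fixed_v=25_000`, so that only T and P vary across the sequence.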
Table 4
Comparison of dynamic and static utility model for various sequences of NRs.

| Workload Pattern | V [r/s] | Model | U [€] | Uexact [€] | Pen [€] | Avg (E) | Std (E) | SLAsigned | CUP % |
|---|---|---|---|---|---|---|---|---|---|
| Office | 25 000 | Sta. | 47 720 | 47 911 | 0 | 0.4 | 1.2 | 15 | 70 |
| Office | 25 000 | Dyn. | 47 720 | 47 911 | 0 | 0.4 | 1.2 | 15 | 0 |
| Uniform Distr. | avg 25 000 | Sta. | 51 527 | 35 484 | 2723 | 72.7 | 76 | 16 | 70 |
| Uniform Distr. | avg 25 000 | Dyn. | 38 841 | 38 821 | 0 | 0.05 | 0.1 | 16 | 0 |
| Uniform Distr. | avg 20 000 | Sta. | 58 123 | 55 626 | 0 | 30.4 | 24 | 18 | 64 |
| Uniform Distr. | avg 20 000 | Dyn. | 55 517 | 55 626 | 0 | 0.14 | 0.3 | 18 | 0 |
| Uniform Distr. | avg 30 000 | Sta. | 51 916 | 29 548 | 1844 | 406 | 659 | 16 | 70 |
| Uniform Distr. | avg 30 000 | Dyn. | 50 349 | 50 544 | 0 | 0.47 | 1.6 | 18 | 0 |

For the first sequence of NRs, as long as the number of available resources
is greater than or equal to the optimal one needed to guarantee
the maximum response time of the PNP, the static and dynamic
utility models give the same results. In particular, the first 15 NRs
are accepted, their related PNPs are feasible and lead to contract
stipulations, the three conditions POSUCP, SSA and POSUCT are satisfied,
no penalty payments are provisioned, the total provisioned
utility is the same (47 720 €), and the average value and standard
deviation of the relative error E of the utility estimation for each
contract are very low. After the 15th NR, the resource occupation is 0.99
in time slot 7 (the one affected by the peak workload), and the remaining
36 virtual machines are not enough to grant service availability for
any other NR. Since the static utility model does not detect this
situation (it does not take into account the actual capacity availability),
it considers each NR after the 15th acceptable and the related
PNP feasible. As a consequence, with the static approach such NRs
lead to the violation of the SSA condition, causing a high provider
reputation loss (CUP = 70%). On the contrary, the dynamic utility
model, taking into account the actual capacity availability, refuses
all NRs from the 16th to the last, since the NRACC condition is not
satisfied, and avoids any negative influence on provider reputation
(CUP = 0%).
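The difference between the two behaviours can be illustrated with a simplified capacity check. The sketch below is not the paper's NRACC condition: it assumes that every NR needs the same number of virtual machines in the peak time slot (the value 1 300 is hypothetical) and contrasts a capacity-unaware acceptance rule with a capacity-aware one.

```python
def negotiate(demands, capacity, capacity_aware):
    """Return (signed, violated, rejected) counts for a sequence of VM demands.

    capacity_aware=False mimics a static model that accepts every NR;
    capacity_aware=True mimics a dynamic model that rejects an NR when the
    peak-slot capacity left is not enough for it.
    """
    free = capacity
    signed = violated = rejected = 0
    for need in demands:
        if need <= free:
            free -= need
            signed += 1
        elif capacity_aware:
            rejected += 1      # refused before any contract is offered
        else:
            violated += 1      # accepted, but service availability cannot be granted
    return signed, violated, rejected


M = 20_000                     # initial VMs per time slot (from the text)
demands = [1_300] * 50         # hypothetical peak-slot demand of each NR

for aware in (False, True):
    signed, violated, rejected = negotiate(demands, M, aware)
    cup = 100.0 * violated / len(demands)
    print(aware, signed, violated, rejected, cup)
# False 15 35 0 70.0
# True 15 0 35 0.0
```

With these assumed demands both rules sign 15 contracts, but only the capacity-unaware rule turns the remaining 35 requests into unsatisfied customers.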
For the second sequence of NRs, as regards the static approach, we
can note a huge relative error for single utility estimations (the
average value is 72.7%), which causes a large discrepancy between the
final provisioned utility (51 527 €) and the exact one (35 484 €)
at the end of the negotiation sequence. Such results are caused
by the wrong estimations performed by the static approach when
workload plans differ from the one adopted for the static
utility model evaluation. In particular, for lighter workload plans
(lower V and less demanding patterns), the static utility is lower
than the exact one, since the acceptable price intervals are evaluated
for larger (more expensive) allocation plans than actually
necessary to grant the QoS guarantee terms; vice versa, for heavier
workload plans, it is greater than the exact one. In the first case,
the static approach can incur the violation of the POSUCP condition,
since the price of the PNP is too high with respect to the cost of the
actually required allocation plan and the utility becomes greater
than the maximum allowed one. In the second case, performance
degradation (and penalty), not correctly provisioned, can happen
and further influence the utility estimation error. On the contrary, the
dynamic approach, taking into account the actually required
workload plan of the NR and the current capacity availability, ensures
accurate utility provisions, as for the first scenario. Summarizing,
with the static approach 16 NRs lead to contract stipulations, while
the remaining ones cause 1 violation of the POSUCT condition
(with UC(T) = 1.82), 6 violations of the POSUCP condition and 27
violations of the SSA condition, which lead to a high provider
reputation degradation (CUP = 70%). In the dynamic approach, instead,
beyond the 16 contract stipulations, the remaining NRs are immediately
discarded for violation of the FEANP condition (9 NRs) and of the
NRACC condition (25 NRs), thus avoiding any negative influence on
provider reputation.
The third and the fourth sequences of NRs point out the
same advantages of the dynamic approach with respect to the
static one, already noticed for the second sequence, in a more
evident manner. For such sequences, in fact, the average value of
V is chosen so as to be, respectively, less and greater than 25 000
r/s, the value adopted for the evaluation of the static utility model.
In particular, in the third scenario, there are many NRs for which
the workload plan is characterized by a peak value lower than
25 000 r/s. As a consequence, for many NRs (13 in particular)
the static approach accepts the related PNPs as feasible and causes
the violation of the POSUCT condition. On the contrary, the dynamic
approach correctly discards them because they violate the
FEANP condition. Among the first 31 NRs, 18 are accepted
and considered feasible by both approaches, but the average
estimation error is 30.4% for the static approach and 0.14% for
the dynamic one. The last 19 NRs are rejected by the dynamic
approach (because of violation of the NRACC condition) since the available
capacity is very scarce (91 virtual machines in the 7th time
slot, characterized by the peak workload, after the 18th agreed
contract). On the contrary, they are accepted by the static approach
and lead to the violation of the SSA condition, since such an approach is
not able to detect, in advance, the lack of resources required to
satisfy the QoS requirements of new contracts. Summarizing, in the
static approach there is a high customer unsatisfaction percentage
(64%), while in the dynamic approach it remains at 0%.
In the last scenario, the utility estimation errors are very high for
the static approach (both in excess and in defect), because of the highly
variable workload plans of the NRs (see Table 4). In particular, Fig. 8
compares the provisions performed by the static and dynamic
approaches for the first 31 NRs. If an NR is rejected, it is
labeled with !NRACC; if it is accepted and the SSA condition
is satisfied, the bar of the related provisioned utility is plotted and
any violations of other conditions are labeled. The NRs that
lead to provider reputation loss are shown using a
dotted contour for the utility bar.
The figure shows that the dynamic approach is able to reject
NRs with heavy workloads that lead to negative utilities (NRs 1, 3, 7, 19
and 23) and very low utilities (NRs 6, 10 and 13), because of low
prices of the related PNPs or lack of available capacity. Moreover,
the first NR, with a workload plan with business application pattern
and V = 38 260 r/s, is also characterized by a low performance
level (UC(T) = 0.15) and a penalty provision of 1844 €.
With the static approach the above-mentioned NRs lead to
contract stipulation, producing a large difference between the final
utility estimation (51 916 €) and the exact one (29 548 €). With the
dynamic approach, more contracts are stipulated than with the static
one (18 versus 16) and a much higher final utility (50 544 € versus
29 548 €) is reached without any performance degradation. Finally,
as for the previous sequences, when the capacity availability
becomes insufficient to grant the SSA condition, only the dynamic
approach is able to discard further NRs a priori, thus avoiding any
reputation loss. This happens for NRs from the 16th to the last,
except the 24th.
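To put the Table 4 figures side by side, the following sketch computes the aggregate gap between the final provisioned utility U and the exact utility Uexact for each sequence. This aggregate ratio is an assumption made for illustration; it is not the per-contract relative error E whose average and standard deviation are reported in the table.

```python
# (workload, model) -> (provisioned utility U, exact utility Uexact), in euro, from Table 4
TABLE4 = {
    ("office 25 000", "static"): (47_720, 47_911),
    ("office 25 000", "dynamic"): (47_720, 47_911),
    ("uniform avg 25 000", "static"): (51_527, 35_484),
    ("uniform avg 25 000", "dynamic"): (38_841, 38_821),
    ("uniform avg 20 000", "static"): (58_123, 55_626),
    ("uniform avg 20 000", "dynamic"): (55_517, 55_626),
    ("uniform avg 30 000", "static"): (51_916, 29_548),
    ("uniform avg 30 000", "dynamic"): (50_349, 50_544),
}


def aggregate_gap(u, u_exact):
    """Signed gap of the provisioned utility relative to the exact one, in percent."""
    return 100.0 * (u - u_exact) / u_exact


gaps = {key: aggregate_gap(u, ue) for key, (u, ue) in TABLE4.items()}
for (workload, model), gap in gaps.items():
    print(f"{workload:20s} {model:8s} {gap:+7.2f}%")
```

Under this measure the static model overestimates the final utility by roughly 45% in the second sequence and 76% in the fourth, while the dynamic model stays within 1% of the exact value in all four sequences.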


Fig. 8. Static and dynamic utility evaluation for the sequence of NRs whose workload plans are characterized by a uniform distribution of application classes and of peak
values with average of 30 000 r/s. Utility bars with dotted contour represent NRs that lead to provider reputation loss.

7. Conclusion

In this paper, we have proposed a technique based on capacity
planning to support Cloud providers in the bilateral negotiation of
high-level QoS parameters and prices related to PaaS services.
The technique aims at achieving high satisfaction levels for both
providers and customers through a heuristic that dynamically
evaluates a non-additive utility function and the acceptable
region, by taking into account application performance, capacity
availability and a price function based on cost and market.
By adopting a queuing-based performance model and workload
patterns based on real daily workload traces, the experimental
analysis has shown that the proposed solution lets the
provider accurately predict the utility that can be gained from new
contracts, so as to avoid stipulating them when they would lead to
unprofitable revenues or customer dissatisfaction.
The current proposal aims at optimizing the provider side (even
if it takes customer satisfaction into consideration). In fact, it needs
the declaration of the customer workload as a pre-condition for SLA
negotiation. The declaration of a wrong workload has no negative
impact on the provider's satisfaction (utility), but may influence
customer utility (unfulfillment of QoS terms at the customer side, if the
declared workload is under-estimated with respect to the actual
one, or unnecessary costs, if the declared workload is over-estimated).
Therefore, we are planning to investigate a progressive allocation
policy and a more sophisticated approach for the dynamic definition
of the resource allocation plans assigned to new contracts, able
to modify the initial allocation plan, resulting from the proposed
utility optimization problem, for a better exploitation of data center
resources. In particular, we will exploit machine learning algorithms
to learn from monitoring data, with respect to both the actual
incoming workload of hosted applications and resource performance [24],
in order to model the customer workload profile.
We are also investigating an integrative negotiation strategy
based on time-based decision functions able to quickly reach an
agreement with high satisfaction levels for both providers and
customers.

References

[1] L. Wu, R. Buyya, Service level agreement (SLA) in utility computing systems, in: V. Cardellini, E. Casalicchio, K. Castelo Branco, J. Estrella, F. Monaco (Eds.), Grid and Cloud Computing: Concepts, Methodologies, Tools and Applications, IGI Global, 2012, pp. 286–310.
[2] A. Andrieux, K. Czajkowski, A. Dan, K. Keahey, H. Ludwig, T. Nakata, J. Pruyne, J. Rofrano, S. Tuecke, M. Xu, Web Services Agreement Specification (WS-Agreement), Tech. Rep., Global Grid Forum, GRAAP WG, 2005.
[3] K. Czajkowski, I. Foster, C. Kesselman, Agreement-based resource management, Proc. IEEE 93 (3) (2005) 631–643. http://dx.doi.org/10.1109/JPROC.2004.842773.
[4] H. Raiffa, The Art and Science of Negotiation, Harvard University Press, 1982.
[5] P. Wakker, Additive Representations of Preferences: A New Foundation of Decision Analysis, Kluwer Academic Publishers, Dordrecht, Boston, London, 1989.
[6] M. Arlitt, C. Williamson, Internet Web servers: workload characterization and performance implications, IEEE/ACM Trans. Netw. 5 (5) (1997) 631–645. http://dx.doi.org/10.1109/90.649565.
[7] A. Williams, M. Arlitt, C. Williamson, K. Barker, Web workload characterization: Ten years later, in: X. Tang, J. Xu, S. Chanson (Eds.), Web Content Delivery, in: Web Information Systems Engineering and Internet Technologies Book Series, vol. 2, Springer US, 2005, pp. 3–21.
[8] N. Ranaldo, E. Zimeo, Exploiting capacity planning of cloud providers to limit SLA violations, in: The 3rd International Conference on Cloud Computing and Services Science, CLOSER 2013, 2013, pp. 184–195. http://dx.doi.org/10.5220/0004377001840195.
[9] N. Ranaldo, E. Zimeo, Capacity-aware utility function for SLA negotiation of cloud services, in: IEEE/ACM 6th International Conference on Utility and Cloud Computing, UCC 2013, Dresden, Germany, December 9–12, 2013, 2013, pp. 292–296. http://dx.doi.org/10.1109/UCC.2013.58.
[10] H. Li, S. Su, H. Lam, On automated e-business negotiations: Goal, policy, strategy, and plans of decision and action, J. Org. Comput. Electron. Commer. 16 (1) (2006) 1–29. http://dx.doi.org/10.1080/10919390609540288.
[11] M. Chhetri, J. Lin, S. Goh, J. Yan, J.Y. Zhang, R. Kowalczyk, A coordinated architecture for the agent-based service level agreement negotiation of Web service composition, in: The 17th Australian Software Engineering Conference, ASWEC 2006, 2006, pp. 90–99. http://dx.doi.org/10.1109/ASWEC.2006.1.
[12] F. Zulkernine, P. Martin, An adaptive and intelligent SLA negotiation system for Web services, IEEE Trans. Serv. Comput. 4 (1) (2011) 31–43. http://dx.doi.org/10.1109/TSC.2010.44.
[13] M. Macías, J. Guitart, Using resource-level information into nonadditive negotiation models for cloud market environments, in: 12th IEEE/IFIP Network Operations and Management Symposium, NOMS 2010, 2010, pp. 325–332.
[14] J. Spillner, A. Schill, Dynamic SLA template adjustments based on service property monitoring, in: IEEE International Conference on Cloud Computing, CLOUD 2009, 2009, pp. 183–189. http://dx.doi.org/10.1109/CLOUD.2009.56.
[15] F. Ren, M. Zhang, Bilateral single-issue negotiation model considering nonlinear utility and time constraint, Decis. Support Syst. 60 (0) (2014) 29–38. Automated negotiation technologies and their applications. http://dx.doi.org/10.1016/j.dss.2013.05.018.
[16] R. Zheng, N. Chakraborty, T. Dai, K. Sycara, M. Lewis, Automated bilateral multiple-issue negotiation with no information about opponent, in: 2013 46th Hawaii International Conference on System Sciences (HICSS), 2013, pp. 520–527. http://dx.doi.org/10.1109/HICSS.2013.626.
[17] J. Allspaw, The Art of Capacity Planning: Scaling Web Resources, O'Reilly Media, Inc., 2008.
[18] J. Almeida, V. Almeida, D. Ardagna, C. Francalanci, M. Trubian, Resource management in the autonomic service-oriented architecture, in: The 2006 IEEE International Conference on Autonomic Computing, ICAC 2006, 2006, pp. 84–92. http://dx.doi.org/10.1109/ICAC.2006.1662385.
[19] B. Abrahao, V. Almeida, J. Almeida, A. Zhang, D. Beyer, F. Safai, Self-adaptive SLA-driven capacity management for Internet services, in: The 10th IEEE/IFIP Network Operations and Management Symposium, NOMS 2006, 2006, pp. 557–568. http://dx.doi.org/10.1109/NOMS.2006.1687584.
[20] Y. Kouki, T. Ledoux, SLA-driven capacity planning for cloud applications, in: 2012 IEEE 4th International Conference on Cloud Computing Technology and Science 0, 2012, pp. 135–140. http://doi.ieeecomputersociety.org/10.1109/CloudCom.2012.6427519.
[21] A. Gandhi, Y. Chen, D. Gmach, M. Arlitt, M. Marwah, Hybrid resource provisioning for minimizing data center SLA violations and power consumption, Sustain. Comput. Inf. Syst. 2 (2012) 1–4. http://dx.doi.org/10.1016/j.suscom.2012.01.005.
[22] J. Bi, Z. Zhu, R. Tian, Q. Wang, Dynamic provisioning modeling for virtualized multi-tier applications in cloud data center, in: The 3rd IEEE International Conference on Cloud Computing, CLOUD 2010, 2010, pp. 370–377. http://dx.doi.org/10.1109/CLOUD.2010.53.
[23] D. Menasce, M. Bennani, Autonomic virtualized environments, in: The 2006 IARIA International Conference on Autonomic and Autonomous Systems, ICAS 2006, 2006, http://dx.doi.org/10.1109/ICAS.2006.13.
[24] D. Huang, B. He, C. Miao, A survey of resource management in multitier Web applications, IEEE Commun. Surv. Tutor. 16 (3) (2014) 1574–1590. http://dx.doi.org/10.1109/SURV.2014.010814.00060.

Acknowledgment
This paper is partially supported by the Italian Ministry of
Education, University and Research within the framework of PRIN
IDEAS (Integrated Design and Evolution of Adaptive Systems),
grant number J38C13001510001.

Nadia Ranaldo received the Ph.D. degree in Computer
Science from University of Sannio, Benevento, Italy, in
2005. She is a Research Assistant in the Department of
Engineering, University of Sannio. Her main research
interests include frameworks for distributed systems,
parallel computing, wireless and sensor networks, resource
management and capacity planning, Grid and Cloud
computing.

Eugenio Zimeo received the M.S. degree in Electronic
Engineering from the University of Salerno, Italy, and the
Ph.D. degree in Computer Science from the University
of Naples, Italy, in 1999. Currently, he is an Associate
Professor at the University of Sannio, Benevento, Italy. His
primary research interests include software architectures
and frameworks for distributed systems, high performance
middleware, service oriented, Grid and Cloud computing, and
wireless sensor networks. He has published
about 90 scientific papers in journals and conferences of
the field and heads many large research projects.
