
A Self-Adaptive Performance-Aware Capacity Controller in Overbooked Datacenters

Seyed Saeid Masoumzadeh, Helmut Hlavacs
Faculty of Computer Science, University of Vienna, Austria
Email: {seyed.saeid.masoumzadeh, helmut.hlavacs}@univie.ac.at

Luis Tomás
Department of Computing Science, Umeå University, Sweden
Email: luis@cs.umu.se

Abstract—Interference between co-located VMs may lead to performance fluctuations and degradation, especially in overbooked datacenters. To limit this problem, VM access to physical resources needs to be controlled to ensure a certain degree of isolation among them. However, the mapping between virtual and physical resources must be performed in a dynamic way so that it can be adapted to the changing application requirements, as well as to the different set of co-located VMs. To address this problem we propose a twofold approach: (1) a Quality of Service (QoS) scheme that provides different isolation levels for VMs with different QoS requirements, and (2) a self-adaptive fuzzy Q-learning capacity controller that proactively readjusts the isolation degree based on application performance. Our evaluation based on real cloud applications and workloads demonstrates that the efficient, adaptive mapping between VMs and physical resources reduces the interference between VMs, enabling the possibility of co-locating more VMs, increases overall utilization, and ensures the performance of critical applications while providing more resources to the low QoS applications.

Keywords—Cloud Computing; Fuzzy Q-Learning; Overbooking; Pinning; QoS; VM interference

I. INTRODUCTION

Resource overbooking [1], [2], [3], [4] is a well known technique commonly used to mitigate the low resource utilization ratios reported by large datacenters, such as Google [5] or Amazon EC2 [6]. However, this utilization increment also increases the possibility of having VMs interfering with each other, as more VMs are competing for the same resources, which may lead to delays when accessing them. This is a well known effect, known as the noisy neighbor problem [7], which may heavily impact application performance.

On top of this problem, not all the VMs require the same Quality of Service (QoS), meaning they do not need to maintain the same throughput or response times over time. Furthermore, not all the VMs are equally affected by this interference [8]. For instance, a deadline constrained application (e.g., a computationally intensive task) does not need to maintain an average performance all the time as long as it completes the task in time. By contrast, an interactive application may require a certain minimum performance all the time, as given for instance by e-commerce systems. In order to treat VMs with different QoS requirements differently, we extended our previously developed overbooking framework [1] to include QoS differentiation [9] (high and low QoS levels). This QoS differentiation creates different isolation levels between VMs by pinning them to specific cores inside the servers (note that core pinning is a mechanism provided by KVM [10]), thus limiting the impact they may have on others.

In this work we propose a self-adaptive, performance-aware capacity controller that is able to adapt isolation levels between high and low QoS VMs by changing the mapping between virtual cpus (vcpus) and physical cpus (pcpus) – i.e., the vcpu to pcpu pinning. This dynamic mapping is performed based on the performance of the applications running on the VMs. To this end, we exploit a fuzzy reinforcement learning algorithm known as Fuzzy Q-Learning (FQL) [11], which is a combination of Q-learning (a popular reinforcement learning algorithm) and fuzzy logic. This fuzzy reinforcement learning technique allows us to deal with the high complexity of cloud systems, as well as with all their uncertainties, especially regarding unknown future application needs. On the one hand, Q-learning builds the central core of our capacity controller. It is a knowledge-free online learning process, which learns over time (by interacting with the application and getting feedback) how to map the input states dynamically and proactively to the output decisions for the pinning actuator, in terms of the number of cores that may be shared between high and low QoS VMs. Its main objective is to increase the shared capacity inside the server (thus increasing utilization) while at the same time maintaining the performance of the applications running in high QoS mode. On the other hand, thanks to the combination of a fuzzy system with Q-learning, a more powerful learning strategy is obtained, as it is capable of handling the curse of dimensionality in many real-world problems (such as dynamic capacity allocation within cloud servers), as well as facilitating the merging of prior/expert knowledge into the problem, resulting in a faster learning process.

In our capacity controller model, the FQL is associated to each high QoS demanding VM as a software agent. Each FQL agent makes dynamic decisions about VM resource isolation needs, even though it has no information about the status of the resources at the scale of the physical host or knowledge about other co-located VMs. Then the capacity controller applies the isolation actions based on the FQL agents' decisions.

The experimental results show that our proposed fuzzy Q-learning capacity controller efficiently distributes the available capacity between the different VMs. This in turn enables the possibility of accepting a larger amount of applications, thus increasing overall utilization, without hurting either low or high quality applications' performance.

II. BACKGROUND AND MOTIVATION
Algorithm 1 Application Core Pinning
Configuration parameters:
  VMs, list of running VMs
  H_QoS, list of high QoS pcpus
  L_QoS, list of low QoS pcpus, including empty pcpus in case of no low QoS VMs (sorted by aggregated resource usage: CPU*Mem*IO)
 1: num ← number of vcpus of new_vm
 2: if new_vm requires HighQoS then
 3:   cores ← first num pcpus of weighted list of L_QoS
 4:   for each core ∈ cores do
 5:     Pin vcpus of new_vm to core
 6:     H_QoS ← H_QoS + core
 7:     L_QoS ← L_QoS − core
 8:   end for
 9: end if
10: for each vm with LowQoS req. ∈ VMs do
11:   // Repin all low QoS VMs
12:   Pin vm to all pcpus ∈ L_QoS
13: end for

[Figure 1: Virtual to physical core mapping using basic QoS differentiation (left) and advanced QoS differentiation by using a re-mapping controller (right). Allocated applications: 1 high QoS VM (4 vcpus) pinned to 4 dedicated pcpus, and 10 low QoS VMs (2 vcpus each) floating over the remaining 12 pcpus.]
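For illustration, the listing below sketches Algorithm 1 in Python. It is not the implementation used in this work: the VM objects and the pin_vcpu() helper (e.g., a thin wrapper around `virsh vcpupin` or the equivalent libvirt call) are assumptions, and pairing each vcpu of the new VM with one dedicated pcpu is one possible reading of the pinning step.

    def place_new_vm(new_vm, running_vms, h_qos, l_qos, pin_vcpu):
        # h_qos / l_qos: lists of pcpu ids; l_qos is assumed to be sorted by
        # aggregated resource usage (CPU*Mem*IO), least loaded first.
        num = len(new_vm.vcpus)                   # number of vcpus of the new VM
        if new_vm.high_qos:
            cores = l_qos[:num]                   # first num pcpus of the weighted list
            for vcpu, core in zip(new_vm.vcpus, cores):
                pin_vcpu(new_vm, vcpu, [core])    # one dedicated pcpu per vcpu
            for core in cores:
                h_qos.append(core)                # H_QoS <- H_QoS + core
                l_qos.remove(core)                # L_QoS <- L_QoS - core
        # Re-pin all low QoS VMs so they float over the remaining low QoS pcpus.
        for vm in running_vms:
            if not vm.high_qos:
                for vcpu in vm.vcpus:
                    pin_vcpu(vm, vcpu, l_qos)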
Our first approach to address low utilization ratios at cloud datacenters was a framework that performs overbooking decisions based on long term risk estimations [12], i.e., the possibility of ending up in an overload situation leading to performance degradation. This overbooking framework was extended with an admission control that takes decisions based on a fuzzy risk estimation to better account for uncertainty about future application capacity needs [13], combined with a proportional-integral-derivative (PID) set of controllers [14] that adjusts the acceptable risk levels based on the deviation of the current utilization from the target [1].

Applications' tolerance to overbooking differs, not only among them but also over time (depending on their workload). This fact makes it difficult to assess the possible impact of potential overload situations. To overcome these difficulties, we first worked on a mitigation and recovery method for unexpected situations, presented in [8], where on the one hand the overbooking pressure is adjusted based on the behavior of the currently running VMs, and on the other hand a short-term mitigation strategy based on Brownout [15] was used, ensuring graceful degradation (of some VMs) during load spikes. This approach may reduce overall datacenter utilization (by reducing the overbooking pressure) due to some VMs not being able to deal with the current overbooking level. However, the source of the problem may not be a too high utilization but rather VM interference. Therefore, in order to mitigate this VM interference problem, also known as the noisy neighbor problem [7], we studied the use of pinning mechanisms as a way to both provide isolation between VMs inside the same physical server, and perform QoS differentiation between VMs based on this isolation capacity [9]. Note that KVM core pinning is proposed in [10] as a feature needed to ensure reliable and repeatable performance.

In this work we use the KVM core pinning functionality to offer two different QoS levels (or isolation levels) by performing virtual cpu (vcpu) to physical cpu (pcpu) pinning using Algorithm 1, based on the QoS schema depicted in Figure 1 (left):

• High QoS: application vcpus are pinned to pcpus and get exclusive access to them – no other VM vcpus can be pinned to these pcpus.

• Low QoS: application vcpus are not pinned to any pcpus and can use any pcpus except the ones booked for high QoS applications.

This QoS classification however impacts the overall utilization when high QoS VMs are overprovisioned. To deal with this problem, a controller based on the high QoS applications' performance incrementally decreases their isolation level by sharing some of their cores, as presented in Figure 1 (right) [9]. More specifically, at every control interval where a high QoS VM is behaving better than needed, an extra physical core allocated to it will be shared with the low QoS VMs. This process keeps on as long as the target performance of the high QoS VM is maintained. Otherwise, the maximum isolation level is established again to avoid any possible performance degradation, by recovering the complete isolation. As application needs are unknown, increasing the isolation level step by step instead of at once when performance degradation is observed may impact the application during a longer period of time, which in turn may lead to a longer time to recover the desired/required performance. Note that this is performed at a per (high QoS) VM granularity level, i.e., there is a capacity controller per each high QoS VM.

Although this controller presents good results regarding performance isolation of the high QoS applications, it is a really conservative approach that only works in a reactive manner. Therefore, it may have a bigger impact on the overall utilization ratios than desired, especially with highly fluctuating workloads. Furthermore, it may also affect the low QoS applications due to the abrupt (and more frequent) reductions in the amount of cores that they are entitled to use, as high QoS applications recover the isolation on all the cores at once.

To address this problem we propose a self-adaptive controller exploiting an online machine learning approach to efficiently learn decision policies in terms of isolation level (i.e., the number of cores to share) with regard to the application behavior. Our approach is based on a particular reinforcement learning technique called Fuzzy Q-Learning (FQL) [16], which is a combination of Q-Learning (QL) and fuzzy logic, detailed in the next section. In our proposed controller the QL algorithm, as a decision making engine, learns how to map the input states to the desired output decisions (here, the number of shared cores) to maximize the shared capacity over time, helping to increase utilization while maintaining the performance of the application in terms of the response time. As QL is a table-driven learning algorithm, using fuzzy logic can facilitate dealing with large or continuous state spaces, thus handling high dimensionality. In addition, using fuzzy logic can enable knowledge encapsulation into the learning table, i.e., merging prior or expert knowledge into the problem.
III. FUZZY Q-LEARNING MODEL

[Figure 2: FQL structure — an input state x passes through a fuzzification layer and a rule evaluation layer (rules L1 ... Lm with q-values q1 ... qm and local actions o1 ... om), producing the state quality Q and the inferred action a.]

Reinforcement learning (RL) [17] comprises a set of machine learning methods designed to address a particular learning task in which an agent is placed in an unknown environment and is allowed to take actions/make decisions which can change its state in the environment and bring it delayed numerical rewards. The goal of the agent is to learn a policy that tells it what action/decision to take/make in order to maximize the cumulative reward over time. Q-learning (QL) [17] is a popular RL algorithm that represents the learning knowledge by means of a Q-table, whose Q-values are defined for each state-action pair, determining the expected cumulative reward that can be received by taking that action in that state and following an optimal policy afterwards. Note that, after the learning process, an optimal policy can be constructed by simply selecting the action with the highest value in each state. The RL problem is modeled as follows: The state vector x ∈ S is composed of values of representative variables capturing the surrounding environment, where S = {s_1, ..., s_n} is the set of possible states the agent can perceive from the environment. The set of actions A = {a_1, ..., a_l} represents the decisions that the agent can make based on the state vector x. Based on x and the corresponding Q-values, the most suitable action a ∈ A is selected and executed. After the execution of a in x, the agent receives an immediate scalar reward r, and the corresponding Q-value Q(x, a) is updated by the temporal-difference (TD) [17] method according to the following rule (the subscript t is added to highlight the time dependency in the update equation):

Q_{t+1}(x_t, a_t) = Q_t(x_t, a_t) + \beta [ r_{t+1} + \gamma \max_{a'} Q_t(x_{t+1}, a') - Q_t(x_t, a_t) ].   (1)

Here r_{t+1} is the observed reward for selecting action a_t when observing state vector x_t, and \beta is the learning rate with 0 ≤ \beta ≤ 1, where high values result in quick learning and adaptation, while low values prevent too quick changes due to rare outliers. The term \max_{a'} Q_t(x_{t+1}, a') denotes the estimated optimal value of a future state, and the discount factor \gamma, with 0 ≤ \gamma ≤ 1, determines whether the optimization should only consider current rewards (\gamma = 0) or whether it should strive for more long-term rewards (\gamma = 1).
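As an illustration of Equation (1), the following is a minimal tabular Q-learning sketch (not the controller itself); the state and action encodings and the parameter values are placeholders.

    from collections import defaultdict

    Q = defaultdict(float)      # Q[(state, action)] -> learned Q-value
    beta, gamma = 0.1, 0.5      # learning rate and discount factor (illustrative values)

    def td_update(state, action, reward, next_state, actions):
        # Eq. (1): move Q(x_t, a_t) towards r_{t+1} + gamma * max_a' Q(x_{t+1}, a')
        best_next = max(Q[(next_state, a)] for a in actions)
        Q[(state, action)] += beta * (reward + gamma * best_next - Q[(state, action)])

    def greedy_action(state, actions):
        # After learning, the policy simply picks the highest-valued action in each state.
        return max(actions, key=lambda a: Q[(state, a)])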
The most significant drawback of Q-learning is that it cannot be used when the state space is large, a situation found in many real world problems, since it mainly requires a large amount of memory for saving the Q-table (to keep track of the state-action quality values). Apart from that, even if the system provides such a large memory, the learning agent needs a lot of trials and episodes to learn the desired behavior, increasing the cost of learning.

Fuzzy Q-learning (FQL) [11] is a fuzzy extension of the Q-learning algorithm able to overcome this problem. In addition, it allows us to encapsulate expert knowledge into the learning table, resulting in a faster learning process. In FQL, the decision making part is represented by a Fuzzy Inference System (FIS) that considers continuous/large discrete states as input. The idea of the FQL algorithm is to use a so-called q-table as a compact version of the Q-table to represent learning knowledge.

In FQL, the FIS is represented by a set of rules J, with a rule j ∈ J defined as

IF (x_1 is L_j^1) ... AND (x_n is L_j^n) ... AND (x_N is L_j^N) THEN a = o_j with q(L_j, o_j).   (2)

L_j^n is a fuzzy label of the input variable x_n of the state vector x = [x_1, ..., x_n, ..., x_N] participating in the j-th rule, and o_j = [o_j^1, ..., o_j^k, ..., o_j^K] is the output action set of the j-th rule. The vector L_j = [L_j^1, ..., L_j^n, ..., L_j^N] is called the modal vector corresponding to the rule j. q(L_j, o_j^k) is called the q-value function of state L_j and action o_j^k of the j-th rule.

In the Fuzzification Layer of the FIS (see Figure 2) each membership function (\mu_L) maps a state component into the degree of membership to a fuzzy set corresponding to a given label. Let J_x denote the set of all rules in the Rule Evaluation Layer of the FIS (see Figure 2). The membership of a state vector x, or the degree of truth in fuzzy logic terminology (represented by \alpha), with respect to the j-th rule, j ∈ J_x, is defined as the product of the corresponding membership functions of the rule:

\alpha_j(x) = \prod_{n=1}^{N} \mu_{L_j^n}(x_n).   (3)

In the FQL algorithm we have a two-level action selection. In the first level (we call it local action selection) a set of actions is chosen according to an Exploration/Exploitation (EE) policy. An EE policy allows the agent to explore untried actions to gain more experience, and combines this with exploitation of the already known successful actions to ensure high long-term reward. The ε-greedy method [18] is used as the EE policy for our experimental studies. With ε-greedy, at each time step, the agent selects a random action with a fixed probability 1 − ε, where ε is typically chosen close to and below 1, instead of selecting greedily one of the learned optimal actions with respect to the q-table:

o_j^l = argmax_{k ∈ K} q(L_j, o_j^k)  with probability ε,  ∀ j ∈ J_x,
o_j^l = random_{k ∈ K}(o_j^k)         with probability 1 − ε,  ∀ j ∈ J_x,   (4)
where o_j^l is the selected local action of the j-th rule. In the second level of the action selection (we call it inferred action selection) a nominated action for input vector x is selected from the set of local actions as follows:

a = \max_{j ∈ J_x} \alpha_j(x) o_j^l.   (5)

Let us add a time index to our equations to highlight time dependency from now on. The approximation of the quality value of the state x_t is calculated based on the following equation:

Q(x_t, a) = \sum_{j ∈ J_{x_t}} \alpha_j(x_t) \times q_t(L_j, o_j^l).   (6)

After taking action a the system goes to the next state x_{t+1} and observes the reward r_{t+1}. The state value for the input vector x_{t+1} is calculated as follows:

V(x_{t+1}) = \sum_{j ∈ J_{x_{t+1}}} \alpha_j(x_{t+1}) \times \max_k q_t(L_j, o_j^k).   (7)

Based on equations (6) and (7) the temporal-difference (TD) error is calculated as follows:

\Delta Q = r_{t+1} + \gamma V(x_{t+1}) - Q(x_t, a).   (8)

Here \gamma is again the discount factor. Finally the q-function is updated for each activated rule j ∈ J_{x_t} according to the rule:

q_{t+1}(L_j, o_j^l) = q_t(L_j, o_j^l) + \beta \alpha_j(x_t) \Delta Q,   (9)

where \beta is the learning rate, as in Equation (1) for QL.

IV. FUZZY Q-LEARNING CAPACITY CONTROLLER

An efficient capacity sharing between VMs co-located in the same server is achieved if enough capacity is provided to all the VMs (based on the QoS level), and at the same time a high utilization ratio is achieved. On the one hand, utilization is increased by maximizing the amount of cores that high QoS VMs share with low QoS VMs, reducing the former's isolation level. On the other hand, the isolation level needs to be enough so that high QoS performance is not affected due to the capacity shared with lower QoS VMs. Consequently, this trade-off needs to be discovered and updated over time, as it will be different not only for the VM mixture, but also for different workload patterns.

It is also important to make isolation level decisions durable, i.e., to avoid constant and abrupt changes in the amount of cores being shared by the high QoS VMs, as this will impact all the co-located VMs. On the one hand, it will impact the low QoS VMs due to the sudden fluctuations in the available capacity they are entitled to use. On the other hand, it will impact the other co-located high QoS VMs, since, even though high QoS VMs do not share cores among themselves, the low QoS VMs will have higher needs of the cores being shared by the rest of the high QoS applications, therefore having an impact on their performance, which in turn will lead to even more fluctuations in the number of cores they share.

Due to the benefits of Fuzzy Q-Learning, we present an approach where the isolation level of the high QoS VMs, i.e., the amount of cores they share with the low QoS VMs, is managed by a Capacity Controller whose decisions are based on the FQL agents at each high QoS VM, as depicted in Figure 3.

[Figure 3: Architecture overview — on server X, an FQL agent is attached to each high QoS VM (1) and feeds its decisions to the server's capacity controller (2), which adjusts the cores shared with the low QoS VMs.]

Owing to the learning capacity, the FQL agents can predict the application behavior once they have gathered enough knowledge about it, and thanks to that they can pro-actively increase/decrease their exclusive access to resources before the performance degradation happens. What is more, as the agent tries to reduce the isolation level as much as possible, it enables the possibility of sharing more resources with the (low QoS) co-located VMs, as it is aware of the next states of the application behavior in the future. For example, let us imagine a situation where an 8-core high QoS VM needs more than 2 but less than 3 cores in isolation at present. In this situation, the approach presented in [9] would increase the amount of shared cores one by one (i.e., 0, 1, 2, 3, 4, 5, and 6) up to the point where the number of cores shared is too high and the application has problems keeping up its performance. Then, the complete isolation would be recovered and the process would start again (from 0 to 6 shared cores). This behavior may impact the low QoS applications due to the abrupt changes in the number of cores they are allowed to use, and it would also reduce the chances for higher utilization and/or accepting more VMs. By contrast, thanks to the learning behavior of the FQL agent, it realizes (after a few iterations) that sharing 5 cores is the right option. Now imagine the application running on the VM suddenly demands more resources. In this situation the controller presented in [9] would again recover the complete isolation, while with the FQL approach the number of isolated cores will proactively change to the new required isolation level, even before the application needs it. These are just simple examples, but more complex application behavior patterns can be discovered and learned by the FQL agents, leading to a better resource sharing over time.
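For clarity, the following is a minimal sketch of one FQL control iteration as used by such an agent, following Equations (3)-(9); the rule objects, membership functions, actuator callback and reward callback are placeholders rather than the controller's actual components.

    import random

    def fql_step(rules, q, x, apply_pinning, reward_fn, beta=0.1, gamma=0.5, eps=0.9):
        # rules: dict j -> rule with .membership(x) and .actions (candidate core counts)
        # q: dict (j, action) -> q-value; apply_pinning/reward_fn are placeholder callbacks
        alphas = {j: r.membership(x) for j, r in rules.items()}          # Eq. (3)
        local = {}                                                       # Eq. (4)
        for j, r in rules.items():
            if random.random() < eps:
                local[j] = max(r.actions, key=lambda o: q[(j, o)])       # exploit
            else:
                local[j] = random.choice(r.actions)                      # explore
        a = max(alphas[j] * local[j] for j in rules)                     # Eq. (5)
        q_xa = sum(alphas[j] * q[(j, local[j])] for j in rules)          # Eq. (6)
        x_next = apply_pinning(a)          # actuate and observe the next state
        r_next = reward_fn(x_next)         # e.g. the reward of Eq. (12) below
        alphas_next = {j: r.membership(x_next) for j, r in rules.items()}
        v_next = sum(alphas_next[j] * max(q[(j, o)] for o in rules[j].actions)
                     for j in rules)                                     # Eq. (7)
        d_q = r_next + gamma * v_next - q_xa                             # Eq. (8)
        for j in rules:                                                  # Eq. (9)
            q[(j, local[j])] += beta * alphas[j] * d_q
        return x_next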
A. FQL components

The FQL agents are associated to the VMs running the high QoS applications. The components of an FQL agent are described in the following:

• State: The combination of the current response time and the current isolation level, in terms of the number of shared cores, describes each state inside a high QoS VM. These two characteristics have enough information to represent a predictable application behavior. Therefore, the input state vector to the FQL is defined as

  x_t = [rt_t, nc_t],   (10)

  where rt_t and nc_t denote the current response time and the current number of isolated cores, respectively.

• Action: Each element of the action set o denotes the number of cores needed in isolation for transiting from the current state to the next state,

  o = [nc_1, nc_2, ..., nc_i].   (11)
• Reward Function: The design of a reward function is the key to building a reinforcement learning system. It maps each perceived state-action pair of the environment to a single number, a reward, indicating the intrinsic desirability of that state-action. The goal of reinforcement learning is to maximize the cumulative reward over time. This is obtained if the learning agent seeks actions that result in the highest q-value. In our system, the reinforcement learning tries to optimize the trade-off between sharing capacity and performance over time, so the reward function can return a value determining a success rate, here maximizing the number of shared cores and minimizing the response time:

  r_{t+1} = (1 - nc_t / tc) - (rt_t / tr),   (12)

  where tc is the total number of physical cores assigned to a high QoS VM, and tr is the target response time. It is obvious that if the high QoS VM controller shares its cores as much as possible (nc ⇒ 0) while keeping the response time as far below the target response time as possible (rt ⇒ 0), it will receive more reward. If the reward is close to zero this implies that the action is not effective, and a negative reward is considered as punishment for the learning agent.

B. Let reactive and proactive be friends

As FQL agents learn by trial and error interaction with the environment, they are likely to degrade the performance due to either random actions in an exploration mode or a lack of learning knowledge in the Q-table during the learning process, i.e., under new situations never experienced. The learning agents are making decisions in terms of the number of isolated cores for the high QoS VMs, where a response time higher than the target threshold is critical in terms of SLA violations.

Our proposed controller can guarantee the performance for the high QoS VMs after the learning process, but during the learning or under completely new situations there is no guarantee. Although combining Q-learning with fuzzy logic can speed up the learning process (as stated in the previous section), such performance degradation may not be acceptable for the high QoS users even for a short term. To address this problem, our proposed learning controller calls a reactive controller to act on its behalf once the response time exceeds the set target threshold. In this situation the reactive controller establishes the maximum isolation level for the VM again. In addition, the learning controller receives a punishment proportional to the amount of exceedance (see Eq. (12) where rt > tr) once it calls the reactive controller. Therefore, the learning controller also learns how to act to decrease the number of such calls over time.

V. EXPERIMENTS

The performance of our proposed capacity allocation controller for QoS assurance and differentiation based on FQL agents is evaluated next. The tests are conducted on two machines, one hosting applications and one generating the workload, connected through a 1 GB link. The first machine is a server consisting of a total of 32 cores (AMD Opteron 6272 at 2.1 GHz) and 56 GB of memory. KVM was used as the hypervisor and each application was deployed inside a VM. The second machine is a 4-core (Intel Core i5 processor at 3.4 GHz) desktop with 16 GB of memory. Note that as the capacity allocation controller works at server level, this is straightforwardly extensible to any number of servers.

A. Applications and Workload

We have emulated a representative cloud workload by mixing different types of VMs that can be grouped in two main classes (similarly to the boulders and sand scheme reported by Google in [5]). The first VM class is represented by (usually big) long-living VMs, running interactive applications that usually present some seasonality pattern in their use. For this application class we have used the RUBiS [19] and RUBBoS [20] cloud benchmarks, which are an auction website benchmark and a bulletin board benchmark modeled after eBay and Slashdot, respectively. For the experiments we consider a fixed set of them in each run: 2 VMs requesting half of the server capacity (8 vcpus and 14 GB memory each). Note that as they are interactive applications, they are submitted with the high QoS requirement, as they can be more affected by other co-located VMs than the deadline oriented applications, similarly to the scenario presented in [5], [21]. As regards the workload, a number of queries have been generated using information extracted from different days of the Wikipedia traces [22], time-shifting one of them 12 hours, as shown in Figure 4, creating different trends, peaks, and daily usage patterns. The client queries were generated using the httpmon tool (https://github.com/cloud-control/httpmon).

[Figure 4: Workloads for RUBiS VMs — user requests per second over time for the RUBBoS and RUBiS traces.]
The second VM class consists of different types of relatively short-living, non-interactive VMs. Firstly, in order to increase the uncertainty in the system, and thus create a more realistic aggregated workload, we created different VMs that run shell scripts that consume random amounts of CPU and memory over time. This creates a set of VMs with highly heterogeneous and time-varying resource requirements, with both bursty and steady behavior in their resource consumption. On the other hand, to have a measurable performance for this class of VMs, we have also created different types of VMs that continuously solve random sudokus (http://norvig.com/sudoku.html) and report their corresponding throughput achieved over time. Additionally, these sudoku VMs can be differentiated by their two main behaviors:

• SudokuBE: solves a certain amount of sudokus before a given deadline. This type of VM behaves in a best effort manner, trying to solve as many sudokus as possible and therefore using as much CPU time as available.

• SudokuT: keeps a certain throughput (number of solved sudokus) over time. If during one time period the target is not achieved, the sudokus queue up and need to be solved in the next period (i.e., open-loop). We mix two different SudokuT types, with a target of 11 and 22 solved sudokus per second, respectively.

The arrival pattern of this second type of VMs is generated using a Poisson distribution with λ = 20 seconds. This leads to a final workload consisting of 2 high QoS VMs represented by the web servers, and a varying number of low QoS VMs consisting of a mix of the different sudoku and shell script VMs. Note that the resulting overall workload stresses the CPU usage most, therefore we evaluate this capacity dimension in the next section. Besides, the submission ratio is high enough to keep the system always saturated, stressing the controller reaction to an extreme case with high chances of VM interference.
B. FQL settings

[Figure 5: Fuzzy sets (membership degree vs. input value). (a) For the response time of the RUBBoS VMs (50–500 ms), (b) for the response time of the RUBiS VMs (50–500 ms), (c) for the number of cores (0–8) of both RUBBoS and RUBiS VMs.]

In the FQL setup, the discount factor γ has been set to 0.5. This value for the discount factor allows the learning agent to consider the current and long-term rewards equally. This is a proper value for our case study, where we would like to be a little bit conservative in terms of taking actions while the future is also important for us. The learning rate β changes over time: whenever the controller receives a reward the learning rate is set to 0.1, and whenever it receives a punishment it is set to 1.0. This guarantees that "bad" news travels faster than "good" news, enabling the learning agent to respond to performance violations rapidly. We applied trapezoidal and triangular shaped membership functions, as shown in Figure 5, for the response time of the RUBiS and RUBBoS VMs, and triangular shaped membership functions for the number of cores. Note that the design of the fuzzy sets for the input parameters is based on our prior knowledge about the workload of the RUBiS and RUBBoS VMs, presenting a more conservative approach for RUBBoS as its behavior is less linear with regard to the number of users. It is worth noting that the fuzzy sets can also be tuned at run time with an alternative approach [23]. As Figure 5 shows, we have 5 fuzzy sets for the response time and 9 fuzzy sets for the number of cores, which provides (9 × 5) 45 rules in our Fuzzy Inference System (FIS). Note that the rules are considered as the states for the Q-learning. The benefit of using an FIS is obvious here, as much fewer states are generated for the Q-learning algorithm. As a consequence, the learning process can be accelerated and the space complexity is decreased – the space complexity in Q-learning is always O(S × A), where S is the number of states and A is the number of actions. The value of ε in our EE policy has been set to 0.9 (as stated before, it is typically chosen close to and below 1), meaning that the agent acts randomly with a probability of 0.1 and enforces a learned action with a probability of 0.9. This helps our system combat overfitting. The FQL iteration for perceiving the next state and receiving the reward for the previous action has been set to 30 seconds. Furthermore, the Q-table has been initialized to zero in order to start with zero learning knowledge.
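For reference, the settings described above can be collected in a single configuration, together with the reward of Eq. (12). The structure and names below are ours; only the values come from this subsection, and treating any negative reward as a punishment is our reading of the text.

    FQL_SETTINGS = {
        "gamma": 0.5,               # discount factor: current and long-term rewards weighted equally
        "beta_reward": 0.1,         # learning rate after a reward
        "beta_punishment": 1.0,     # learning rate after a punishment ("bad news travels faster")
        "epsilon": 0.9,             # probability of enforcing a learned action (random otherwise)
        "control_interval_s": 30,   # seconds between FQL iterations
        "response_time_sets": 5,    # fuzzy sets for the response time
        "core_sets": 9,             # fuzzy sets for the number of cores -> 5 x 9 = 45 rules
    }

    def reward(nc_t, rt_t, tc, tr):
        # Eq. (12): trade-off between shared capacity and response time.
        return (1.0 - nc_t / tc) - (rt_t / tr)

    def learning_rate(r):
        # Punishments (negative rewards) are learned faster than rewards.
        return FQL_SETTINGS["beta_punishment"] if r < 0 else FQL_SETTINGS["beta_reward"]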
C. Performance evaluation

To evaluate the performance improvement achieved thanks to our Fuzzy Q-Learning (FQL) based capacity management controller, a comparison with the performance achieved by the controller implemented in [9], labeled as CC6, is presented next. As explained in Section II, the controller proposed in [9], unlike our current proposal, behaves in a conservative, reactive manner, progressively increasing the amount of cores being shared if the performance is maintained, and recovering the complete isolation at once when there are problems to maintain it. On the one hand, we measure the average and 95-percentile response time for the RUBiS and RUBBoS VMs in order to ensure that the high QoS applications' performance is maintained.

[Figure 6: RUBiS performance (1st day) — average and 95-percentile response time (ms) and shared cores over time for (a) CC6 and (b) FQL, and (c) a summary of shared cores over normalized time.]

On the other hand, we evaluate the overall utilization and the number of shared cores over time, as well as the number of sudoku VMs concurrently running. The throughput achieved by the sudoku VMs is also measured, to ensure that the low QoS applications' performance is acceptable, i.e., that they keep a specific throughput over time and/or meet their deadlines. Finally, in order to conclude whether there is evidence of a statistically significant improvement in relation to any of the previously described metrics, we use the Wilcoxon statistical test [24], a non-parametric statistical hypothesis test that compares two related samples to assess whether their population mean ranks differ.

Figure 6 shows the performance comparison between CC6 and FQL as regards the RUBiS application performance. As Figures 6a and 6b depict, there is no remarkable difference in the response time, since the RUBiS VM presents a linear performance with regard to the number of requests, which makes it easy to keep the response times within the desired limits. However, it may be highlighted that the trend for FQL is slightly flatter due to sharing a more stable number of cores over time. In fact, the Wilcoxon statistical test concludes that the reduction in the response time (both average and 95th percentile) is statistically significant (p-value < 5.0e-07). It can be seen in Figure 6c that the amount of cores shared over time is pretty similar for both approaches (FQL shares slightly more cores), sharing all the cores around half of the time in both cases. However, as highlighted by Figures 6a and 6b, the number of cores shared over time quickly and largely fluctuates for CC6, while it remains more stable for our approach.

Figures 7 and 8 show the same comparison between CC6 and FQL but for the RUBBoS application, which presents a less linear behavior with respect to the number of requests per second. In this case, as the performance fluctuates more, we show its evolution over three days, to also highlight the learning of the FQL approach. Although the performance is reasonably good on the first day for both approaches (Figure 7a and Figure 8a), with slight improvements for FQL, the performance is improved day by day for the FQL approach. In Figure 7b and Figure 8b the differences between them are noticeable, with an overall reduction of the response times, especially towards the end, coinciding with the RUBBoS workload peak. Not only is the overall response time reduced, but also the fluctuations, both in number and size. These differences are even more remarkable on the third day (Figure 7c and Figure 8c), where, thanks to the learning, the FQL approach remarkably reduces the response times, as well as presenting a more constant behavior compared to CC6. Again, the response time reduction is statistically significant, not only for the third day but also for the complete experiment (Wilcoxon p-value: 2.2e-16). As in Figure 6, a much more stable number of cores shared over time can clearly be seen for the FQL approach.

Regarding the learning process over time, Figure 8 clearly highlights the learning of the FQL-based controller, taking advantage of the gathered knowledge about the RUBBoS daily patterns, and making response time fluctuations much lower over time. It must also be noted that, due to the transitions between proactive and reactive mechanisms, the performance is acceptable during the three days. The FQL calls the reactive controller to act on its behalf when the response time is not good and there is not enough learning knowledge gathered yet. That is the reason for the short peaks during the first days. Note that these peaks also happen in the CC6 approach. Due to the high load, the actuation needs to be done in a proactive way, otherwise even by not sharing any of the cores (as CC6 does), the VM needs some time to recover the already queued requests. We can see in Figure 8c that, even though the response time was already good during the second day, it is even improved during the third day by sharing a slightly smaller amount of cores, in favour of a more constant and predictable behavior. If we analyze together the sharing behavior during the third day for both FQL agents, i.e., for the RUBiS and RUBBoS applications (see Figure 8c and Figure 9b), and compare them with the behavior obtained for CC6 (see Figure 7c and Figure 9a), we can appreciate that, thanks to the RUBBoS FQL agent sharing a smaller amount of cores, the RUBBoS application presents a better behavior than before. What is more, as both FQL agents become aware of the system evolution over time, the RUBiS FQL agent is able to share a larger amount of cores (remarkably different compared with CC6 for RUBiS during the third day – Figure 9a), as the RUBiS VM is easier to control – a more linear behavior regarding the number of requests per second than the RUBBoS VM. Consequently, the controllers learned that it was better to be more aggressive in the sharing policy for RUBiS and more conservative for RUBBoS, as performance deviations in the former can be fixed quickly just by removing a few shared cores.
[Figure 7: RUBBoS performance (CC6) — average and 95-percentile response time (ms) and shared cores over time for (a) the 1st day, (b) the 2nd day and (c) the 3rd day.]

[Figure 8: RUBBoS performance (FQL) — average and 95-percentile response time (ms) and shared cores over time for (a) the 1st day, (b) the 2nd day and (c) the 3rd day.]

All in all, a much more stable sharing for both FQL agents, with fewer and smaller fluctuations, is achieved. Additionally, as depicted in Figure 10, the FQL approach is also able to share more cores over time, especially reducing the amount of time where no cores were shared for the RUBBoS VM (from 30% of the time to just 10%), enabling the option to accept more VMs and therefore taking better advantage of the available resources.

In addition to the learning process highlighted by the RUBBoS response times, Figure 11 shows the differences between the amount of cores shared by the capacity controller for the complete three days, for both RUBiS and RUBBoS, highlighting the FQL agents' learning over time. For RUBiS, as the performance was good all the time, there are no large variations among days. However, for RUBBoS we can see that the controller clearly shares fewer cores on the last day (mainly due to the peak during the last part of the day) based on the knowledge acquired during the previous days, leading to an improved performance as shown in Figure 8c. The aggregated number of cores shared by both FQL agents can be seen in Figure 11c, where the boxplot clearly highlights the smaller variation in the overall number of cores shared over time, as well as the higher average. Additionally, the evolution over time is depicted in Figure 12, where the smaller fluctuations over time are also clearly highlighted. Once again, there is statistical proof of both FQL agents sharing more cores than the CC6 approach, with Wilcoxon p-values of 1.98e-07 and 2.2e-16 for RUBiS and RUBBoS, respectively. Additionally, a 35% reduction in the standard deviation of the amount of cores shared over time was achieved by our FQL controller.
[Figure 9: RUBiS performance (3rd day) — average and 95-percentile response time (ms) and shared cores over time for (a) CC6 and (b) FQL.]

[Figure 10: Shared cores, RUBBoS (3 days) — shared cores over normalized time for CC6 and FQL.]

[Figure 11: Cores sharing, learning over time — shared cores over normalized time per day for (a) RUBiS (FQL) and (b) RUBBoS (FQL), and (c) aggregated cores shared for CC6 and FQL.]

[Figure 12: Cores sharing over time — aggregated shared cores over the three days for CC6 and FQL.]

As shown in the previous figures, the self-learning and self-adaptive FQL agents are able to share more cores, and in a more stable way over time, even without knowledge about the status of the infrastructure (i.e., the server). Consequently, the FQL-based capacity controller presents a more efficient sharing mechanism, where more cores are shared and, in turn, more VMs can be accepted. This is highlighted in Figures 13a and 13b, where a histogram of the amount of sudoku VMs concurrently running is depicted. The histogram for FQL is shifted to the right compared to the CC6 approach. Only the first bars are larger for CC6, which means there are fewer concurrently running sudoku VMs during longer periods of time. By contrast, all the bars for the higher amounts of concurrently running sudoku VMs (from 7 onwards) are larger for FQL, which means that most of the time there are more sudoku VMs concurrently running when using FQL. In addition, a summary boxplot is presented in Figure 13c, highlighting a 5% improvement in the number of concurrently running VMs within an already overbooked system. Thanks to that increment in the number of running VMs, the overall utilization also rises, as shown by Figure 14, where the average utilization along the 3-day experiments is depicted.

Finally, Figure 15 shows the average performance per sudoku VM type (Sudoku10, Sudoku20 and SudokuBE) over time. As can be seen, both Sudoku10 and Sudoku20 fulfill the 11 and 22 solved sudokus per second throughput requirement during the whole experiment. Regarding SudokuBE, the number of sudokus solved per second needs to be (on average) over 35 sudokus/second to meet the application deadlines. This is clearly achieved during the complete experiment, with average performance well above 35 sudokus per second. Moreover, there is statistical proof based on the Wilcoxon test that the average performance of SudokuBE is improved compared to CC6 (p-value: 0.00039). These results highlight the fact that even if more VMs are concurrently running, their performance is not affected at all, due to the larger amount of cores being shared between high and low QoS VMs over time, as well as due to fewer and less abrupt fluctuations in the transitions from sharing to not sharing cores, i.e., thanks to the more stable sharing policy.
[Figure 13: Sudoku VMs, number of VMs and performance — histograms of concurrently running sudoku VMs for (a) CC6 and (b) FQL, and (c) the number of concurrent sudoku VMs over time.]

[Figure 14: Overall utilization — average utilization (%) over the 3-day experiment for CC6 and FQL.]

[Figure 15: Sudoku performance (FQL) — average throughput (sudokus per second) over time for the Sudoku 10, Sudoku 20 and Sudoku BestEffort VM types.]

VI. RELATED WORK

Although resource overbooking has been previously studied [25], there are still no complete solutions that provide a holistic management of all the steps needed to perform safe overbooking, such as admission control, mitigating VM interference, QoS differentiation, fairness in performance degradation, etc. In the literature we find, among many others, the work proposed by Urgaonkar et al. [26], where a feedback control approach to safely overbook cluster resources is presented, guaranteeing application performance, but assuming that users are capable of providing information regarding the overbooking tolerance of their applications. As previously studied in [1] [8], this tolerance is strongly coupled to the underlying physical infrastructure, as well as to the collocated applications. There are other works also focusing on the admission control problem, such as the ones presented in [2], [3] and [1]. However, they do not present a complete solution, and once the applications are accepted, there is no mechanism to deal with possible resource shortages due to flash crowds, mispredictions or VM interference.

There are also works focusing on detecting, mitigating, and/or avoiding VM interference, such as [4], [27], [28], or [29]. However, their solutions are either based on detecting the overload/interference problem and migrating VMs to alleviate it, or on categorizing the VMs to avoid possible co-locations, instead of trying to adapt to the application behavior and thus reduce possible VM interference over time. Following that autonomic approach based on current application needs, as well as to mitigate the impact of unexpected situations, we presented in [8] a feedback controller approach that self-optimizes the overbooking pressure based on running VM behavior, in cooperation with an application performance steering approach that ensures graceful degradation during load spikes. However, once again the focus is on mitigating the problem first and trying to avoid it in the near future, instead of directly reducing the VM interference. Consequently, it has an impact on overall resource utilization.
Regarding VM co-location and interference problems, the work presented in [30] concluded that queuing delays, scheduling delays and load imbalances all significantly impact performance. An interesting approach is presented by Delimitrou et al. [31]. They propose a scheduler that classifies and allocates the incoming applications based on their profiled and expected interference with other already running VMs. Note that our work focuses on the VM isolation inside a single server, hence both approaches could complement each other. A similar idea is presented in [7], where VMs are pinned to specific cores in order to avoid VM interference. The pinning decisions are based on affinity values estimated by a fuzzy logic programming engine. Additionally, Nathuji et al. present the Q-Clouds [32] scheduling approach that deals with VM interference by allocating some extra resource capacity to the VMs needing it to meet their SLAs. However, unlike our work, they are not targeting an overbooked environment. Therefore they assume there is always enough unused capacity in the server that can be used for helping the VMs suffering from the interference problem. There are similar works presented by Lo et al. [21] and Leverich et al. [30], where a set of hierarchical controllers and sub-controllers are used to both raise utilization and also ensure that best effort applications will not hurt latency-sensitive applications. However, they take the corrective actions in a reactive way, based on the current situation, while our approach takes proactive actions and learns over time from past behavior for each allocated latency-sensitive (high QoS) application. Besides, unlike our approach, they always prioritize the performance of the latency-sensitive applications, regardless of how far/close they are from their KPIs.

With the advent of autonomic computing, exploiting control theory, machine learning and soft computing techniques to construct adaptive systems (software that monitors and modifies its own behavior to meet goals) has always been attractive in this area. Fuzzy logic, as a soft computing technique, has been exploited by researchers to inject human intuitive knowledge into controllers in different domains. The weakness of these controllers is that they cannot be sophisticated enough in a highly dynamic environment, as the human knowledge is not always available or accurate at design time. With this respect, in [33] the authors proposed an adaptive resource allocation controller for virtualized datacenters by combining a fuzzy controller with a clustering algorithm. Although using a clustering algorithm as an extension to the fuzzy controller allowed it to act without requiring a-priori knowledge, it is still inefficient in a highly dynamic environment, as it needs to run the clustering algorithm every time new data is received due to the changing workload. Reinforcement learning (RL) algorithms, amongst other approaches, have been popular in this field. This popularity lies in the fact that they do not need any knowledge or training data, which usually creates a bottleneck when designing adaptive systems dealing with a dynamic/time-variant environment. In fact, RL can generate the knowledge to make the system adaptive through learning over time. It is worth mentioning that RL has also been appealing for researchers in the artificial intelligence field aiming to improve the learning speed; for instance, we can point to deep reinforcement learning [34] as one of the most recent efforts to improve reinforcement learning algorithms. Reinforcement learning has been employed several times in problems similar to ours, related to the problem of dynamic resource allocation in IaaS cloud computing, such as [35] [36] [37] [38]. The significant difference between our approach and these approaches is that they do not consider the VM interference problem, and therefore decisions are based on the amount of physical resources needed (e.g., number of cores) and not on how to use/share them more efficiently. By contrast, our approach considers the VM interference problem and decides on the level of isolation needed among VMs that are sharing the same resources; thus VMs make dynamic decisions about their own resources to make the overall performance better, helping to increase the overall utilization. With this respect, Rao et al. [35] presented a reinforcement learning approach for virtual machine auto-configuration in cloud environments. In this work, the RL agent learns how to change the configuration of the VMs in terms of memory size, number of virtual CPUs, and scheduler credit (running on a host node) with respect to the application demands, to maximize overall performance. In [36], the authors proposed a parallel reinforcement learning approach for the problem of auto-scaling in IaaS cloud environments. Unlike our work, which is based on the capacity management within the server, they focus on learning agents that can learn optimal scaling policies in terms of adding, removing or maintaining the amount of virtual machines allocated to the applications, i.e., at a different level than the server. In [37] the authors proposed a fuzzy Q-learning algorithm for the problem of auto-scaling. The FQL agent learns how to increment or decrement the number of virtual machines as a scaling action to meet a performance requirement for a given user. RL approaches have also been employed in the context of dynamic virtual machine consolidation. In [38] a cooperative multi-agent reinforcement learning approach has been proposed for the problem of physical node management for dynamic virtual machine consolidation. In the proposed strategy, each learning agent, in cooperation with the other agents, learns when and which virtual machine must be selected to migrate, so that the energy-performance tradeoff optimization can be achieved inside the datacenter.

VII. CONCLUSIONS

VM performance can be affected by other co-located VMs, the problem being known as the noisy neighbor problem. This problem is especially aggravated in overbooked systems, as more VMs are competing for the same physical resources. In order to alleviate this problem, some isolation techniques may be used, such as cgroups or KVM core pinning. By using them, different isolation levels can be created even inside a single server, therefore enabling the possibility of providing QoS differentiation by pinning some VMs to specific cores exclusively, while others share a group of them.

These isolation levels may lead to low utilization ratios if high QoS VMs are overprovisioned. To solve this problem, the extra capacity allocated to the high QoS VMs can be used by other, lower QoS VMs, as long as the high QoS VMs are not affected. In this paper we present a Fuzzy Q-Learning based capacity controller that dynamically decides how much capacity each high QoS VM can share with low QoS VMs, i.e., how much the isolation level between them is reduced. Thanks to the mix between reactive and proactive actions, as well as to the learning over time, a more stable performance is achieved, where more low QoS VMs can be processed without affecting the high QoS VMs' performance, consequently making a more efficient use of the available resources and increasing the overall utilization.
We plan to extend the current work with the use of cgroups to ensure not only CPU isolation but also I/O and memory isolation. In addition, we plan to investigate techniques from fuzzy logic induction to be able to automatically configure the fuzzy sets without relying on previous knowledge about applications behavior.
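As a rough indication of what this cgroups-based extension could look like, the sketch below writes CPU, memory, and block I/O limits for one low QoS VM through the cgroup v1 filesystem. The mount point, group name, and limit values are assumptions made for illustration; they are not part of the evaluated system.

from pathlib import Path

CGROUP_ROOT = Path("/sys/fs/cgroup")     # typical cgroup v1 mount point (assumed)
GROUP_NAME = "low-qos-vm"                # hypothetical group for one low QoS VM

def write_limit(controller, filename, value):
    # Create the group under the given controller and write a single limit file.
    group_dir = CGROUP_ROOT / controller / GROUP_NAME
    group_dir.mkdir(parents=True, exist_ok=True)
    (group_dir / filename).write_text(str(value))

# Restrict the VM's processes to two cores on NUMA node 0.
write_limit("cpuset", "cpuset.cpus", "4-5")
write_limit("cpuset", "cpuset.mems", "0")

# Cap memory usage at 2 GiB.
write_limit("memory", "memory.limit_in_bytes", 2 * 1024 ** 3)

# Give the group a below-default share of block I/O bandwidth (default weight is 500).
write_limit("blkio", "blkio.weight", 250)

The VM's qemu process would then be attached by appending its PID to each group's tasks file; on a cgroup v2 host the same idea applies through the unified hierarchy and files such as cpu.max and memory.max.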
ACKNOWLEDGEMENTS

This collaborative research was initiated with an STSM granted by the IC1304 COST-ACROSS action. This work was partially supported by the Swedish Research Council (VR) for the project Cloud Control. We would like to thank Pooyan Jamshidi for insightful comments that have helped improve the paper.
REFERENCES

[1] L. Tomás and J. Tordsson, "An autonomic approach to risk-aware data center overbooking," IEEE Transactions on Cloud Computing, vol. 2, no. 3, pp. 292–305, 2014.
[2] D. Breitgand, Z. Dubitzky, A. Epstein, O. Feder, A. Glikson, I. Shapira, and G. Toffetti, "An adaptive utilization accelerator for virtualized environments," in IEEE Intl. Conference on Cloud Engineering (IC2E), 2014, pp. 165–174.
[3] R. Ghosh and V. K. Naik, "Biting Off Safely More Than You Can Chew: Predictive Analytics for Resource Over-Commit in IaaS Cloud," in 5th Intl. Conference on Cloud Computing, 2012, pp. 25–32.
[4] A. Beloglazov and R. Buyya, "Managing overloaded hosts for dynamic consolidation of virtual machines in cloud data centers under quality of service constraints," IEEE TPDS, vol. 24, no. 7, pp. 1366–1379, 2013.
[5] C. Reiss, A. Tumanov, G. R. Ganger, R. H. Katz, and M. A. Kozuch, "Heterogeneity and dynamicity of clouds at scale: Google trace analysis," in 3rd ACM Symposium on Cloud Computing (SoCC), 2012.
[6] H. Liu, "A measurement study of server utilization in public clouds," in IEEE 9th Intl. Conference on Dependable, Autonomic and Secure Computing (DASC), 2011, pp. 435–442.
[7] L. Tomás, C. Vázquez, J. Tordsson, and G. Moreno, "Reducing noisy-neighbor impact with a fuzzy affinity-aware scheduler," in International Conference on Cloud and Autonomic Computing (ICCAC), 2015, pp. 33–44.
[8] L. Tomás, C. Klein, J. Tordsson, and F. Hernández-Rodríguez, "The straw that broke the camel's back: safe cloud overbooking with application brownout," in International Conference on Cloud and Autonomic Computing (ICCAC), 2014.
[9] L. Tomás and J. Tordsson, "Cloud Service Differentiation in Overbooked Data Centers," in IEEE Conference on Utility and Cloud Computing (UCC), 2014.
[10] IBM Knowledge Centre, "Kernel Virtual Machine (KVM): Best practices for KVM," Web page at http://www-01.ibm.com/support/knowledgecenter/linuxonibm/liaat/liaatbestpractices_pdf.pdf?lang=en.
[11] P. Y. Glorennec and L. Jouffe, "Fuzzy q-learning," in Intl. Conference on Fuzzy Systems, vol. 2, 1997, pp. 659–662.
[12] L. Tomás and J. Tordsson, "Improving Cloud Infrastructure Utilization through Overbooking," in Cloud and Autonomic Computing Conference (CAC). ACM, 2013.
[13] L. Tomás and J. Tordsson, "Cloudy with a chance of load spikes: Admission control with fuzzy risk assessments," in Proc. of 6th IEEE/ACM Intl. Conference on Utility and Cloud Computing, 2013, pp. 155–162.
[14] K. J. Åström and R. M. Murray, Feedback Systems: An Introduction for Scientists and Engineers. Princeton University Press, 2008.
[15] C. Klein, M. Maggio, K.-E. Årzén, and F. Hernández-Rodríguez, "Brownout: Building more robust cloud applications," in 36th Intl. Conference on Software Engineering, 2014.
[16] P. Y. Glorennec, "Reinforcement learning: an overview," in European Sym. on Intelligent Techniques, 2000.
[17] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. MIT Press, 1998.
[18] M. Tokic and G. Palm, "Value-difference based exploration: adaptive control between epsilon-greedy and softmax," in KI 2011: Advances in Artificial Intelligence. Springer, 2011, pp. 335–346.
[19] RUBiS: Rice University Bidding System, Web page at http://rubis.ow2.org/, Visited 2013-11-4.
[20] RUBBoS: Bulletin Board Benchmark, Web page at http://jmob.ow2.org/rubbos.html, Visited 2013-11-4.
[21] D. Lo, L. Cheng, R. Govindaraju, P. Ranganathan, and C. Kozyrakis, "Heracles: Improving resource efficiency at scale," in Annual International Symposium on Computer Architecture (ISCA), 2015, pp. 450–462.
[22] Page view statistics for Wikimedia projects, Web page at http://dumps.wikimedia.org/other/pagecounts-raw/, Visited 2013-03-13.
[23] M. J. Er and C. Deng, "Online tuning of fuzzy inference systems using dynamic fuzzy q-learning," IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol. 34, no. 3, pp. 1478–1489, 2004.
[24] F. Wilcoxon, "Individual Comparisons by Ranking Methods," Biometrics Bulletin, vol. 1, no. 6, pp. 80–83, 1945.
[25] R. Householder, S. Arnold, and R. Green, "On cloud-based oversubscription," International Journal of Engineering Trends and Technology (IJETT), vol. 8, no. 8, pp. 425–431, 2014.
[26] B. Urgaonkar, P. Shenoy, and T. Roscoe, "Resource overbooking and application profiling in shared hosting platforms," in OSDI, 2002, pp. 239–254.
[27] N. Bobroff, A. Kochut, and K. Beaty, "Dynamic placement of virtual machines for managing sla violations," in 10th IFIP/IEEE Intl. Symposium on Integrated Network Management (IM), 2007, pp. 119–128.
[28] T. Wood, P. Shenoy, A. Venkataramani, and M. Yousif, "Sandpiper: Black-box and gray-box resource management for virtual machines," Computer Networks, vol. 53, no. 17, pp. 2923–2938, 2009.
[29] X. Zhang, E. Tune, R. Hagmann, R. Jnagal, V. Gokhale, and J. Wilkes, "CPI2: CPU performance isolation for shared compute clusters," in SIGOPS European Conference on Computer Systems (EuroSys), 2013, pp. 379–391.
[30] J. Leverich and C. Kozyrakis, "Reconciling high server utilization and sub-millisecond quality-of-service," in 9th European Conference on Computer Systems (EuroSys), 2014.
[31] C. Delimitrou and C. Kozyrakis, "Quasar: Resource-Efficient and QoS-Aware Cluster Management," in Intl. Conference on Architectural Support for Programming Languages and Operating Systems, 2014.
[32] R. Nathuji, A. Kansal, and A. Ghaffarkhah, "Q-clouds: Managing performance interference effects for qos-aware clouds," in EuroSys, 2010.
[33] J. Xu, M. Zhao, J. Fortes, R. Carpenter, and M. Yousif, "On the use of fuzzy modeling in virtualized data center management," in Fourth International Conference on Autonomic Computing (ICAC'07), 2007, pp. 25–25.
[34] V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski et al., "Human-level control through deep reinforcement learning," Nature, vol. 518, no. 7540, pp. 529–533, 2015.
[35] J. Rao, X. Bu, C.-Z. Xu, L. Wang, and G. Yin, "Vconf: a reinforcement learning approach to virtual machines auto-configuration," in Proceedings of the 6th International Conference on Autonomic Computing. ACM, 2009, pp. 137–146.
[36] E. Barrett, E. Howley, and J. Duggan, "Applying reinforcement learning towards automating resource allocation and application scalability in the cloud," Concurrency and Computation: Practice and Experience, vol. 25, no. 12, pp. 1656–1674, 2013.
[37] P. Jamshidi, A. M. Sharifloo, C. Pahl, A. Metzger, and G. Estrada, "Self-learning cloud controllers: Fuzzy q-learning for knowledge evolution," in International Conference on Cloud and Autonomic Computing (ICCAC), 2015, pp. 208–211.
[38] S. S. Masoumzadeh and H. Hlavacs, "A cooperative multi agent learning approach to manage physical host nodes for dynamic consolidation of virtual machines," in IEEE Symposium on Network Cloud Computing and Applications, 2015, pp. 43–50.