Luis Tomás
Umeå University
All content following this page was uploaded by Seyed Saeid Masoumzadeh on 09 January 2018.
Abstract—Interference between co-located VMs may lead to performance fluctuations and degradation, especially in overbooked datacenters. To limit this problem, VMs' access to physical resources needs to be controlled to ensure a certain degree of isolation among them. However, the mapping between virtual and physical resources must be performed in a dynamic way so that it can be adapted to changing application requirements, as well as to the different sets of co-located VMs. To address this problem we propose a twofold approach: (1) a Quality of Service (QoS) scheme that provides different isolation levels for VMs with different QoS requirements, and (2) a self-adaptive fuzzy Q-learning capacity controller that proactively readjusts the isolation degree based on application performance. Our evaluation based on real cloud applications and workloads demonstrates that the efficient, adaptive mapping between VMs and physical resources reduces the interference between VMs, enables the co-location of more VMs, increases overall utilization, and ensures the performance of critical applications while providing more resources to the low QoS applications.

Keywords—Cloud Computing; Fuzzy Q-Learning; Overbooking; Pinning; QoS; VM interference

I. INTRODUCTION

Resource overbooking [1], [2], [3], [4] is a well-known technique commonly used to mitigate the low resource utilization ratios reported by large datacenters, such as Google [5] or Amazon EC2 [6]. However, this utilization increment also increases the possibility of VMs interfering with each other, as more VMs compete for the same resources, which may lead to delays when accessing them. This well-known effect, the noisy neighbor problem [7], may heavily impact application performance.

On top of this problem, not all VMs require the same Quality of Service (QoS), meaning they do not need to maintain the same throughput or response times over time. Furthermore, not all VMs are equally affected by this interference [8]. For instance, a deadline-constrained application (e.g., a computationally intensive task) does not need to maintain an average performance at all times as long as it completes the task in time. By contrast, an interactive application, such as an e-commerce system, may require a certain minimum performance at all times. In order to treat VMs with different QoS requirements differently, we extended our previously developed overbooking framework [1] to include QoS differentiation [9] (high and low QoS levels). This QoS differentiation creates different isolation levels between VMs by pinning them to specific cores inside the servers (note that core pinning is a mechanism provided by KVM [10]), thus limiting the impact they may have on others.

In this work we propose a self-adaptive, performance-aware capacity controller able to adapt isolation levels between high and low QoS VMs by changing the mapping between virtual cpus (vcpus) and physical cpus (pcpus), i.e., the vcpu-to-pcpu pinning. This dynamic mapping is performed based on the performance of the applications running on the VMs. To this end, we exploit a fuzzy reinforcement learning algorithm known as Fuzzy Q-Learning (FQL) [11], a combination of Q-learning (a popular reinforcement learning algorithm) and fuzzy logic. This technique allows us to deal with the high complexity of cloud systems, as well as with their uncertainties, especially regarding unknown future application needs. On the one hand, Q-learning builds the central core of our capacity controller. It is a knowledge-free online learning process, which learns over time (by interacting with the application and getting feedback) how to map input states dynamically and proactively to output decisions for the pinning actuator, in terms of the number of cores that may be shared between high and low QoS VMs. Its main objective is to increase the shared capacity inside the server (thus increasing utilization) while at the same time maintaining the performance of the applications running in high QoS mode. On the other hand, thanks to the combination of a fuzzy system with Q-learning, a more powerful learning strategy is obtained: it is capable of handling the curse of dimensionality present in many real-world problems (such as dynamic capacity allocation within cloud servers), and it facilitates merging prior/expert knowledge into the problem, which speeds up the learning process.

In our capacity controller model, an FQL software agent is associated with each high QoS VM. Each FQL agent makes dynamic decisions about its VM's resource isolation needs, even though it has no information about the status of the resources at the scale of the physical host, nor knowledge about other co-located VMs. The capacity controller then applies the isolation actions based on the FQL agents' decisions.

The experimental results show that our proposed fuzzy Q-learning capacity controller efficiently distributes the available capacity between the different VMs. This in turn makes it possible to accept a larger number of applications, thus increasing overall utilization, without hurting either low or high QoS applications' performance.

II. BACKGROUND AND MOTIVATION

Our first approach to address low utilization ratios at cloud datacenters was a framework that performs overbooking decisions based on long-term risk estimations [12], i.e., the
Algorithm 1: Application Core Pinning

Configuration parameters:
    VMs: list of running VMs
    H_QoS: list of high QoS pcpus
    L_QoS: list of low QoS pcpus, including empty pcpus in case of no low QoS VMs (sorted by aggregated resource usage: CPU*Mem*IO)

 1: num ← number of vcpus of new_vm
 2: if new_vm requires HighQoS then
 3:     cores ← first num pcpus of weighted list of L_QoS
 4:     for each core ∈ cores do
 5:         Pin vcpus of new_vm to core
 6:         H_QoS ← H_QoS + core
 7:         L_QoS ← L_QoS − core
 8:     end for
 9: end if
10: for each vm with LowQoS req. ∈ VMs do
11:     // Repin all low QoS VMs
12:     Pin vm to all pcpus ∈ L_QoS
13: end for

Figure 1: Virtual to physical core mapping using basic QoS differentiation (left) and advanced QoS differentiation by using a re-mapping controller (right). [The figure distinguishes high QoS cores from low QoS cores for an example allocation: 1 high QoS VM (4 vcpus, pinned to 4 pcpus) and 10 low QoS VMs (2 vcpus each, sharing 12 pcpus); the right panel adds a per-server pinning controller.]
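Algorithm 1 can be sketched in a few lines of Python. This is an illustrative reconstruction, not the authors' code: the weighted ordering of L_QoS and the actual pinning calls (e.g., through libvirt) are abstracted behind plain lists, and all names are invented for the example.

```python
def place_vm(new_vm, vms, h_qos, l_qos):
    """Sketch of Algorithm 1: decide pcpu pinning for a newly placed VM.

    new_vm/vms: dicts with 'name', 'qos' ('high'/'low') and 'vcpus'.
    h_qos: pcpus reserved for high QoS VMs.
    l_qos: shareable pcpus, assumed pre-sorted by aggregated
           resource usage (CPU*Mem*IO), least loaded first.
    Returns a {vm_name: [pcpus]} map standing in for the pinning actuator.
    """
    pinning = {}
    if new_vm['qos'] == 'high':
        num = new_vm['vcpus']            # line 1
        cores = l_qos[:num]              # line 3: least-loaded shared pcpus
        for core in cores:               # lines 4-8
            h_qos.append(core)           # promote the core to high QoS
            l_qos.remove(core)           # remove it from the shared pool
        pinning[new_vm['name']] = cores  # line 5: pin the VM's vcpus there
    for vm in vms:                       # lines 10-13: repin all low QoS VMs
        if vm['qos'] == 'low':
            pinning[vm['name']] = list(l_qos)
    return pinning
```

For instance, a 4-vcpu high QoS VM arriving at a 16-pcpu server takes the four least-loaded shared pcpus for itself, and every low QoS VM is re-pinned to the remaining twelve.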
possibility of ending up in an overload situation leading to performance degradation. This overbooking framework was extended with an admission control that takes decisions based on a fuzzy risk estimation to better account for uncertainty about future applications' capacity needs [13], combined with a set of proportional-integral-derivative (PID) controllers [14] that adjust the acceptable risk levels based on the deviation of the current utilization from the target [1].

Applications' tolerance to overbooking differs, not only among applications but also over time (depending on their workload). This makes it difficult to assess the possible impact of potential overload situations. To overcome these difficulties, we first worked on a mitigation and recovery method for unexpected situations, presented in [8], where on the one hand the overbooking pressure is adjusted based on the behavior of the currently running VMs, and on the other hand a short-term mitigation strategy based on Brownout [15] is used, ensuring graceful degradation (of some VMs) during load spikes. This approach may reduce overall datacenter utilization (by reducing the overbooking pressure) due to some VMs not being able to deal with the current overbooking level. However, the source of the problem may not be too high a utilization but rather VM interference. Therefore, in order to mitigate this VM interference problem, also known as the noisy neighbor problem [7], we studied the use of pinning mechanisms as a way to both provide isolation between VMs inside the same physical server and perform QoS differentiation between VMs based on this isolation capacity [9]. Note that KVM core pinning is proposed in [10] as a feature needed to ensure reliable and repeatable performance.

In this work we use the KVM core pinning functionality to offer two different QoS levels (or isolation levels) by performing virtual cpu (vcpu) to physical cpu (pcpu) pinning using Algorithm 1, based on the QoS schema depicted in Figure 1 (left):

• High QoS: application vcpus are pinned to pcpus and get exclusive access to them – no other VM's vcpus can be pinned to these pcpus.

• Low QoS: application vcpus are not pinned to any specific pcpus and can use any pcpus except the ones booked for high QoS applications.

This QoS classification, however, impacts the overall utilization when high QoS VMs are overprovisioned. To deal with this problem, a controller based on the high QoS applications' performance incrementally decreases their isolation level by sharing some of their cores, as presented in Figure 1 (right) [9]. More specifically, at every control interval in which a high QoS VM is behaving better than needed, an extra physical core allocated to it is shared with the low QoS VMs. This process continues as long as the target performance of the high QoS VM is maintained. Otherwise, the maximum isolation level is established again, recovering the complete isolation to avoid any possible performance degradation. As application needs are unknown, increasing the isolation level step by step instead of all at once when performance degradation is observed may impact the application during a longer period of time, which in turn may lead to a longer time to recover the desired/required performance. Note that this is performed at a per (high QoS) VM granularity, i.e., there is a capacity controller for each high QoS VM.

Although this controller presents good results regarding performance isolation of the high QoS applications, it is a very conservative approach that only works in a reactive manner. Therefore, it may have a bigger impact on the overall utilization ratios than desired, especially with highly fluctuating workloads. Furthermore, it may also affect the low QoS applications due to the abrupt (and more frequent) reductions in the number of cores that they are entitled to use, as high QoS applications recover the isolation on all the cores at once.

To address this problem we propose a self-adaptive controller exploiting an online machine learning approach to efficiently learn decision policies in terms of isolation level (i.e., the number of cores to share) with regard to the application's behavior. Our approach is based on a particular reinforcement learning technique called Fuzzy Q-Learning (FQL) [16], a combination of Q-Learning (QL) and fuzzy logic, detailed in the next section. In our proposed controller the QL algorithm, as a decision-making engine, learns how to map the input states to the desired output decisions (here, the number of shared cores) to maximize the shared capacity over time, helping to increase utilization while maintaining
the performance of the application in terms of response time. As QL is a table-driven learning algorithm, using fuzzy logic helps it deal with large or continuous state spaces, handling high dimensionality. In addition, using fuzzy logic enables knowledge encapsulation into the learning table, i.e., merging prior or expert knowledge into the problem.

Figure 2: FQL Structure. [The figure shows an input state x entering the Fuzzification Layer (membership functions L1…Lm), followed by the Rule Evaluation Layer holding the q-values q1…qm and local actions o1…om, from which the inferred action a and the global Q-value are derived.]
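Before the formal model in the next section, the fuzzy-label idea can be made concrete with a small sketch. The triangular shapes and breakpoints below are invented for illustration (the fuzzy sets actually used are shown later in Figure 5):

```python
def triangular(a, b, c):
    """Triangular membership function rising from a to a peak at b
    and falling back to zero at c."""
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)
    return mu

# Hypothetical fuzzy labels over the response time (ms).
labels = {
    'low':    triangular(-1, 0, 200),
    'medium': triangular(100, 250, 400),
    'high':   triangular(300, 500, 1000),
}

# A response time of 150 ms belongs partly to 'low' and partly to
# 'medium', so several fuzzy rules fire at once, each with its own
# degree of truth.
degrees = {name: mu(150.0) for name, mu in labels.items()}
# degrees == {'low': 0.25, 'medium': 0.333..., 'high': 0.0}
```

This is precisely what lets a table-driven learner cover a continuous state space with a handful of rows: each row corresponds to a label combination rather than to a raw state value.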
III. FUZZY Q-LEARNING MODEL

Reinforcement learning (RL) [17] comprises a set of machine learning methods designed to address a particular learning task in which an agent is placed in an unknown environment and is allowed to take actions/make decisions which can change its state in the environment and bring it delayed numerical rewards. The goal of the agent is to learn a policy that tells it what action/decision to take/make in order to maximize the cumulative reward over time. Q-learning (QL) [17] is a popular RL algorithm that represents the learning knowledge by means of a Q-table, whose Q-values are defined for each state-action pair, determining the expected cumulative reward that can be received by taking that action in that state and following an optimal policy afterwards. Note that, after the learning process, an optimal policy can be constructed by simply selecting the action with the highest value in each state. The RL problem is modeled as follows. The state vector x ∈ S is composed of values of representative variables capturing the surrounding environment, with S = {s_1, ..., s_n} the set of possible states the agent can perceive from the environment. The set of actions A = {a_1, ..., a_l} represents the decisions that the agent can make based on the state vector x. Based on x and the corresponding Q-values, the most suitable action a ∈ A is selected and executed. After the execution of a in x, the agent receives an immediate scalar reward r, and the corresponding Q-value Q(x, a) is updated by the temporal-difference (TD) [17] method according to the following rule (the subscript t is added to highlight the time dependency in the update equation):

Q_{t+1}(x_t, a_t) = Q_t(x_t, a_t) + β[r_{t+1} + γ max_{a'} Q_t(x_{t+1}, a') − Q_t(x_t, a_t)].    (1)

Here r_{t+1} is the observed reward for selecting action a_t when observing state vector x_t, and β is the learning rate with 0 ≤ β ≤ 1, where high values result in quick learning and adaptation, while low values prevent too quick changes due to rare outliers. The term max_{a'} Q_t(x_{t+1}, a') denotes the estimated optimal value of a future state, and the discount factor γ, with 0 ≤ γ ≤ 1, determines whether the optimization should only consider current rewards (γ = 0) or strive for more long-term rewards (γ = 1).

The most significant drawback of Q-learning is that it cannot be used when the state space is large, a situation found in many real-world problems, since it requires a large amount of memory for the Q-table (to keep track of the state-action quality values). Apart from that, even if the system provides such a large memory, the learning agent needs a lot of trials and episodes to learn the desired behavior, which increases the cost of learning.

Fuzzy Q-learning (FQL) [11] is a fuzzy extension of the Q-learning algorithm that is able to overcome this problem. In addition, it allows us to encapsulate expert knowledge into the learning table, which speeds up the learning process. In FQL, the decision-making part is represented by a Fuzzy Inference System (FIS) that takes continuous or large discrete states as input. The idea of the FQL algorithm is to use a so-called q-table as a compact version of the Q-table to represent the learning knowledge.

In FQL, the FIS is represented by a set of rules J, with a rule j ∈ J defined as

IF (x_1 is L_j^1) AND ... AND (x_n is L_j^n) AND ... AND (x_N is L_j^N) THEN a = o_j with q(L_j, o_j).    (2)

L_j^n is a fuzzy label of the input variable x_n of the state vector x = [x_1, ..., x_n, ..., x_N] participating in the j-th rule, and o_j = [o_j^1, ..., o_j^k, ..., o_j^K] is the output action set of the j-th rule. The vector L_j = [L_j^1, ..., L_j^n, ..., L_j^N] is called the modal vector corresponding to rule j, and q(L_j, o_j^k) is the q-value function of state L_j and action o_j^k of the j-th rule.

In the Fuzzification Layer of the FIS (see Figure 2) each membership function (µ_L) maps a state component into the degree of membership to a fuzzy set corresponding to a given label. Let J_x denote the set of all rules in the Rule Evaluation Layer of the FIS (see Figure 2). The membership of a state vector x, or the degree of truth in fuzzy logic terminology (represented by α), with respect to the j-th rule, j ∈ J_x, is defined as the product of the corresponding membership functions of the rule:

α_j(x) = ∏_{n=1}^{N} µ_{L_j^n}(x_n).    (3)

In the FQL algorithm we have a two-level action selection. In the first level (we call it local action selection) a set of actions is chosen according to an Exploration/Exploitation (EE) policy. An EE policy allows the agent to explore untried actions to gain more experience, combining this with exploitation of the already known successful actions to ensure high long-term reward. The ε-greedy method [18] is used as the EE policy for our experimental studies. With ε-greedy, at each time step the agent selects greedily one of the learned optimal actions with respect to the q-table with a fixed probability ε, where ε is typically chosen close to and below 1, and a random action otherwise:

with Prob. ε:      ∀ j ∈ J_x: o_j^l = argmax_{k∈K} q(L_j, o_j^k)
with Prob. 1 − ε:  ∀ j ∈ J_x: o_j^l = random_{k∈K}(o_j^k)    (4)
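The tabular update (1) and the ε-greedy selection of Eq. (4) can be sketched as follows (a toy two-state, two-action problem; the values of β, γ and ε are illustrative only, and the state/action names are invented):

```python
import random

def td_update(Q, s, a, r, s_next, beta=0.5, gamma=0.9):
    """Q-learning temporal-difference update, Eq. (1)."""
    best_next = max(Q[s_next].values())
    Q[s][a] += beta * (r + gamma * best_next - Q[s][a])

def epsilon_greedy(Q, s, epsilon=0.9):
    """Exploit the best known action with probability epsilon,
    otherwise explore a random one (cf. Eq. (4))."""
    if random.random() < epsilon:
        return max(Q[s], key=Q[s].get)
    return random.choice(list(Q[s]))

# All Q-values start at zero; one rewarded step changes Q('ok','share').
Q = {s: {a: 0.0 for a in ('share', 'isolate')} for s in ('ok', 'slow')}
td_update(Q, 'ok', 'share', r=1.0, s_next='ok')
# Q['ok']['share'] == 0.5 * (1.0 + 0.9 * 0.0 - 0.0) == 0.5
```

Note how β scales the correction and γ would propagate value from the next state once its Q-values become non-zero.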
where o_j^l is the selected local action of the j-th rule. In the second level of the action selection (we call it inferred action selection) a nominated action for the input vector x is selected from the set of local actions as follows:

a = max_{j∈J_x} α_j(x) o_j^l.    (5)

The global Q-value of the state x_t and the selected action a is obtained by weighting the q-values of the activated rules by their degrees of truth:

Q(x_t, a) = Σ_{j∈J_{x_t}} α_j(x_t) × q_t(L_j, o_j^l).    (6)

After taking action a the system goes to the next state x_{t+1} and observes the reward r_{t+1}. The state value for the input vector x_{t+1} is calculated as follows:

V(x_{t+1}) = Σ_{j∈J_{x_{t+1}}} α_j(x_{t+1}) × max_k q_t(L_j, o_j^k).    (7)

Based on equations (6) and (7), the temporal-difference (TD) error is calculated as follows:

ΔQ = r_{t+1} + γ V(x_{t+1}) − Q(x_t, a),    (8)

with γ again being the discount factor. Finally the q-function is updated for each activated rule j ∈ J_{x_t} according to the rule

q_{t+1}(L_j, o_j^l) = q_t(L_j, o_j^l) + β α_j(x_t) ΔQ,    (9)

where β is the learning rate, as in Equation (1) for QL.

Figure 3: Architecture overview. [The figure shows a server X hosting a high QoS VM with an attached FQL agent; (1) the agent sends its isolation decision to the Capacity Controller, which (2) applies the corresponding pinning actions.]

IV. FUZZY Q-LEARNING CAPACITY CONTROLLER

An efficient capacity sharing between VMs co-located on the same server is achieved if enough capacity is provided to all the VMs (based on their QoS level) while, at the same time, a high utilization ratio is achieved. On the one hand, utilization is increased by maximizing the number of cores that high QoS VMs share with low QoS VMs, reducing the former's isolation level. On the other hand, the isolation level needs to be high enough that high QoS performance is not affected by the capacity shared with lower QoS VMs. Consequently, this trade-off needs to be discovered and updated over time, as it will differ not only for different VM mixtures, but also for different workload patterns.

It is also important to make isolation level decisions durable, i.e., to avoid constant and abrupt changes in the number of cores being shared by the high QoS VMs, as this will impact all the co-located VMs. On the one hand, it will impact the low QoS VMs due to the sudden fluctuations in the available capacity they are entitled to use. On the other hand, it will impact the other co-located high QoS VMs since, even though high QoS VMs do not share cores among themselves, the low QoS VMs will have a higher need for the cores being shared by the rest of the high QoS applications, therefore impacting their performance, which in turn will lead to even more fluctuations in the number of cores they share.

Due to the benefits of Fuzzy Q-Learning, we present an approach where the isolation level of the high QoS VMs, i.e., the number of cores they share with the low QoS VMs, is managed by a Capacity Controller whose decisions are based on the FQL agents at each high QoS VM, as depicted in Figure 3.

Owing to their learning capacity, the FQL agents can predict the applications' behavior once they have gathered enough knowledge about them, and thanks to that they can proactively increase/decrease their exclusive access to resources before performance degradation happens. What is more, as an agent tries to reduce the isolation level as much as possible, it makes it possible to share more resources with the (low QoS) co-located VMs, as it is aware of the coming states of the application behavior. For example, imagine a situation where an 8-core high QoS VM currently needs more than 2 but less than 3 cores in isolation. In this situation, the approach presented in [9] would increase the number of shared cores one by one (i.e., 0, 1, 2, 3, 4, 5, and 6) up to the point where the number of shared cores is too high and the application has problems keeping up its performance. Then, the complete isolation would be recovered and the process would start again (from 0 to 6 shared cores). This behavior may impact the low QoS applications due to the abrupt changes in the number of cores they are allowed to use, and it would also reduce the chances for higher utilization and/or accepting more VMs. By contrast, thanks to its learning behavior, the FQL agent realizes (after a few iterations) that sharing 5 cores is the right option. Now imagine that the application running on the VM suddenly demands more resources. In this situation the controller presented in [9] would again recover the complete isolation, while with the FQL approach the number of isolated cores proactively changes to the new required isolation level, even before the application needs it. These are just simple examples, but more complex application behavior patterns can be discovered and learned by the FQL agents, leading to better resource sharing over time.

A. FQL components

The FQL agents are associated with the VMs running the high QoS applications. The components of an FQL agent are described in the following:

• State: The combination of the current response time and the current isolation level, in terms of the number of shared cores, describes each state inside a high QoS VM. These two characteristics carry enough information to represent a predictable application behavior.
Therefore, the input state vector to the FQL is defined as

x_t = [rt_t, nc_t],    (10)

where rt_t and nc_t denote the current response time and the current number of isolated cores, respectively.

• Action: Each element of the action set o denotes the number of cores needed in isolation for transiting from the current state to the next state:

o = [nc_1, nc_2, ..., nc_i].    (11)

Figure 4: Workloads for RUBiS VMs. [The figure plots the usage (user requests per second, 0–700) over time (0–1400 minutes) for the RUBBoS and RUBiS traces.]
• Reward Function: The design of a reward function is key to building a reinforcement learning system. It maps each perceived state-action pair of the environment to a single number, a reward, indicating the intrinsic desirability of that state-action. The goal of reinforcement learning is to maximize the cumulative reward over time. This is obtained if the learning agent seeks actions that result in the highest q-value. In our system, the reinforcement learning tries to optimize the trade-off between sharing capacity and performance over time, so the reward function returns a value determining a success rate, here maximizing the number of shared cores and minimizing the response time:

r_{t+1} = (1 − nc_t/tc) − (rt_t/tr),    (12)

where tc is the total number of physical cores assigned to a high QoS VM, and tr is the target response time. If the high QoS VM controller shares its cores as much as possible (nc → 0) while keeping the response time as far below the target response time as possible (rt → 0), it receives more reward. If the reward is close to zero, this implies that the action is not effective, and a negative reward is considered a punishment for the learning agent.

B. Let reactive and proactive be friends

As FQL agents learn by trial-and-error interaction with the environment, performance is likely to degrade due to either random actions in exploration mode or the lack of enough learning knowledge in the q-table during the learning process, i.e., under new situations never experienced before. The learning agents make decisions in terms of the number of isolated cores for the high QoS VMs, where a response time higher than the target threshold is critical in terms of SLA violations.

Our proposed controller can guarantee the performance of the high QoS VMs after the learning process, but during learning or under completely new situations there is no guarantee. Although combining Q-learning with fuzzy logic can speed up the learning process (as stated in the previous section), such performance degradation may not be acceptable for the high QoS users even in the short term. To address this problem, our proposed learning controller calls a reactive controller to act on its behalf once the response time exceeds the set target threshold. In this situation the reactive controller establishes the maximum isolation level for the VM again. In addition, the learning controller receives a punishment proportional to the amount of exceedance (see Eq. (12) where rt > tr) once it calls the reactive controller. Therefore, the learning controller also learns how to act so as to decrease the number of such calls over time.

V. EXPERIMENTS

The performance of our proposed capacity allocation controller for QoS assurance and differentiation based on FQL agents is evaluated next. The tests are conducted on two machines, one hosting applications and one generating the workload, connected through a 1 Gb/s link. The first machine is a server with a total of 32 cores (AMD Opteron 6272 at 2.1 GHz) and 56 GB of memory. KVM was used as the hypervisor and each application was deployed inside a VM. The second machine is a 4-core (Intel Core i5 processor at 3.4 GHz) desktop with 16 GB of memory. Note that, as the capacity allocation controller works at server level, this is straightforwardly extensible to any number of servers.

A. Applications and Workload

We have emulated a representative cloud workload by mixing different types of VMs that can be grouped in two main classes (similarly to the boulders and sand scheme reported by Google in [5]). The first VM class is represented by (usually big) long-living VMs, running interactive applications that usually present some seasonality pattern in their use. For this application class we have used the RUBiS [19] and RUBBoS [20] cloud benchmarks, an auction website benchmark and a bulletin board benchmark modeled after eBay and Slashdot, respectively. For the experiments we consider a fixed set of them in each run: 2 VMs requesting half of the server capacity (8 vcpus and 14 GB memory each). Note that, as they are interactive applications, they are submitted with the high QoS requirement, since they can be more affected by other co-located VMs than the deadline-oriented applications, similarly to the scenario presented in [5], [21]. As regards the workload, a number of queries have been generated using information extracted from different days of the Wikipedia traces [22], time-shifting one of them 12 hours, as shown in Figure 4, creating different trends, peaks, and daily usage patterns. The client queries were generated using the httpmon tool¹.

¹ https://github.com/cloud-control/httpmon
The second VM class consists of different types of relatively short-living, non-interactive VMs. Firstly, in order to increase the uncertainty in the system, and thus create a more realistic aggregated workload, we created different VMs that run shell scripts that consume random amounts of CPU and […]

On the other hand, to have a measurable performance for this class of VMs, we have also created different types of VMs that continuously solve random sudokus² and report their corresponding throughput achieved over time. Additionally, these sudoku VMs can be differentiated by their two main behaviors:

• […] a given deadline. This type of VM behaves in a best-effort manner, trying to solve as many sudokus as possible and therefore using as much CPU time as available.

• SudokuT: keeps a certain throughput (number of solved sudokus) over time. If during one time period the target is not achieved, the sudokus queue up and need to be solved in the next period (i.e., open loop). We mix two different SudokuT types, with a target of 11 and 22 solved sudokus per second, respectively.

Figure 5: Fuzzy sets. (a) For the response time of the RUBBoS VMs, (b) for the response time of the RUBiS VMs, (c) for the number of cores of both RUBBoS and RUBiS VMs.
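The open-loop behavior of the SudokuT VMs can be illustrated with a short simulation (the per-period capacities below are invented for the example): whenever a period's solving capacity falls short of the target plus the carried-over backlog, the unsolved sudokus queue up for the next period.

```python
def sudoku_t(target, capacities):
    """Open-loop throughput target: unsolved sudokus carry over."""
    backlog, solved = 0, []
    for cap in capacities:
        demand = target + backlog  # new sudokus plus the queued ones
        done = min(cap, demand)
        backlog = demand - done    # leftover work for the next period
        solved.append(done)
    return solved, backlog

# Target of 11 sudokus per period; a capacity dip in the second
# period creates a backlog that is slowly worked off afterwards.
solved, backlog = sudoku_t(11, [12, 8, 12, 12])
# solved == [11, 8, 12, 12], with one sudoku still queued
```

This is what makes SudokuT throughput a useful low QoS metric: a temporary core shortage shows up as a backlog that must be recoverable once cores are shared again.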
(a) CC6    (b) FQL    (c) Shared Cores Summary (1st day)

Figure 6: RUBiS performance (1st day). [Panels (a) and (b) plot the response time in ms (0–600) and the number of shared cores (0–8) over time (0–1400 minutes) for CC6 and FQL, respectively; panel (c) summarizes the fraction of time (0–1) each number of shared cores was used.]
On the other hand, we evaluate the overall utilization and the number of shared cores over time, as well as the number of sudoku VMs concurrently running. The throughput achieved by the sudoku VMs is also measured to ensure that the low QoS applications' performance is acceptable, i.e., that they keep a specific throughput over time and/or meet their deadlines. Finally, in order to conclude whether there is evidence of a statistically significant improvement in any of the previously described metrics, we use the Wilcoxon statistical test [24], a non-parametric statistical hypothesis test that compares two related samples to assess whether their population mean ranks differ.

Figure 6 shows the performance comparison between CC6 and FQL with regard to the RUBiS application. As Figures 6a and 6b depict, there is no remarkable difference in response time, since the RUBiS VM presents a linear performance with respect to the number of requests, which makes it easier to keep the response times within the desired limits. However, the trend for FQL is slightly flatter due to sharing a more stable number of cores over time. In fact, the Wilcoxon statistical test concludes that the reduction in response time (both average and 95th percentile) is statistically significant (p-value < 5.0e-07). Figure 6c shows that the amount of cores shared over time is quite similar for both approaches (FQL shares slightly more cores), with all cores being shared around half of the time in both cases. However, as highlighted by Figures 6a and 6b, the number of cores shared over time fluctuates quickly and widely for CC6, while it remains more stable for our approach.

Figures 7 and 8 show the same comparison between CC6 and FQL but for the RUBBoS application, which presents a less linear behavior with respect to the number of requests per second. In this case, as the performance fluctuates more, we show its evolution over three days, to also highlight the learning of the FQL approach. Although the performance is reasonably good on the first day for both approaches (Figure 7a and Figure 8a), with slight improvements for FQL, the performance improves day by day for the FQL approach. In Figures 7b and 8b the differences […] the RUBBoS workload peak. Not only is the overall response time reduced, but also the fluctuations, both in number and size. These differences are even more remarkable on the third day (Figure 7c and Figure 8c), where, thanks to the learning, the FQL approach remarkably reduces the response times and presents a more constant behavior compared to CC6. Again, the response time reduction is statistically significant, not only for the third day but also for the complete experiment (Wilcoxon p-value: 2.2e-16). As in Figure 6, a much more stable number of shared cores over time can be clearly seen for the FQL approach.

Regarding the learning process over time, Figure 8 clearly highlights the learning of the FQL-based controller, which takes advantage of the gathered knowledge about the RUBBoS daily patterns, making response time fluctuations much lower over time. It must also be noted that, due to the transitions between proactive and reactive mechanisms, the performance is acceptable during all three days. The FQL calls the reactive controller to act on its behalf when the response time is not good and there is not enough learning knowledge gathered yet. That is the reason for the short peaks during the first days. Note that these peaks also happen with the CC6 approach. Due to the high load, the actuation needs to be done in a proactive way; otherwise, even by not sharing any of the cores (as CC6 does), the VM needs some time to work through the already queued requests. We can see in Figure 8c that, even though the response time was already good during the second day, it improves further during the third day by sharing a slightly smaller amount of cores, in favour of a more constant and predictable behavior. If we analyze the sharing behavior during the third day for both FQL agents, i.e., for the RUBiS and RUBBoS applications (see Figure 8c and Figure 9b), and compare it with the behavior obtained for CC6 (see Figure 7c and Figure 9a), we can appreciate that, thanks to the RUBBoS FQL agent sharing a smaller amount of cores, the RUBBoS application presents a better behavior than before. What is more, as both FQL agents become aware of the system evolution over time, the RUBiS FQL agent is able to share a larger amount of cores (remarkably different compared with CC6 for RUBiS during the third day – Figure 9a) as the RUBiS VM is more easy
between them are noticeable, with an overall reduction of to control – more linear behavior regarding the number of
the response times, specially towards the end coinciding with request per second than the RUBBoS VM. Consequently, the
[Figure panels (two rows of three): Average Response Time, 95th Percentile Response Time, and Shared Cores vs. Time (min.)]
controllers learned that it was better to be more aggressive in the sharing policy for RUBiS and more conservative for RUBBoS, as performance deviations in the former can be fixed quickly just by removing a few shared cores.

All in all, a much more stable sharing for both FQL agents, with fewer and smaller fluctuations, is achieved. Additionally, as depicted in Figure 10, the FQL approach is also able to share more cores over time, especially reducing the amount of time during which no cores were shared for the RUBBoS VM (from 30% of the time to just 10%), enabling the option to accept more VMs and therefore taking better advantage of the available resources.

In addition to the learning process highlighted by the RUBBoS response times, Figure 11 shows the differences between the amounts of cores shared by the capacity controller over the complete three days, for both RUBiS and RUBBoS, highlighting the FQL agents' learning over time. For RUBiS, as the performance was good all the time, there are no large variations among days. However, for RUBBoS we can see that the controller clearly shares fewer cores during the last day (mainly due to the peak during the last part of the day), based on the knowledge acquired during the previous days, leading to an improved performance as shown in Figure 8c. The aggregated number of cores shared by both FQL agents can be seen in Figure 11c, where the boxplot clearly highlights the smaller variation in the overall number of cores shared over time, as well as the higher average. Additionally, the evolution over time is depicted in Figure 12, where the smaller fluctuations over time are also clearly visible. Once again, there is statistical proof of both FQL agents sharing more cores than the CC6 approach, with Wilcoxon p-values of 1.98e-07 and 2.2e-16 for RUBiS and RUBBoS, respectively. Additionally, a 35% reduction in the standard deviation of the amount of cores shared over time was achieved by our FQL controller.

As shown in the previous figures, the self-learning and self-adaptive FQL agents are able to share more cores, and in a more stable way over time, even without knowledge about the status of the infrastructure (i.e., the server). Consequently, the FQL-based capacity controller presents a more efficient sharing mechanism, where more cores are shared and, in turn, more VMs can be accepted. This is highlighted in Figures 13a and 13b, where a histogram of the number of sudoku VMs concurrently running is depicted. The histogram for FQL is shifted to the right compared to the CC6 approach. Only the first bars are larger for CC6, which means there are fewer concurrently running sudoku VMs during longer periods of time. By contrast, all the bars for the higher amounts
[Panels: Shared Cores vs. Normalized Time, for the 1st, 2nd, and 3rd days]
(a) RUBiS (FQL) (b) RUBBoS (FQL) (c) Aggregated cores shared
Figure 11: Cores Sharing: Learning over time.
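The day-by-day changes summarized in Figure 11 stem from the value updates the learning controller performs at each step. As a minimal sketch of the tabular Q-learning update that FQL builds on (the paper's controller additionally aggregates states through fuzzy membership functions per [11], which is omitted here; the states, actions, rewards, and hyperparameters below are illustrative assumptions, not the paper's settings):

```python
ALPHA, GAMMA = 0.1, 0.9   # assumed learning rate and discount factor

def q_update(q, state, action, reward, next_state, actions):
    """Standard Q-learning temporal-difference update on a dict-backed table."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + ALPHA * (reward + GAMMA * best_next - old)

q = {}
actions = [0, 2, 4, 6, 8]   # candidate numbers of cores to share (illustrative)
# Repeated experience that sharing 4 cores during the daily peak keeps the
# response time good gradually raises that action's learned value.
for _ in range(50):
    q_update(q, "day_peak", 4, reward=1.0, next_state="day_peak", actions=actions)
```

After enough repetitions of the same experience, the preferred action's value dominates the alternatives, which is the mechanism behind the "fewer cores during the last day" behavior observed for RUBBoS.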
[Figure panels: Response Time (ms) and Shared Cores over time, CC6 vs. FQL; frequency histograms of the number of Sudoku VMs]
(a) Histogram CC6 (b) Histogram FQL (c) Number of concurrent sudoku VMs over time
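The p-values reported throughout this evaluation come from paired Wilcoxon signed-rank comparisons of the kind described earlier. A minimal sketch of that computation follows, using the normal approximation without small-sample or tie corrections, and with made-up response-time samples rather than the paper's measurements (a standard statistical package would normally be used instead):

```python
import math

def wilcoxon_signed_rank(x, y):
    """One-sided paired Wilcoxon signed-rank test (normal approximation).

    Illustrative sketch only: tests whether x tends to exceed y.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]  # drop zero differences
    n = len(diffs)
    # Rank the absolute differences, averaging ranks over ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    # One-sided p-value: upper-tail normal probability of z.
    return w_plus, 0.5 * math.erfc(z / math.sqrt(2))

# Illustrative paired response-time samples (ms); NOT the paper's data.
cc6 = [412, 398, 455, 430, 470, 401, 444, 460, 415, 438]
fql = [395, 380, 420, 405, 430, 390, 410, 425, 400, 415]
w, p = wilcoxon_signed_rank(cc6, fql)
print(f"W+ = {w}, one-sided p = {p:.4f}")
```

Because the test only uses the ranks of the paired differences, it makes no normality assumption about the response-time distributions, which is why it suits the heavy-tailed latencies compared here.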
[Figure legend: CC6, FQL, Sudoku BestEffort]

Although resource overbooking has been previously studied [25], there are still no complete solutions that make … and/or avoiding VM interference, such as [4], [27], [28], or [29]. However, their solutions are either based on detecting