Contents lists available at ScienceDirect
Future Generation Computer Systems
journal homepage: www.elsevier.com/locate/fgcs
A GSA based hybrid algorithm for bi-objective workflow scheduling in cloud computing
Anubhav Choudhary, Indrajeet Gupta, Vishakha Singh, Prasanta K. Jana *,1
Department of Computer Science and Engineering, Indian Institute of Technology (ISM), Dhanbad, India
Highlights
• Proposed an efficient hybrid scheme of GSA and HEFT, called HGSA, for workflow scheduling.
• Systematic derivation of a fitness function based on makespan and cost.
• Novel and proficient elimination strategy for inferior agents.
• Demonstration of better performance through simulation results and the statistical test ANOVA.
Article info
Article history:
Received 28 February 2017
Received in revised form 12 October 2017
Accepted 3 January 2018
Available online 8 January 2018
Keywords:
Gravitational Search Algorithm
Workflow scheduling
Cost
Makespan
Cost time equivalence
Abstract
Workflow scheduling in cloud computing has drawn enormous attention due to its wide application in both scientific and business areas. It is an NP-complete problem. Therefore, many researchers have proposed a number of heuristic as well as metaheuristic techniques by considering several issues, such as energy conservation, cost and makespan. However, it is still an open area of research, as most heuristics or metaheuristics may not fulfill certain optimality criteria and only produce near-optimal solutions. In this paper, we propose a metaheuristic based algorithm for workflow scheduling that considers minimization of makespan and cost. The proposed algorithm is a hybridization of the popular metaheuristic, Gravitational Search Algorithm (GSA), and the equally popular heuristic, Heterogeneous Earliest Finish Time (HEFT), to schedule workflow applications. We introduce a new factor called cost time equivalence to make the bi-objective optimization more realistic. We consider monetary cost ratio (MCR) and schedule length ratio (SLR) as the performance metrics to compare the proposed algorithm with existing algorithms. With rigorous experiments over different scientific workflows, we show the effectiveness of the proposed algorithm over the standard GSA, the Hybrid Genetic Algorithm (HGA) and HEFT. We validate the results by the well-known statistical test, Analysis of Variance (ANOVA). In all the cases, simulation results show that the proposed approach outperforms these algorithms. © 2018 Elsevier B.V. All rights reserved.
1. Introduction
Workflows have wide applications in business as well as in scientific areas such as astronomy, weather forecasting, medicine and bioinformatics. Generally, these workflows are vast in size as they consist of a large number of independent and/or dependent tasks, and thus they demand huge infrastructure for their computation, communication, and storage. Clouds [1] provide such an infrastructure in order to execute the workflow on virtualized
* Corresponding author.
E-mail addresses: anubhav.choudhary@live.com (A. Choudhary), indrajeet7830@gmail.com (I. Gupta), vs.make.a.vish@gmail.com (V. Singh), prasantajana@yahoo.com (P.K. Jana).
1 IEEE Senior Member.
0167-739X/© 2018 Elsevier B.V. All rights reserved.
resources which are provisioned dynamically. However, the allocation of the resources and the order in which the tasks of a given workflow will be executed are matters of great importance. This is commonly referred to as the workflow scheduling problem. In fact, workflow scheduling is an NP-complete problem which has been extensively studied for other paradigms, such as grid and cluster computing. It is noteworthy that if there are n tasks in a workflow and m available virtual machines (VMs), then there exist m^n different ways in which the tasks can be mapped to the VM pool. For large values of n and m, finding an optimal solution by a brute-force approach is computationally very expensive. Therefore, a metaheuristic approach can be very effective for solving this problem. However, every metaheuristic algorithm has its own merits and demerits. Hybridization of such metaheuristic approaches has been shown to produce better results
A. Choudhary et al. / Future Generation Computer Systems 83 (2018) 14–26
[2,3] and therefore this has become a recent trend of research in cloud computing. Heterogeneous Earliest Finish Time (HEFT) [4] is an efficient heuristic proposed for task scheduling in heterogeneous multiprocessor systems, which is also used for cloud computing [5–7]. This algorithm maps each task, arranged in a priority order, to the VM for which the earliest finish time is minimum. It should be noted that it is essentially a single-objective algorithm which can only optimize the makespan. The Gravitational Search Algorithm (GSA) [8] is
a popular metaheuristic approach which utilizes the concept of
the law of gravitation to find a near-optimal solution. The algorithm starts with a set of random particles, where each particle represents a solution and the mass of each particle is calculated using a fitness function based on the application. A particle with a higher fitness value has a higher mass and, hence, can exert more force to attract other particles towards it. Eventually, all the particles converge towards an optimal point. GSA is capable of obtaining a global optimum faster than other metaheuristic algorithms and, hence, has a higher convergence rate. Moreover, it provides better results than Central Force Optimization (CFO) and Particle Swarm Optimization (PSO), as demonstrated in [8].

In this paper, we propose a metaheuristic based algorithm for the workflow scheduling problem which is a hybrid of HEFT and GSA. Specifically, we address the following workflow scheduling problem. Given a workflow consisting of a set of tasks T = {t_1, t_2, …, t_n} with their computational load and precedence constraints, and also given a set of VMs, V = {v_1, v_2, …, v_m}, our objective is to map all the tasks to the available VMs so that the entire workflow can be executed in minimum time and at minimum computational cost. The proposed algorithm is presented with an efficient agent representation and a systematic derivation of the fitness function. The algorithm is extensively simulated using scientific workflows of different sizes and is shown to produce better results as compared to other related algorithms such as the Hybrid Genetic Algorithm (HGA), GSA and HEFT. We use ANOVA [9], a statistical test, to validate the simulation results. This test determines whether a given result set differs significantly, in the statistical sense, from another set of results.

Many algorithms have been proposed for workflow scheduling in cloud computing which are based on metaheuristic approaches. For instance, Rodriguez et al. [10] have proposed a PSO based algorithm with the objective of minimizing the execution cost while meeting deadline constraints. Similarly, an HGA has been presented in [3] that also has a single objective, i.e., minimizing the makespan. Many other metaheuristic approaches have been developed for workflow scheduling, a survey of which can be found in [11]. However, our approach is different from all such approaches and has the following novelty. We consider parameters such as communication bandwidth, output data size of each task, VM boot time, VM shutdown time and performance variability of VMs in order to create a more realistic environment for scheduling. Most of these features are absent in the existing works. Moreover, our approach deals with two objectives, in contrast to the single objective in many existing algorithms. The hybridization of GSA with HEFT is also novel in the sense that it fully exploits the benefits of both algorithms. Our contribution can be summarized as follows.
• A hybrid algorithm based on GSA and HEFT to minimize makespan and total computational cost.
• An efficient agent representation and a systematic derivation of the fitness function.
• Introduction of cost time equivalence and a procedure for eradicating the inferior agents.
• Demonstration of better performance of the algorithm through extensive simulation and comparison with other heuristic/metaheuristic based approaches.
• Validation of the performance through the statistical test ANOVA.
The rest of the paper is organized as follows. Related works are stated in Section 2. Section 3 explains the application and cloud model. Section 4 describes the terminologies used in the paper and the problem statement. Section 5 presents the proposed work with an illustration. Performance metrics, experimental results, and comparison are discussed in Section 6 followed by Section 7 which concludes the paper.
2. Related works
Many heuristic and metaheuristic based algorithms have been proposed for workflow scheduling in cloud computing. In this section, we present a short review of some of the works that are relevant to our proposed scheme. HEFT [4] is a popular heuristic which was initially developed for task scheduling in heterogeneous multiprocessor systems. It is well known that HEFT performs better than many other heuristics, such as [12,13], for task scheduling. However, it considers the minimization of makespan only. An extension of HEFT called Pareto Optimal Scheduling Heuristic (POSH) was proposed by Su et al. [5] for workflow scheduling in the cloud, to minimize the makespan and cost of execution. POSH produces acceptable solutions. Nevertheless, the solutions are derived from a constricted search space and thus it may miss better solutions. An energy-efficient scheduling scheme with a deadline constraint for heterogeneous cloud environments was proposed in [14]. In this work, a new VM scheduler is developed which is shown to reduce energy consumption in the execution of workflows. The authors claimed to achieve up to a 20% reduction in energy requirement and an improvement in processing capacity of 8%. Fard et al. [15] proposed another heuristic called multi-objective list scheduling (MOLS), which provides a general framework for multi-objective static workflow scheduling. It supports four objectives, namely makespan, cost, reliability, and energy, and provides the execution plan based on the selected objectives. Abrishami et al. [16] adopted the Partial Critical Path (PCP) approach for workflow scheduling and designed two algorithms, a one-phase algorithm called IaaS Cloud Partial Critical Paths (IC-PCP) and a two-phase algorithm called IaaS Cloud Partial Critical Paths with Deadline Distribution (IC-PCPD2). Here, a homogeneous cloud environment is assumed. Recently, Casas et al.
[17] have proposed the balanced and file reuse-replication scheduling (BaRRS) algorithm to schedule workflows, based on two optimization constraints, i.e., makespan and cost. They have also focused on finding the optimal number of VMs required for a given workflow. However, it has a large computational overhead. Panda et al. [18] have developed a normalization based task scheduling for a heterogeneous multi-cloud environment. This technique provides a way to schedule tasks over multiple cloud providers. In another work [19], they have proposed a modification of the min–min algorithm with an uncertainty parameter for scheduling tasks in a heterogeneous multi-cloud environment. Gupta et al. [20] have also reported a workflow scheduling algorithm for the multi-cloud environment. However, this work focuses more on compute-intensive workflows. Metaheuristics are well-known techniques to obtain near-optimal solutions. For instance, Pandey et al. [21] proposed a PSO based workflow scheduling algorithm for cost optimization. It is designed to consider computational cost and data transmission cost to provide an execution plan such that the overall cost is minimized. However, this approach has been tested only on limited workflow applications. Jena et al. [22] proposed a multi-objective nested Particle Swarm Optimization (TSPSO) algorithm for workflow scheduling to optimize energy as well as processing time.
Fig. 1. An example of a workflow: nodes represent the tasks and the edges represent precedence relations. The numeric value 7 inside the node t_2 is the computational load of the task t_2 and the edge label 11 is the size of the data generated by t_2.
However, a crucial scheduling factor, i.e., the bandwidth between the VMs, is not considered in this work. Moreover, no experiment was reported on large-scale workflows. Many GA based solutions have also been proposed. Garg et al. [23] proposed a hybrid GA driven by linear programming (LP) for cost optimization in grid computing. This work combines the capabilities of both LP and GA to find a schedule that minimizes the combined cost of all users of the grid. Wang et al. [24] proposed a look-ahead genetic algorithm (LAGA) to optimize both reliability and makespan. But this work focused only on compute-intensive workflows and did not consider the communication time, which is a vital factor for scheduling workflows.
3. Models
This section contains a detailed description of the workflow application model and the cloud server model assumed for the development of the proposed algorithm. The important terminologies used throughout the paper are also described in this section.
3.2. Cloud server model
We assume a cloud server which contains a set of m VMs, represented by V = {v_1, v_2, v_3, …, v_m}. Each VM has its own computational power measured in terms of million instructions per second (MIPS). All VMs are fully connected to each other and may reside in one or more physical cloud servers. The time required to transfer output data from task t_i to t_j is described as the communication overhead time. Note that the communication overhead time is the ratio of the output data size of a task t_i to the bandwidth between the VMs. If both t_i and t_j execute on the same VM, then the communication overhead is assumed to be zero.
4. Terminologies and problem statement
4.1. Constraints and assumptions
The notations used in the proposed work are given in Table 1. The proposed algorithm considers the following constraints and assumptions, similar to the work presented in [10].
1. We consider performance variance for the VMs to calculate the effective CPU cycles for the execution of the tasks. The reason behind this variability is the heterogeneity and shared nature of the underlying hardware infrastructure. Based on a survey [27], the overall performance variability of Amazon's EC2 cloud is found to be 24%. For the VM v_j, the performance variance is represented as deg_{v_j}. Thus, using deg_{v_j}, the execution time of task t_i on VM v_j can be written as

ET_{t_i}^{v_j} = Load(t_i) / (Capacity(v_j) × (1 − deg_{v_j}))    (1)

where ET_{t_i}^{v_j} is the execution time of task t_i on VM v_j, Load(t_i) is the computational load of task t_i and Capacity(v_j) is the computational capacity of VM v_j.
2. The unit chargeable time τ is considered to calculate the cost of execution. If we utilize a leased VM for a time less than τ, it is still charged for the full time period.
3. An initial boot time is always required when a VM is leased. So, we consider the VM boot time in the calculation of the makespan. We also consider the VM shutdown time, as it is required to release the provisioned VM.
4. Each pair of VMs is assumed to be connected with roughly the same bandwidth.
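Eq. (1) can be sketched as a small helper function; the function name and the sample numbers below are our own illustrative choices (the 24% degradation echoes the EC2 figure cited above):

```python
def execution_time(load_mi: float, capacity_mips: float, degradation: float) -> float:
    """Eq. (1): execution time of a task with computational load `load_mi` (MI)
    on a VM of capacity `capacity_mips` (MIPS) whose performance is degraded
    by the fraction `degradation` (deg_vj in [0, 1))."""
    return load_mi / (capacity_mips * (1.0 - degradation))

# A 10,000 MI task on a 1,000 MIPS VM with 24% performance degradation.
t = execution_time(10_000, 1_000, 0.24)
```

Note how a 24% degradation inflates the nominal 10 s execution time to roughly 13.2 s.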
3.1. Workflow application model

The workflow application is represented by a Directed Acyclic Graph (DAG) [25,26], G = (T, E), as shown in Fig. 1, where T = {t_1, t_2, …, t_n} is the set of tasks and E is the set of edges. An edge t_i → t_j indicates the precedence relation between the predecessor t_i and the successor t_j. Thus, task t_j cannot start unless task t_i is complete. Each task t_i is labeled with its computational load in million instructions (MI). Also, the label on each edge t_i → t_j indicates the size of the output data generated by t_i. This data is required to start the execution of the task t_j. The task without any predecessor is termed the entry task (t_entry) and the task without any successor is termed the exit task (t_exit). If there is more than one entry task in the workflow, a new pseudo entry task is created with zero computational load and no output data. Then all the entry tasks are connected with the pseudo task so formed. Similarly, a pseudo exit task can be created, if required.
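The pseudo-entry construction described above can be sketched as follows; the dictionary-based DAG encoding and all task names are our own illustrative choices:

```python
def add_pseudo_entry(succ, load):
    """succ: task -> list of successor tasks; load: task -> computational load (MI).
    If more than one task has no predecessor, add a zero-load pseudo entry task
    with zero-size output edges preceding every original entry task."""
    tasks = set(succ) | {t for ss in succ.values() for t in ss}
    has_pred = {t for ss in succ.values() for t in ss}
    entries = sorted(tasks - has_pred)
    if len(entries) > 1:
        succ["t_entry"] = entries   # pseudo task precedes all entry tasks
        load["t_entry"] = 0         # zero computational load, no output data
    return succ, load

# Two entry tasks t1 and t2 both feeding t3.
succ = {"t1": ["t3"], "t2": ["t3"], "t3": []}
load = {"t1": 7, "t2": 5, "t3": 9}
succ, load = add_pseudo_entry(succ, load)
```

A pseudo exit task could be added symmetrically by mirroring the same check on successors.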
4.2. Problem formulation
For the sake of the bi-objective problem formulation, we first describe the two important parameters, i.e., makespan and cost, as follows.
Makespan Calculation:
Makespan is also referred to as the total execution time of the entire workflow. It includes the boot time of the leased VM, the execution time, the data transfer time between two VMs and the shutdown time of the VM. While computing the makespan, it is assumed that a VM cannot execute a task while data is being transferred from it to another VM. The makespan is equal to the summation of the VM boot time, the maximum of VMtime over all VMs and the VM shutdown time. VMtime[i] denotes the last timestamp (starting from zero for each workflow) up to which the VM v_i executes its assigned tasks. We need to add the VM boot time and the VM shutdown time only once, because only the boot time of the first VM and the shutdown time of the last VM contribute to the makespan; the rest are overlapped
Table 1
Notations and definitions.

Notation — Definition
N — Population size.
n — Number of tasks in a given workflow.
t_i — The ith task.
m — Number of available VMs.
v_j — The jth VM.
X_i — The ith agent.
X_best — The best agent known so far.
α — Determines the weight of makespan and cost in the fitness calculation.
β — The cost makespan equivalence factor. It is a part of the SLA and its value depends on the priority and urgency of the application.
fit_i — Fitness value of the ith agent based on its makespan and cost.
M_i — Mass of the ith agent.
σ — A random variable used in the pricing model.
V_cbase — Base price based on the slowest VM.
deg_{v_j} — Performance degradation of VM v_j.
γ — A small constant which regulates the declination of the gravitational constant.
δ — Threshold mass for replacing the inferior agents.
with other events. Therefore, makespan can be mathematically formulated as follows.
Makespan = VMboottime + max_{i=1,…,m} (VMtime[i]) + VMshutdowntime    (2)
Cost Calculation:
In our assumed model, a cloud server consists of VMs with varying computational capacity for different types of workload.
ET_{t_i}^{v_j} is the execution time of the task t_i on VM v_j as defined in Eq. (1). Let τ be the unit chargeable time for which the charge of execution of any task will be accounted. Let V_cbase be the base price charged for the slowest VM; then, as per the exponential pricing model [5], the cost of execution of the task t_i on VM v_j is denoted as cost(t_i, v_j) and is formulated as follows.

cost(t_i, v_j) = σ × ⌈ET_{t_i}^{v_j} / τ⌉ × V_cbase × exp(CPU cycles of v_j / slowest CPU cycle)    (3)
where σ is a random variable used to generate different combinations of VM pricing and capacity. Let B_{i,j} be a boolean variable, such that

B_{i,j} = 1 if task t_i is assigned to v_j, and 0 otherwise    (4)
Therefore, the total cost of execution for a workflow is defined as
Total Cost =
n
m
∑ ∑
i =1 j =1
B _{i}_{,} _{j} × cost(t _{i} , v _{j} )
(5)
However, the values of makespan and cost may have different scales, so the value of one attribute may overwhelm the value of the other during agent evaluation. Thus, it is not valid to perform a linear formulation directly using the actual values. One intuitive approach is to normalize both makespan and cost in order to scale these values to the same range. Normalization can be done using any of the well-known methods such as min–max normalization, Z-score normalization, etc. However, this solution has a major drawback, i.e., if the global minimum and maximum values of makespan and cost change, the relative rank of the agents may vary between two consecutive iterations. To resolve this issue, we use the makespan equivalent of total cost (ME_cost), calculated using Eq. (6), instead of the total cost.
ME_cost = β × Total Cost    (6)

Problem Statement:
The objective of the proposed work is to minimize the makespan and the makespan equivalent of total cost as given in Eqs. (2) and (6) respectively. Therefore, it is wise to minimize their linear combination. The workflow scheduling problem can be formulated as follows.

Minimize   z = α × Makespan + (1 − α) × ME_cost    (7)
subject to
(i) Σ_{j=1}^{m} B_{i,j} = 1, for i = 1, 2, 3, …, n
(ii) 0 ≤ α ≤ 1
The constraint (i) indicates that any task of the workflow can be assigned to one and only one VM and the constraint (ii) limits the range of α which balances makespan and total cost.
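The objective in Eq. (7) can be sketched directly; with the agent encoding used later in the paper (one VM index per task), constraint (i) holds by construction, so only constraint (ii) needs an explicit check. All numeric values below are illustrative:

```python
def objective(makespan: float, total_cost: float, alpha: float, beta: float) -> float:
    """z = alpha * Makespan + (1 - alpha) * ME_cost, Eq. (7),
    with ME_cost = beta * Total Cost from Eq. (6)."""
    assert 0.0 <= alpha <= 1.0, "constraint (ii)"
    me_cost = beta * total_cost          # express cost on the makespan scale
    return alpha * makespan + (1.0 - alpha) * me_cost

# Equal weighting of makespan and cost-equivalent (alpha = 0.5).
z = objective(makespan=120.0, total_cost=40.0, alpha=0.5, beta=1.5)
```

Setting alpha = 1 reduces z to pure makespan minimization, while alpha = 0 optimizes cost alone.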
5. Proposed work
As our algorithm is a hybrid of GSA and HEFT, we first provide
a brief description of both of these algorithms as follows.
5.1. Overview of heterogeneous earliest finish time (HEFT)
The main idea behind HEFT [4] is to schedule tasks in such a way that the earliest finish time (EFT) is minimized for all the tasks. HEFT is executed in two phases, which are described as follows.
Phase 1: Calculating priority of tasks. In this phase, the priority of each task is calculated using the average execution time and the average communication time. The priorities are calculated in a bottom-up approach. The sequence of tasks is generated from higher to lower priority value, satisfying the precedence constraints of the given workflow. The priority of task t_i is given by

pri(t_i) = w_i + max_{t_j ∈ succ(t_i)} (c_{i,j} + pri(t_j))    (8)

where w_i is the average execution time of task t_i on the available VMs and c_{i,j} is the average communication time between task t_i and task t_j.

Phase 2: Mapping tasks to VMs. This is the main phase of the algorithm, where the actual mapping of tasks to VMs is performed according to the priority of the tasks. The task with the highest priority is scheduled first, by calculating the earliest start time (EST) and EFT on all available VMs.
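Phase 1 can be sketched as a recursive computation of Eq. (8); the three-task DAG and all numeric values are our own illustrative choices:

```python
from functools import lru_cache

# w: average execution time per task; c: average communication time per edge;
# succ: successor lists of the DAG (all values illustrative).
w = {"t1": 10, "t2": 8, "t3": 6}
c = {("t1", "t2"): 3, ("t1", "t3"): 4}
succ = {"t1": ["t2", "t3"], "t2": [], "t3": []}

@lru_cache(maxsize=None)
def pri(t):
    """pri(t_i) = w_i + max over successors t_j of (c_ij + pri(t_j)), Eq. (8).
    Exit tasks (no successors) get priority w_i."""
    if not succ[t]:
        return w[t]
    return w[t] + max(c[(t, j)] + pri(j) for j in succ[t])

# Schedule tasks in decreasing priority order (bottom-up "upward rank").
order = sorted(succ, key=pri, reverse=True)
```

Here pri("t1") = 10 + max(3 + 8, 4 + 6) = 21, so t1 is scheduled first, respecting precedence.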
Fig. 2. Working of GSA.
Table 2
Example of a mapping / agent.

Task:  t_1  t_2  t_3  t_4  t_5  t_6  t_7  t_8
VM:    1    3    2    1    5    1    3    1
5.3. Agent representation
In the proposed algorithm, a mapping of the tasks of a given workflow to the VMs is considered as an agent. For a workflow having n tasks and a cloud server having m VMs, the ith agent can be represented as a vector of dimension 1 × n, i.e.,
X_i = [x_i^1, x_i^2, x_i^3, …, x_i^n]    (9)

where x_i^d represents the VM assigned to the task t_d. Note that x_i^d is an integer which lies in the interval [1, m]. Table 2 shows an example of an agent for 8 tasks on 5 VMs. Here, task t_1 is mapped to VM v_1, i.e., task t_1 will be executed on VM v_1, while preserving the precedence constraints.
5.4. Fitness evaluation
For a given agent, we can always calculate its makespan and makespan equivalent of total cost using Eqs. (2) and (6) respectively. Let Makespan^i and ME_cost^i be the makespan and the makespan equivalent of total cost for the ith agent in the population. Now, the fitness value of the ith agent can be computed as

Fitness(fit_i) = 1 / (1 + α × Makespan^i + (1 − α) × ME_cost^i)    (10)
The fitness value calculated using Eq. (10) is an absolute fitness value. As its calculation does not require any information about the population, the relative difference in fitness value of two agents remains the same irrespective of the iteration in which they are present. Agents with lower makespan and lower cost have higher fitness values, and they can be considered as potential candidates for the final solution rather than agents with higher makespan or higher cost.
5.2. Overview of gravitational search algorithm (GSA)
The GSA is based on the law of gravity [28] and was introduced by Rashedi et al. [8]. It is a population-based search algorithm where each agent is considered as a particle (so, we use agent and particle interchangeably throughout the paper) and its fitness value is considered as its mass. Each particle represents a solution of a given problem. The main idea is that the heavier particles, i.e., superior solutions, do not move much as compared to lighter particles, i.e., inferior solutions. All particles apply force on every other particle. As each particle has some mass, its acceleration and velocity can be calculated using the net force. Using the calculated velocity, the new position of the particle can be found. When the algorithm terminates, the particle with the highest mass provides the near-optimal solution. The working of the GSA is depicted in Fig. 2. To use this algorithm in scheduling, we first identify the search space and the particle representation. Then we initialize the population randomly. Now, in each iteration, we calculate the fitness value of all particles using the fitness function defined as per the optimization constraints. Based on the best and worst particles identified, we calculate the mass to update the position of each particle. The gravity constant that is used to calculate the velocity and the position is also updated in each iteration. We repeat all the steps till the algorithm attains a certain termination criterion.
Remark. The problem of workflow scheduling is to minimize the linear combination of the makespan and the makespan equivalent of total cost as described in Section 4.2. As our fitness value is the reciprocal of this quantity, a higher fitness value is desirable.
In the proposed algorithm, we apply min–max normalization in order to scale the fitness values of all the agents in the population to the range [0, 1] to obtain the mass of each agent. So, the mass of the ith agent is given by

Mass(M_i) = (fit_i − min_{j=1,…,N} fit_j) / (max_{j=1,…,N} fit_j − min_{j=1,…,N} fit_j)    (11)

We use Eq. (11) for our proposed work, which is used in the simulations.
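Eqs. (10) and (11) can be sketched together; the makespan/cost pairs below are illustrative:

```python
def fitness(makespan: float, me_cost: float, alpha: float) -> float:
    """Absolute fitness of an agent, Eq. (10): reciprocal of the weighted objective."""
    return 1.0 / (1.0 + alpha * makespan + (1.0 - alpha) * me_cost)

def masses(fits):
    """Min-max normalize fitness values into [0, 1] to get agent masses, Eq. (11)."""
    lo, hi = min(fits), max(fits)
    return [(f - lo) / (hi - lo) for f in fits]

# Three agents as (makespan, ME_cost) pairs, alpha = 0.5.
fits = [fitness(m, c, 0.5) for m, c in [(100, 80), (120, 90), (90, 95)]]
M = masses(fits)  # best agent gets mass 1.0, worst gets 0.0
```

Because Eq. (10) is population-independent, only the masses (not the fitness ranking) depend on which other agents are present.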
5.5. Proposed scheduling technique
The proposed HGSA algorithm works in two phases. The first phase focuses only on the optimization of makespan. In the second phase, it attempts to optimize the cost while trying to maximize the fitness value, which is calculated from both makespan and cost. The result of the first phase guides the particle movement in the GSA that runs in the second phase of the proposed work. This improves the result as compared to GSA with random initial particles. We use GSA by incorporating HEFT with the following steps.
Step 1: The initial population is seeded with the output of the HEFT algorithm. The HEFT heuristic provides guidance to the GSA that improves the overall performance of the proposed algorithm. It helps to generate better solutions in a smaller number of iterations.

Step 2: The best particle identified from the current generation based on the fitness function is preserved. This is done to ensure that the best agent does not get degraded in future generations.

Step 3: The agents having mass less than the threshold mass (δ) are removed from the current population, as they have very little or no contribution to updating the population. In place of all the removed agents, new agents generated with the help of the best agent identified so far are added to the population. This improves the overall fitness of the population.
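The elimination rule of Step 3 can be sketched as follows; the helper name, the agent values and the threshold are our own illustrative assumptions:

```python
import random

def replace_inferior(population, masses, best, delta, n_vms):
    """Step 3: replace each agent whose mass is below the threshold delta
    with a copy of the best agent in which one randomly chosen task is
    remapped to a random VM (keeping diversity around the best solution)."""
    new_pop = []
    for agent, mass in zip(population, masses):
        if mass < delta:
            child = list(best)                           # clone the best agent
            pos = random.randrange(len(child))           # pick one task position
            child[pos] = random.randrange(1, n_vms + 1)  # remap it to a random VM
            new_pop.append(child)
        else:
            new_pop.append(agent)
    return new_pop

random.seed(1)
pop = [[1, 3, 2], [2, 2, 1], [3, 1, 1]]   # three agents, 3 tasks on 5 VMs
masses_ = [1.0, 0.05, 0.6]                # second agent falls below delta
pop = replace_inferior(pop, masses_, best=[1, 3, 2], delta=0.1, n_vms=5)
```

Only the inferior agent is regenerated; it differs from the best agent in at most one task-to-VM assignment.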
5.6. Position update of particle
Let us consider a system of N agents. We define the position of the ith agent as follows.
X_i = [x_i^1, x_i^2, x_i^3, …, x_i^n], for i = 1, 2, 3, …, N    (12)

where x_i^d shows the position of the ith agent in the dth dimension. Let M_i(k) and G(k) be the mass of the ith agent and the gravitational constant, respectively, in the kth iteration. We can define the force acting on the ith agent by the jth agent in the kth iteration as follows.
F_{i,j}^d(k) = G(k) × (M_i(k) × M_j(k)) / (R_{i,j}(k) + ϵ) × (x_j^d(k) − x_i^d(k))    (13)

where ϵ is a very small constant and R_{i,j}(k) is the Euclidean distance between the ith and jth agents in the kth iteration. R_{i,j}(k) is defined as

R_{i,j}(k) = ‖X_i(k), X_j(k)‖_2    (14)

We suppose that the total force that acts on the ith agent in the dth dimension is a randomly weighted sum of the forces exerted in the dth dimension by the other agents. Then,
F_i^d(k) = Σ_{j=1, j≠i}^{N} rand_j × F_{i,j}^d(k)    (15)

where rand_j is a random number that lies in the interval [0, 1]. By the law of motion [28], the acceleration of the ith agent in the dth dimension in the kth iteration is given by
a_i^d(k) = F_i^d(k) / M_i(k)    (16)
Furthermore, the next velocity of the agent is considered as a fraction of its current velocity added to its acceleration. Therefore, its position and velocity can be calculated as follows.
vel_i^d(k + 1) = rand_i × vel_i^d(k) + a_i^d(k)    (17)

x_i^d(k + 1) = x_i^d(k) + vel_i^d(k + 1)    (18)

The gravitational constant G_0 is initialized in the beginning and is reduced as the algorithm proceeds, in order to improve the search accuracy. G(k) is a function of the initial value G_0 and the iteration number k, defined as

G(k) = G_0 × (k_0 / k)^γ, γ < 1    (19)
where, γ is a small constant that regulates the reduction in the gravitational constant.
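The update rules (13)-(19) can be sketched for a continuous search space; the constants G0, gamma, eps and the two-agent example are our own illustrative choices, and velocities are cold-started at zero when none are supplied:

```python
import math
import random

def gsa_step(X, M, k, V=None, k0=1, G0=100.0, gamma=0.5, eps=1e-9):
    """One GSA iteration over continuous positions, following Eqs. (13)-(19).
    X: agent positions, M: agent masses, V: previous velocities (zeros if None).
    For workflow scheduling the updated coordinates would still have to be
    rounded back to valid VM indices in [1, m]."""
    N, n = len(X), len(X[0])
    G = G0 * (k0 / k) ** gamma                    # Eq. (19): decaying constant
    if V is None:
        V = [[0.0] * n for _ in range(N)]         # cold-start velocities
    for i in range(N):
        for d in range(n):
            F = 0.0
            for j in range(N):
                if j == i:
                    continue
                R = math.dist(X[i], X[j])         # Eq. (14): Euclidean distance
                # Eq. (13), randomly weighted and summed as in Eq. (15)
                F += random.random() * G * M[i] * M[j] / (R + eps) * (X[j][d] - X[i][d])
            a = F / (M[i] + eps)                  # Eq. (16); eps guards mass 0
            V[i][d] = random.random() * V[i][d] + a   # Eq. (17)
            X[i][d] = X[i][d] + V[i][d]               # Eq. (18)
    return X, V

random.seed(0)
X, V = gsa_step([[1.0, 2.0], [3.0, 1.0]], [1.0, 0.5], k=1)
```

In a full run, the returned velocities would be fed back into the next call so that Eq. (17) accumulates momentum across iterations.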
5.7. Algorithm
We start by generating the initial population by random mapping of the tasks onto the VMs in step 1 of the proposed algorithm, followed by seeding the result of HEFT into the population in step 2. Once the initial population is ready, a set of iterative steps is applied to each agent of the population to get the final result, as per step 3 through step 16. The first step of each iteration is to calculate the gravitational constant in step 4. Then, in step 5, we compute the fitness value of each agent using Eq. (10). Note that Eq. (10) requires the values of both cost and makespan; Algorithms 2 and 3 can be utilized for calculating these values. Based on the fitness values, we identify the best and the worst agents in step 6 for calculating the mass of all the agents in step 7. In step 8, we update the position of each agent by calculating the net force, net acceleration and velocity. In the remaining steps, we replace the inferior agents by new agents generated with the help of the best agent known so far. A new agent is generated by mapping one of the tasks to a randomly selected VM, while the rest of the mapping remains the same as that of the best agent.
Algorithm 1: Proposed Workflow Scheduling Algorithm
Input: Workflow Application (W) and Cloud Server Specification (CSS)
Output: Task mapping with VMs (M)
1   Initialize population X with N randomly generated agents.
2   Replace one of the agents by the mapping generated by HEFT.
3   for k = 1 to MAX_ITERATION do
4       Compute gravitational constant G(k) using Eq. (19)
5       Compute fitness value fit_i for i = 1, 2, 3, ..., N using Eq. (10)
6       Identify best and worst agents based on the calculated fitness values.
7       Compute mass M_i for i = 1, 2, 3, ..., N using Eq. (11)
8       Update velocity and position of each agent using Eq. (13) to (18)
9       for i = 1 to N do
10          if M_i < δ then
11              Pos = a random integer from interval [1, n]
12              x_i^d = x_best^d for d = 1, 2, 3, ..., n
13              x_i^Pos = a random integer from interval [1, m]
14          end if
15      end for
16  end for
17  Find M corresponding to the best agent based on fit_i for i = 1, 2, 3, ..., N
18  return M
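The inferior-agent elimination strategy (steps 9–15 of Algorithm 1) can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: agents are assumed to be lists of VM indices (one entry per task), masses are assumed normalized, and all identifiers are chosen for this sketch.

```python
import random

def replace_inferior(agent_masses, population, best_agent, n_tasks, n_vms, delta=0.1):
    """Replace agents whose mass falls below the threshold delta.

    Each replacement clones the best agent's task-to-VM mapping and
    remaps one randomly chosen task to a randomly chosen VM, as in
    steps 11-13 of Algorithm 1.
    """
    for i, mass in enumerate(agent_masses):
        if mass < delta:
            new_agent = list(best_agent)                      # x_i^d = x_best^d
            pos = random.randrange(n_tasks)                   # Pos in [1, n]
            new_agent[pos] = random.randrange(1, n_vms + 1)   # x_i^Pos in [1, m]
            population[i] = new_agent
    return population
```

The clone-then-perturb step keeps the new agent close to the best-known solution while still injecting diversity into the population.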
Algorithm 2: CostCalculation
Input: Workflow Application (W), Mapping (M), Cloud Server Specification (CSS)
Output: Cost value (Cost)
1   Set Cost = 0
2   for each task t_i ∈ W do
3       Execution_time ET_{t_i}^{v_M[i]} = Load(t_i) / (Capacity(v_M[i]) × (1 − deg_{v_M[i]}))
4       Unit_used = ⌈ET_{t_i}^{v_M[i]} / τ⌉
5       Rate_per_unit = σ × V_cbase × exp(CPU cycles of v_M[i] / slowest CPU cycle)
6       Cost = Cost + Unit_used × Rate_per_unit
7   end for
8   return Cost
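A minimal Python sketch of Algorithm 2, under the assumption that each VM is described by a dict with its capacity (MIPS), performance degradation and CPU cycles; τ is the billing interval and σ, V_cbase are the pricing constants of the cost model. All names and the data layout are illustrative, not from the paper's code.

```python
import math

def cost_of_schedule(loads, mapping, vms, tau=1.0, sigma=1.0, v_cbase=1.0):
    """Billing-unit cost of a task-to-VM mapping (Algorithm 2 sketch).

    loads:   list of task loads (instructions)
    mapping: mapping[i] = index of the VM assigned to task i
    vms:     list of dicts with 'capacity', 'deg' and 'cycles' keys
    """
    slowest = min(v["cycles"] for v in vms)
    total = 0.0
    for i, load in enumerate(loads):
        vm = vms[mapping[i]]
        # execution time on the assigned VM, degraded by its variance
        et = load / (vm["capacity"] * (1.0 - vm["deg"]))
        units = math.ceil(et / tau)               # billing units consumed
        rate = sigma * v_cbase * math.exp(vm["cycles"] / slowest)
        total += units * rate
    return total
```

Note that the ceiling in the unit count reflects the common cloud billing practice of charging per started interval.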
A. Choudhary et al. / Future Generation Computer Systems 83 (2018) 14–26
Algorithm 3: MakespanCalculation
Input: Workflow Application (W), Mapping (M), Cloud Server Specification (CSS)
Output: Makespan value (Makespan)
1   for each v_i ∈ V do
2       VM_time[i] = 0
3   end for
4   for each task t_i ∈ W in topological order do
5       if t_i.ParentCount ≠ 0 then
6           Parent_finishtime = max_{t_k ∈ pred(t_i)}(Task_actual_finish_time[k])
7       end if
8       if t_i.ChildCount ≠ 0 then
9           Transfer_time = 0
10          for each task t_j where t_j ∈ succ(t_i) and M[i] ≠ M[j] do
11              if output data of task t_i is not transferred to v_M[j] then
12                  Transfer_time = Transfer_time + t_i.Output_data_size / Bandwidth
13              end if
14          end for
15      end if
16      Execution_time ET_{t_i}^{v_M[i]} = Load(t_i) / (Capacity(v_M[i]) × (1 − deg_{v_M[i]}))
17      Actual_start_time = max(Parent_finishtime, VM_time[M[i]])
18      Task_actual_finish_time[i] = Actual_start_time + Execution_time + Transfer_time
19      VM_time[M[i]] = Task_actual_finish_time[i]
20  end for
21  Makespan = VM_boot_time + max_{v_i ∈ Cloud}(VM_time[i]) + VM_shutdown_time
22  return Makespan
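Algorithm 3 can be sketched in Python as below. This is an illustrative simplification, not the paper's code: tasks are assumed to be indexed in topological order, and the sketch charges one transfer per successor on a different VM, whereas Algorithm 3 transfers the output data at most once per destination VM.

```python
def makespan(loads, mapping, vms, succ, out_size, bandwidth=1.0,
             boot=0.5, shutdown=0.5):
    """Makespan of a schedule (Algorithm 3 sketch).

    loads:    task loads, indexed in topological order
    mapping:  mapping[i] = index of the VM assigned to task i
    vms:      list of dicts with 'capacity' and 'deg' keys
    succ:     succ[i] = list of successor task indices of task i
    out_size: out_size[i] = output data size of task i
    """
    n = len(loads)
    pred = [[] for _ in range(n)]          # derive predecessors from succ
    for i in range(n):
        for j in succ[i]:
            pred[j].append(i)
    vm_time = [0.0] * len(vms)             # per-VM availability time
    aft = [0.0] * n                        # actual finish time per task
    for i in range(n):
        parent_finish = max((aft[k] for k in pred[i]), default=0.0)
        # transfer cost for each child placed on a different VM
        transfer = sum(out_size[i] / bandwidth
                       for j in succ[i] if mapping[j] != mapping[i])
        vm = vms[mapping[i]]
        et = loads[i] / (vm["capacity"] * (1.0 - vm["deg"]))
        start = max(parent_finish, vm_time[mapping[i]])
        aft[i] = start + et + transfer
        vm_time[mapping[i]] = aft[i]
    return boot + max(vm_time) + shutdown
```

As in Algorithm 3, the transfer time is accounted for in the producing task's finish time, and VM boot and shutdown overheads bracket the schedule.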
Fig. 3. Workflow and cloud environment: (a) Montage workflow of 16 tasks; (b) cloud environment.
5.8. An illustration

Consider a Montage workflow [29,30] consisting of 16 tasks, T = {t_1, t_2, ..., t_16}, and a set of 4 VMs, V = {v_1, v_2, v_3, v_4}, as shown in Fig. 3. We have to schedule the workflow on the given VMs, which are fully connected to each other. The output of this illustration is a mapping of the given tasks onto the VMs that is optimized in terms of makespan and cost. Table 3 shows the parameters used in this illustration and Table 4 shows the initial population generated, as described in step 1 of the proposed algorithm.

Table 3
Parameters used in the illustration.

Parameter                                   Value
Number of VMs                               4
Computational power of all VMs              2.0, 3.5, 4.5 and 5.5 MIPS
Network bandwidth                           1 MBps
Boot time and shutdown time of VM           0.5 sec
Performance variance of VM                  24%
MAX_ITERATION                               10
Population size (N)                         100
Gravitational constant (G_0)                5
Weight of makespan and cost (α)             0.5
Cost time equivalence (β)                   1
Small constant used in gravity (γ)          0.3
Mass threshold for inferior agents (δ)      0.1
Small constant used in force (ϵ)            10

We now compute a schedule using HEFT for the given workflow; the resultant schedule is then included in the generated population. To compute the HEFT schedule, we need to find the priority of each task using Eq. (8). Then, starting from the highest priority, each task is mapped to the VM that minimizes its EFT. Table 5 shows the priorities as well as the mapping of the tasks to VMs. The makespan and cost of the schedule generated by HEFT are 36.57 sec and $50.31, respectively.

Fig. 4 demonstrates the process of including the mapping generated by HEFT into the initial population by replacing the shaded agent. The selection of the agent to be replaced is purely random. This completes step 2 of the algorithm. Now, the current population contains the HEFT-generated agent as well as the agents generated randomly. These agents are processed as described in step 3 through step 16, for a certain number of iterations. Table 6 shows the details of the best agent identified
Table 4
Initial population of agents.

Agent 1    4 1 1 2 3 4 4 2 3 1 2 3 2 1 1 4
Agent 2    2 3 1 1 1 2 3 4 4 1 1 2 1 3 3 3
...
Agent N    1 2 1 1 1 3 3 4 4 1 2 2 2 2 1 1
Fig. 4. Seeding of HEFT solution into population.
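The seeding step depicted in Fig. 4 amounts to overwriting one randomly chosen agent with the HEFT mapping. A minimal sketch, with all names illustrative:

```python
import random

def seed_heft(population, heft_mapping):
    """Replace a randomly chosen agent with the HEFT-generated mapping (step 2)."""
    idx = random.randrange(len(population))
    population[idx] = list(heft_mapping)
    return population
```

Seeding a known good heuristic solution into a random initial population is a common way to give a metaheuristic a strong starting point without sacrificing diversity.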
Table 5
Task priority and task mapping by HEFT.

Task    Priority    Virtual machine
t_1     107.71      3
t_2     103.37      2
t_3     107.74      4
t_4     103.24      4
t_5      94.38      2
t_6      94.44      3
t_7      90.26      2
t_8      90.20      4
t_9      90.05      3
t_10     90.08      2
t_11     90.06      4
t_12     90.08      4
t_13     77.89      4
t_14     77.57      4
t_15      2.63      4
t_16      0.28      4
in each iteration based on the calculated fitness value. The resultant schedule, shown in Table 7, has a makespan of 32.64 sec and a cost of $47.71.
6. Experimental results and comparison
This section presents the simulation results of the proposed algorithm and compares it with three workflow scheduling algorithms: the standard GSA based approach, HEFT and HGA. Note that, for the sake of comparison, we convert the single objective of the HGA (minimization of makespan) into a biobjective, keeping all the constraints the same as in the proposed algorithm.
6.1. Experimental setup
The simulations were carried out in C++ on an Intel(R) Core(TM) i5-2540M CPU at 2.60 GHz with 4 GB RAM, running on a Linux platform. The specifications of the cloud environment, as well as the parameters used for the evaluation of our proposed algorithm, are given in Table 8.
Table 6
Iteration wise specification of the best agent.

Iteration    Makespan     Cost         Fitness
1            36.575226    50.313644    2.250 × 10^−2
2            32.338966    49.307339    2.391 × 10^−2
3            32.536972    48.014904    2.422 × 10^−2
4            32.642235    47.711269    2.428 × 10^−2
5            32.642235    47.711269    2.428 × 10^−2
6            32.642235    47.711269    2.428 × 10^−2
7            32.642235    47.711269    2.428 × 10^−2
8            32.642235    47.711269    2.428 × 10^−2
9            32.642235    47.711269    2.428 × 10^−2
10           32.642235    47.711269    2.428 × 10^−2
6.2. Performance metrics
We normalize the makespan and monetary cost similar to that in [5] and call them the schedule length ratio (SLR) and monetary cost ratio (MCR) of tasks, as follows:

SLR = Makespan / Σ_{t_i ∈ CP} min_{j=1}^{m} {ET_{t_i}^{v_j}}    (20)

MCR = Total Cost / Σ_{t_i ∈ CP} min_{j=1}^{m} {cost(t_i, v_j)}    (21)

The denominator is the summation of the minimum execution time (respectively, monetary cost) of the tasks on the critical path (CP), without communication cost. For a given task graph, an algorithm that produces a scheduling plan with lower SLR and MCR values is more effective.

We also calculate the normalized fitness value for easy comparison and visualization of the overall quality of the results. We use max-normalization to normalize the absolute fitness value as calculated using Eq. (10). After applying normalization, the maximum value is mapped to one and the rest of the values lie in the interval (0, 1]. Mathematically, max-normalization is defined as

x̂_i = x_i / max_{j=1 to N}(x_j)    (22)

where x̂_i is the normalized value for x_i.
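Eq. (22) is a one-liner in practice. A minimal sketch (assuming all fitness values are positive, so the result lands in (0, 1]):

```python
def max_normalize(values):
    """Max-normalization: map the maximum value to 1, the rest into (0, 1]."""
    m = max(values)
    return [v / m for v in values]
```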
Table 7
Resultant schedule of Montage with 16 tasks.

Task    t_1  t_2  t_3  t_4  t_5  t_6  t_7  t_8  t_9  t_10  t_11  t_12  t_13  t_14  t_15  t_16
VM      3    1    4    4    2    3    1    1    3    2     4     4     4     4     4     4
Fig. 5. Scientific workflows: (a) CyberShake; (b) Epigenomics; (c) Inspiral; (d) Montage; (e) SIPHT. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
6.3. Dataset
The proposed algorithm is evaluated on various scientific workflows, as considered by Bharathi et al. [29] and Juve et al. [30]. These workflows are synthesized using the generator program provided by the Pegasus project [31]. It uses information gathered from actual executions of scientific workflows to generate a synthetic workflow, which is in turn a near approximation of a real workflow. We used CyberShake (I/O- and network-intensive), Epigenomics (both compute-intensive and network-intensive), Inspiral (compute-intensive), Montage (network-intensive) and SIPHT (I/O-intensive) in the simulation. We divide each type of workflow into three categories based on the number of constituent tasks, as shown in Table 9. Each workflow has characteristic features which play pivotal roles in the scheduling process; a detailed characterization of each workflow can be found in [29]. The topology of tasks in a given workflow is also a major criterion for scheduling. These workflows exhibit a variety of topological features, such as pipeline (yellow task nodes), data aggregation (red task nodes) and data partitioning (green task nodes), as shown in Fig. 5.
6.4. Result analysis and performance evaluation
In this subsection, we evaluate the performance of our proposed algorithm against the HEFT, standard GSA and the HGA with respect to makespan and monetary cost. The MCR, SLR and the normalized fitness, as defined in Section 6.2, are used as the performance metrics for the comparative analysis. Note that lower values of the SLR and MCR are desirable, as they indicate lower makespan and cost, respectively, whereas a higher value of the normalized fitness is preferred. We present the results obtained by using the same machine configuration, the same constraints, and the same set of workflow applications (of various sizes and types). Figs. 6–9 show the bar charts for the MCR, SLR and the normalized fitness value, comparing the HGSA, HGA, standard GSA, and the HEFT. From the figures, we observe that the MCR of the proposed HGSA is better than that of the HEFT, HGA and the GSA for all the workflow categories (small, medium and large). Thus, the proposed HGSA outperforms the others in terms of MCR for all the aforementioned workflows. We also observe that the SLR obtained by the proposed algorithm is much better than that of the GSA and the HGA. However, it is somewhat inferior to that of the HEFT. This is due to the fact
Table 8
Parameters used during the experiment.

Parameter                                   Value
Network bandwidth                           1 MBps
Boot time and shutdown time of VM           0.5 sec
Performance variance of VM                  24%
MAX_ITERATION                               200
Population size (N)                         500
Gravitational constant (G_0)                5
Weight of makespan (α)                      0.5
Cost time equivalence (β)                   50
Small constant used in gravity (γ)          0.3
Mass threshold for inferior agents (δ)      0.1
Small constant used in force (ϵ)            10
that the HEFT is a single-objective scheduling algorithm which focuses on makespan only. To calculate the normalized fitness value, we used two input parameters, namely α and β, as shown in Table 1. The normalized fitness value shows the overall quality as per the user requirement. From Fig. 9(a)–(c), we observe that the proposed HGSA performs better than the HEFT, HGA and the GSA. We get better results using the HGSA even in the case where its SLR is poorer than that of the HEFT, as the difference in cost is enough to compensate for the difference in makespan.
6.5. Analysis of variance (ANOVA)
We also conducted hypothesis testing using ANOVA [9]. It is a statistical method which compares the means of two or more groups to determine whether there is a significant difference among them. This test has a null hypothesis (H_0) and an alternate hypothesis (H_1), defined as
H_0: µ_1 = µ_2 = µ_3 = ... = µ_n    (23)

H_1: Means are not equal    (24)
During the test, if F_statistical < F_critical, we fail to reject the null hypothesis and conclude that all groups have the same mean. If F_statistical > F_critical, we reject the null hypothesis and accept the alternate hypothesis, from which we conclude that at least one group differs significantly from the others.
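The one-way ANOVA F statistic used here can be computed with the standard library alone; a minimal sketch (the function name and input layout are illustrative — in practice a library routine such as scipy's one-way ANOVA would typically be used):

```python
def one_way_anova(groups):
    """F statistic for a one-way ANOVA over a list of sample groups."""
    all_vals = [x for g in groups for x in g]
    grand = sum(all_vals) / len(all_vals)       # grand mean over all samples
    k, n = len(groups), len(all_vals)
    # between-groups sum of squares, weighted by group size
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    # within-groups sum of squares (deviation from each group's own mean)
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)           # df_between = k - 1
    ms_within = ss_within / (n - k)             # df_within = n - k
    return ms_between / ms_within
```

The resulting F statistic is then compared against the critical value for (k − 1, n − k) degrees of freedom, which is 3.35 at the 0.05 level for the (2, 27) configuration used in Tables 10–14.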
Fig. 6. Results for small sized workflows: (a) monetary cost ratio; (b) schedule length ratio.

Fig. 7. Results for medium sized workflows: (a) monetary cost ratio; (b) schedule length ratio.

Fig. 8. Results for large sized workflows.
Fig. 9. Comparison of normalized fitness: (a) small sized workflows; (b) medium sized workflows; (c) large sized workflows.
Table 9
Category of the workflow based on the number of tasks.

Number of tasks    Category
24 to 60           Small
100 to 400         Medium
800 to 2000        Large
The test was performed to compare the standard GSA, hybrid
GA and the hybrid GSA. In order to do this experiment, all three
algorithms were executed 10 times for each of the five scientific
Table 10
ANOVA using CyberShake workflow of 2000 tasks.
workflows of various sizes. Tables 10–14 show the results for each workflow of 2000 tasks. As can be seen, for all workflows we have F_statistical > F_critical. Thus, we reject the null hypothesis, and the means of all the groups are significantly different. This implies that the performance of HGSA is better and more consistent than that of HGA and GSA.
7. Conclusion
In this paper, we have presented a hybrid gravitational search algorithm for scheduling workflows, with the basic objective of reducing the makespan as well as the cost of execution. The efficiency
(a) Summary of input

Group    Count    Sum         Average     Variance
HGSA     10       3.51E−05    3.51E−06    3.44E−16
GSA      10       2.96E−05    2.96E−06    5.10E−17
HGA      10       3.17E−05    3.17E−06    2.01E−16

(b) ANOVA test result

Source of variation    SS          df    MS          F stat     P-value      F crit
Between groups         1.56E−12    2     7.81E−13    3925.06    5.278E−34    3.35
Within groups          5.37E−15    27    1.99E−16
Total                  1.57E−12    29
Table 11
ANOVA using Epigenomics workflow of 2000 tasks.

(a) Summary of input

Group    Count    Sum         Average     Variance
HGSA     10       3.60E−07    3.60E−08    5.90E−20
GSA      10       3.00E−07    3.00E−08    1.30E−20
HGA      10       3.10E−07    3.10E−08    5.90E−20

(b) ANOVA test result

Source of variation    SS          df    MS          F stat     P-value     F crit
Between groups         1.60E−16    2     7.80E−17    1788.54    2.00E−29    3.35
Within groups          1.20E−18    27    4.40E−20
Total                  1.60E−16    29

Table 12
ANOVA using Inspiral workflow of 2000 tasks.

(a) Summary of input

Group    Count    Sum         Average     Variance
HGSA     10       3.90E−06    3.90E−07    3.70E−18
GSA      10       3.50E−06    3.50E−07    4.50E−19
HGA      10       3.60E−06    3.60E−07    2.30E−18

(b) ANOVA test result

Source of variation    SS          df    MS          F stat     P-value     F crit
Between groups         9.60E−15    2     4.80E−15    2238.84    1.00E−30    3.35
Within groups          5.80E−17    27    2.10E−18
Total                  9.70E−15    29

Table 13
ANOVA using Montage workflow of 2000 tasks.

(a) Summary of input

Group    Count    Sum         Average    Variance
HGSA     10       8.30E−05