OPTIMIZATION FOR SPECULATIVE EXECUTION IN BIG DATA PROCESSING CLUSTERS
ABSTRACT:
A large parallel processing job can be delayed substantially if even one of its many tasks is
assigned to an unreliable or congested machine. To tackle this so-called straggler problem,
most parallel processing frameworks, such as MapReduce, have adopted strategies under which
the system speculatively launches additional copies of a task when its progress is abnormally
slow and idle resources are available. In this paper, we focus on the design of speculative
execution schemes for parallel processing clusters from an optimization perspective under
different loading conditions. For the lightly loaded case, we analyse and propose a cloning
scheme, the Smart Cloning Algorithm (SCA), which is based on maximizing the overall system
utility. We also derive the workload threshold under which SCA should be used for speculative
execution. For the heavily loaded case, we propose the Enhanced Speculative Execution (ESE)
algorithm, an extension of the Microsoft Mantri scheme, to mitigate stragglers. Our simulation
results show that SCA reduces the total job flowtime, i.e., the job delay or response time, by
nearly 6% compared with the speculative execution strategy of Microsoft Mantri. In addition, we
show that the ESE algorithm outperforms the Mantri baseline scheme by 71% in terms of job
flowtime while consuming the same amount of computation resources.
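
The straggler effect and the benefit of launching extra copies can be illustrated with a small
simulation. This is only a sketch under stated assumptions, not the SCA or ESE algorithm: task
durations are assumed to follow a heavy-tailed Pareto distribution, every task gets the same
fixed number of copies, and the function names (job_time, mean_job_time) are hypothetical.

```python
import random


def job_time(num_tasks, clones, rng, shape=2.0, scale=1.0):
    """Completion time of one job.

    Assumption: a job finishes when its slowest task finishes, and a task
    finishes as soon as its fastest copy does.
    """
    task_times = []
    for _ in range(num_tasks):
        # paretovariate(shape) returns heavy-tailed samples that are >= 1.
        copies = [scale * rng.paretovariate(shape) for _ in range(clones)]
        task_times.append(min(copies))
    return max(task_times)


def mean_job_time(num_tasks, clones, trials=2000, seed=0):
    """Average job completion time over many simulated jobs."""
    rng = random.Random(seed)
    total = sum(job_time(num_tasks, clones, rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    for clones in (1, 2, 3):
        print(clones, round(mean_job_time(50, clones), 2))
```

Under these assumptions, a single slow task dominates the job when there is only one copy per
task, while additional copies shorten the tail at the price of extra computation.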

#13/ 19, 1st Floor, Municipal Colony, Kangayanellore Road, Gandhi Nagar,
Vellore 6.
Off: 0416-2247353 / 6066663 Mo: +91 9500218218
Website: www.shakastech.com, Email - id: shakastech@gmail.com,
info@shakastech.com

Proposed System
This work is an attempt to combine job scheduling and speculative execution in the design of
redundancy algorithms for big data processing clusters. More importantly, we focus on two key
performance metrics: the average job flowtime and the overall system computation cost. By
utilizing the distribution information of the task service time, we build an optimization
framework to maximize the overall system utility. We then design two approximation algorithms
to tackle this optimization problem, the SCA algorithm and the ESE algorithm, corresponding
to the cloning-based and detection-based approaches respectively. To differentiate the
applicability of these two algorithms, we also categorize the cluster into lightly loaded and
heavily loaded cases and derive the cutoff threshold between these two operating regimes. As
future work, we will design speculative execution schemes for more complex jobs, which can have
additional task dependency constraints. In addition, we plan to characterize the theoretical
performance bounds of our proposed redundancy algorithms.
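
The trade-off between flowtime and computation cost can be sketched for a single task as
follows. This is a simplified illustration, not the utility function or optimization framework
of the paper: it assumes Pareto-distributed task service times, assumes all copies run until the
fastest one finishes, and uses a hypothetical linear utility with a hypothetical cost_weight
parameter.

```python
def expected_min_pareto(clones, shape, scale):
    """Mean of the minimum of `clones` i.i.d. Pareto(shape, scale) durations.

    The minimum of such variables is again Pareto with shape clones * shape,
    so its mean is k * scale / (k - 1) for k = clones * shape > 1.
    """
    k = clones * shape
    if k <= 1:
        raise ValueError("mean is infinite when clones * shape <= 1")
    return k * scale / (k - 1)


def utility(clones, shape, scale, cost_weight):
    """Hypothetical utility: negative latency minus weighted resource usage."""
    latency = expected_min_pareto(clones, shape, scale)
    # Assumption: every copy occupies a machine until the first copy finishes.
    cost = clones * latency
    return -latency - cost_weight * cost


def best_clone_count(shape, scale, cost_weight, max_clones=8):
    """Number of copies (1..max_clones) with the highest hypothetical utility."""
    return max(range(1, max_clones + 1),
               key=lambda c: utility(c, shape, scale, cost_weight))


if __name__ == "__main__":
    for weight in (0.1, 1.0, 10.0):
        print(weight, best_clone_count(shape=1.5, scale=1.0, cost_weight=weight))
```

In this toy model, a larger cost_weight stands in for scarcer resources and pushes the preferred
number of copies down, which loosely mirrors why the text treats cloning as suited to the
lightly loaded regime.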
Software specification:
Hardware Requirements:
System - I3 Processor 2.4 GHz
Speed - 2.1 GHz
RAM - 1 GB
Hard Disk - 80 GB
Mouse - Logitech
Monitor - 15 VGA Color

Software Requirements:
Operating System : Windows XP/7
Application Server : Tomcat 5.0/6.0
Front End : HTML, Java, JSP
Scripts : JavaScript
Server side Script : Java Server Pages
Database : MongoDB
Database Connectivity : Robomongo-0.8.5-i386
