HADOOP CLUSTERS
Dr G Sudha Sadasivam
Assistant Professor
Department of CSE
PSGCT
Introduction
A physical machine can host a number of smaller
virtual machines (VMs), each running a separate
operating system instance.
Challenges
Partitioning of a machine
Concurrent execution of multiple operating systems
Isolation of virtual machines from one another
Support for heterogeneous applications
Low performance overhead
Objective
Automation of creation and deletion of a virtual
cluster for hosting Hadoop using Xen
A large physical cluster can be simulated on a few
physical machines
Steps
Input the user configuration by editing configuration files.
Generate the user-specified number of VMs running
Hadoop.
Users can manage the Hadoop file system.
Users can submit jobs for each physical machine.
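The first two steps above can be sketched as follows. This is a minimal illustration, not the authors' tool: it assumes the user configuration reduces to a VM count plus a few per-VM settings, and it only renders the text of one Xen guest configuration file per VM. The class and function names, the image directory, the bridge name, and the VM naming scheme are all hypothetical, and boot settings (kernel or bootloader) are left out.

```python
from dataclasses import dataclass


@dataclass
class ClusterSpec:
    """User-supplied settings (all field names are hypothetical)."""
    num_vms: int
    memory_mb: int = 512
    vcpus: int = 1
    name_prefix: str = "hadoop-vm"
    image_dir: str = "/var/lib/xen/images"  # assumed location of disk images
    bridge: str = "xenbr0"                  # assumed network bridge name


def render_domu_config(spec: ClusterSpec, index: int) -> str:
    """Return the text of a Xen guest config file for the VM with this index."""
    name = f"{spec.name_prefix}{index}"
    lines = [
        f'name = "{name}"',
        f"memory = {spec.memory_mb}",
        f"vcpus = {spec.vcpus}",
        f"disk = ['file:{spec.image_dir}/{name}.img,xvda,w']",
        f"vif = ['bridge={spec.bridge}']",
    ]
    return "\n".join(lines) + "\n"


def render_cluster(spec: ClusterSpec) -> dict[str, str]:
    """Map each config file name to its contents, one entry per VM."""
    return {
        f"{spec.name_prefix}{i}.cfg": render_domu_config(spec, i)
        for i in range(1, spec.num_vms + 1)
    }


if __name__ == "__main__":
    configs = render_cluster(ClusterSpec(num_vms=3))
    for filename, text in configs.items():
        print(f"# {filename}")
        print(text)
```

Keeping the rendering separate from any call that actually starts a guest makes the generated files easy to inspect before a cluster is created.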
Steps in implementing
Enhancements
1. Providing a graphical console for monitoring and
managing the virtual cluster.
2. Creation and migration of virtual machines for the
purpose of load balancing.
3. Enabling snapshots of virtual machines for
checkpointing.
4. Providing an intelligent monitoring system that
detects the failure of a virtual machine in the
cluster and restarts that virtual machine,
increasing reliability.
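The monitoring enhancement in item 4 could look roughly like the sketch below. It is an assumption-laden illustration rather than the proposed system: the two hypervisor operations (listing running VMs and restarting one) are passed in as plain callables, so no real Xen command or API is implied, and all names are hypothetical.

```python
from typing import Callable, Iterable


def find_failed_vms(expected: Iterable[str], running: Iterable[str]) -> list[str]:
    """Return the expected VMs that are not currently running, sorted by name."""
    return sorted(set(expected) - set(running))


def monitor_once(
    expected: Iterable[str],
    list_running: Callable[[], Iterable[str]],
    restart: Callable[[str], None],
) -> list[str]:
    """Run one monitoring pass: detect failed VMs and restart each of them."""
    failed = find_failed_vms(expected, list_running())
    for name in failed:
        restart(name)
    return failed


if __name__ == "__main__":
    cluster = ["hadoop-vm1", "hadoop-vm2", "hadoop-vm3"]
    restarted: list[str] = []
    # Stand-ins for real hypervisor calls (assumptions for illustration only).
    failed = monitor_once(
        cluster,
        list_running=lambda: ["hadoop-vm1", "hadoop-vm3"],
        restart=restarted.append,
    )
    print("restarted:", failed)
```

A real monitor would call `monitor_once` periodically and would likely limit how often the same VM is restarted.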
7 Nodes
Data nodes: 6 virtual nodes
Name node: 1 physical node
7 Nodes
Data nodes: 1 physical node + 5 virtual nodes
Name node: 1 virtual node
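The two 7-node layouts above can be written down as data and checked, as in the following sketch. The `Layout` class, the constant names, and the host names are hypothetical. The only facts taken from the slides are the node counts and which nodes are physical or virtual.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    name_node_kind: str          # "physical" or "virtual"
    physical_data_nodes: int
    virtual_data_nodes: int

    @property
    def total_nodes(self) -> int:
        # One name node plus all data nodes.
        return 1 + self.physical_data_nodes + self.virtual_data_nodes

    def data_node_hosts(self) -> list[str]:
        """Hypothetical host names for the data nodes, physical ones first."""
        physical = [f"phys{i}" for i in range(1, self.physical_data_nodes + 1)]
        virtual = [f"vm{i}" for i in range(1, self.virtual_data_nodes + 1)]
        return physical + virtual


# First layout: physical name node, six virtual data nodes.
LAYOUT_A = Layout("physical", physical_data_nodes=0, virtual_data_nodes=6)
# Second layout: virtual name node, one physical and five virtual data nodes.
LAYOUT_B = Layout("virtual", physical_data_nodes=1, virtual_data_nodes=5)

if __name__ == "__main__":
    for layout in (LAYOUT_A, LAYOUT_B):
        print(layout.name_node_kind, layout.total_nodes, layout.data_node_hosts())
```

The list returned by `data_node_hosts` is the kind of host list that could be written, one name per line, into the file Hadoop uses to enumerate its worker nodes.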