Scalable
Economical
Efficient
Reliable
[Figure: a Master Node running the Name Node, and Slave Nodes each running a Data Node]
Introduction to Hadoop and HDFS
ABHISHEK VERMA
HDFS Architecture
[Figure: nodes grouped into racks; Rack Switches connect to the Data Center Switch]
Name Node (only one per cluster)
Data Nodes (can be many)
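HDFS INTERNALS
[Figure: the Application calls the HDFS Client. The HDFS namenode holds the file namespace (/user/css534/input, block 3df2); it exchanges instructions and state with the HDFS datanodes. The client sends (block id, byte range) to an HDFS datanode and receives block data. Each HDFS datanode stores blocks on its Linux local file system.]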
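The division of labor in the figure above can be sketched as a toy model. This is only an illustration, not Hadoop code: the class names, the `client_read` helper, and the block contents are hypothetical; only the path `/user/css534/input` and block id `3df2` come from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class DataNode:
    """Stores block contents; a dict stands in for the Linux local file system."""
    name: str
    blocks: dict = field(default_factory=dict)  # block id -> bytes

    def read(self, block_id, start, end):
        # Serve a (block id, byte range) request with block data.
        return self.blocks[block_id][start:end]


@dataclass
class NameNode:
    """Holds metadata only: file path -> block ids, block id -> datanode names."""
    namespace: dict = field(default_factory=dict)
    locations: dict = field(default_factory=dict)

    def lookup(self, path):
        # Return each block of the file together with the datanodes holding it.
        return [(b, self.locations[b]) for b in self.namespace[path]]


def client_read(namenode, datanodes, path, start, end):
    """Ask the namenode where the first block lives, then read from a datanode."""
    block_id, holders = namenode.lookup(path)[0]
    return datanodes[holders[0]].read(block_id, start, end)


dn1 = DataNode("dn1", {"3df2": b"hello hdfs"})
dn2 = DataNode("dn2", {"3df2": b"hello hdfs"})
nn = NameNode({"/user/css534/input": ["3df2"]}, {"3df2": ["dn1", "dn2"]})
data = client_read(nn, {"dn1": dn1, "dn2": dn2}, "/user/css534/input", 0, 5)
print(data)
```

Note that file bytes never pass through the name node in this sketch; it only answers the metadata lookup.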
File Read
[Figure: on the client node, the HDFS client in the client JVM uses DistributedFileSystem and FSDataInputStream. Labeled steps: 1. open, 3. read, 4. read from the closest node, 6. close. The name node runs the NameNode; the data nodes run DataNodes.]
File Write
[Figure: on the client node, the HDFS client in the client JVM uses DistributedFileSystem and FSDataOutputStream. Labeled steps: 1. create, 2. create (at the NameNode), 3. write, 7. close, 8. complete (at the NameNode). The name node runs the NameNode; the data nodes run DataNodes.]
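The "read from the closest node" step can be illustrated with a small sketch. It assumes, for illustration only, that closeness is measured as tree distance between `/rack/host` locations (0 for the same host, 2 for the same rack, 4 across racks); the function names and locations are hypothetical.

```python
def distance(a, b):
    """Tree distance between two locations written as '/rack/host' paths."""
    pa = a.strip("/").split("/")
    pb = b.strip("/").split("/")
    common = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        common += 1
    # Hops up from a to the common ancestor plus hops down to b.
    return (len(pa) - common) + (len(pb) - common)


def closest(client, replicas):
    """Pick the replica location with the smallest distance to the client."""
    return min(replicas, key=lambda r: distance(client, r))


client = "/rack1/host1"
replicas = ["/rack2/host5", "/rack1/host3", "/rack2/host6"]
best = closest(client, replicas)
print(best)
```

With these inputs the same-rack replica `/rack1/host3` is chosen over the two replicas on `/rack2`.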
If a data node crashes, the crashed node is removed, the current block receives a newer id so that the partial data on the crashed node can be deleted later, and the NameNode allocates another node.
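The recovery steps described above can be sketched as follows. This is a simplified illustration, not the Hadoop implementation: the `(name, version)` block id, the `recover_pipeline` and `is_stale` helpers, and the node names are all hypothetical.

```python
def recover_pipeline(pipeline, crashed, block, spare_nodes):
    """Return (new_pipeline, new_block) after one data node crashes mid-write.

    block is a (name, version) pair; bumping the version plays the role of
    the "newer id" so the partial copy on the crashed node can be recognised.
    """
    # 1. Remove the crashed node from the pipeline.
    survivors = [n for n in pipeline if n != crashed]
    # 2. Give the current block a newer id.
    name, version = block
    new_block = (name, version + 1)
    # 3. Allocate another node (the name node's job in HDFS).
    replacement = next(
        n for n in spare_nodes if n != crashed and n not in survivors
    )
    return survivors + [replacement], new_block


def is_stale(stored_block, current_block):
    """A copy carrying an older version of the same block can be deleted later."""
    return stored_block[0] == current_block[0] and stored_block[1] < current_block[1]


pipeline, block = recover_pipeline(
    ["dn1", "dn2", "dn3"], "dn2", ("3df2", 1), ["dn2", "dn4", "dn5"]
)
print(pipeline, block)
```

If the crashed node later comes back holding `("3df2", 1)`, `is_stale` reports that copy as safe to delete.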
Configuring HDFS
Three files must be edited to configure HDFS:
1. core-site.xml
2. mapred-site.xml
3. hadoop-env.sh
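As a hedged illustration of what such a file contains, the sketch below generates the `<configuration>`/`<property>` layout used by the Hadoop `*-site.xml` files. The helper `build_site_xml` is hypothetical, and the value `hdfs://localhost:9000` is a placeholder; `fs.defaultFS` is the property name in Hadoop 2 and later (older releases use `fs.default.name`).

```python
import xml.etree.ElementTree as ET


def build_site_xml(properties):
    """Render a dict of settings in the <configuration><property> layout."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Placeholder host and port; substitute the address of the actual name node.
core_site = build_site_xml({"fs.defaultFS": "hdfs://localhost:9000"})
print(core_site)
```

The printed string can be saved as `core-site.xml` in the Hadoop configuration directory; the same layout applies to `mapred-site.xml`, while `hadoop-env.sh` is a shell script rather than XML.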
Files to Edit
Hadoop Environment Setup
RACK AWARENESS
Form a lookup file with all IP addresses in it
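A lookup file of this kind might map each IP address to a rack. The sketch below assumes a simple two-column format (IP, then rack path) and falls back to `/default-rack` for unknown addresses; the file format, the addresses, and the function names are assumptions made for illustration.

```python
DEFAULT_RACK = "/default-rack"

# Hypothetical lookup file contents: one "ip rack" pair per line.
LOOKUP = """\
# ip          rack
10.0.0.11     /rack1
10.0.0.12     /rack1
10.0.0.21     /rack2
"""


def parse_lookup(text):
    """Build an ip -> rack table, skipping blank lines and '#' comments."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        ip, rack = line.split()
        table[ip] = rack
    return table


def resolve(table, addresses):
    """Return one rack path per address, in the same order as the input."""
    return [table.get(a, DEFAULT_RACK) for a in addresses]


racks = resolve(parse_lookup(LOOKUP), ["10.0.0.12", "10.0.0.21", "10.0.0.99"])
print(" ".join(racks))
```

In a real cluster this logic would typically live in a topology script that Hadoop invokes with IP addresses as arguments and that prints one rack per address.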
HADOOP IN DETAIL
Questions?