Tool Implementation
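A driver class typically implements the Tool interface and is run through ToolRunner, so that generic options such as the -conf flag used below are parsed automatically. A minimal sketch (class name and job settings are illustrative):

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyDriver extends Configured implements Tool {
  public int run(String[] args) throws Exception {
    // getConf() already contains any -conf / -D options parsed by ToolRunner.
    JobConf conf = new JobConf(getConf(), MyDriver.class);
    conf.setJobName("my-job");
    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));
    JobClient.runJob(conf);
    return 0;
  }

  public static void main(String[] args) throws Exception {
    System.exit(ToolRunner.run(new MyDriver(), args));
  }
}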
Packaging a Job
A job's classes must be packaged into a job JAR file to send to the cluster
Any dependent JAR files can be packaged in a lib subdirectory in the job JAR file.
The client classpath
The user's client-side classpath set by hadoop jar <jar> is made up of:
The job JAR file
Any JAR files in the lib directory of the job JAR file, and the classes directory.
The classpath defined by HADOOP_CLASSPATH, if set
Launching a Job
To launch the job, we run the driver, specifying the cluster that we want to run
the job on with the -conf option.
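For example (driver class, configuration file, and paths are illustrative):
$ hadoop jar job.jar com.example.MyDriver -conf conf/cluster.xml input output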
Data Structures
Key-value pairs are the basic data structure in MapReduce
Keys and values can be: integers, floats, strings, raw bytes
They can also be arbitrary data structures
The design of MapReduce algorithms involves:
Imposing the key-value structure on arbitrary datasets
o E.g.: for a collection of Web pages, input keys may be URLs and
values may be the HTML content
In some algorithms, input keys are not used, in others they uniquely
identify a record
Keys can be combined in complex ways to design various algorithms
A MapReduce job
The programmer defines a mapper and a reducer as follows:
o map: (k1, v1) → [(k2, v2)]
o reduce: (k2, [v2]) → [(k3, v3)]
A MapReduce job consists of:
o A dataset stored on the underlying distributed filesystem, which is
split in a number of files across machines
o The mapper is applied to every input key-value pair to generate
intermediate key-value pairs
o The reducer is applied to all values associated with the same
intermediate key to generate output key-value pairs
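As a concrete illustration, here is a minimal word-count sketch using Hadoop's classic (org.apache.hadoop.mapred) API; class names and paths are illustrative:

import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;

public class WordCount {

  // map: (byte offset, line of text) -> [(word, 1)]
  public static class WordCountMapper extends MapReduceBase
      implements Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    public void map(LongWritable key, Text value,
                    OutputCollector<Text, IntWritable> output, Reporter reporter)
        throws IOException {
      for (String token : value.toString().split("\\s+")) {
        if (token.isEmpty()) continue;
        word.set(token);
        output.collect(word, ONE);
      }
    }
  }

  // reduce: (word, [counts]) -> [(word, total)]
  public static class WordCountReducer extends MapReduceBase
      implements Reducer<Text, IntWritable, Text, IntWritable> {
    public void reduce(Text key, Iterator<IntWritable> values,
                       OutputCollector<Text, IntWritable> output, Reporter reporter)
        throws IOException {
      int sum = 0;
      while (values.hasNext()) {
        sum += values.next().get();
      }
      output.collect(key, new IntWritable(sum));
    }
  }

  public static void main(String[] args) throws IOException {
    JobConf conf = new JobConf(WordCount.class);
    conf.setJobName("wordcount");
    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(IntWritable.class);
    conf.setMapperClass(WordCountMapper.class);
    conf.setReducerClass(WordCountReducer.class);
    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));
    JobClient.runJob(conf);
  }
}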
Figure: Mappers are applied to all input key-value pairs to generate an arbitrary number of intermediate pairs. Reducers are applied to all
intermediate values associated with the same intermediate key. Between the map and reduce phases lies a barrier that involves a large distributed
sort and group-by.
Restrictions
Using external resources
E.g.: data stores other than the distributed file system
Concurrent access by many map/reduce tasks
Side effects
Not allowed in functional programming
E.g.: preserving state across multiple inputs
State is kept internal
I/O and execution
External side effects using distributed data stores (e.g. BigTable)
No input (e.g. computing π), no reducers, but never no mappers
Debugging a Job
The web UI (use debug statements that log to standard error)
Custom counters
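A minimal sketch of a custom counter in the classic API; the enum name and the malformed-record check are illustrative:

// Inside a Mapper implementation (imports as in the word-count example above):
enum RecordQuality { MALFORMED }

public void map(LongWritable key, Text value,
                OutputCollector<Text, IntWritable> output, Reporter reporter)
    throws IOException {
  if (value.toString().trim().isEmpty()) {
    reporter.incrCounter(RecordQuality.MALFORMED, 1);   // visible in the job's web UI
    return;
  }
  // ... normal record processing ...
}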
Hadoop Logs
Anything written to standard output or standard error is directed to the relevant log file.
Remote Debugging
Attaching a debugger is hard to arrange when running a job on a cluster
Options:
o Reproduce the failure locally
o Use JVM debugging options
o Use task profiling
o Use IsolationRunner
Set keep.failed.task.files to true to keep a failed task's files.
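One way to set this, assuming the driver uses ToolRunner as in the sketch earlier (class and paths illustrative):
conf.setBoolean("keep.failed.task.files", true);
or, from the command line:
$ hadoop jar job.jar com.example.MyDriver -D keep.failed.task.files=true input output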
Tuning a Job
Job Submission
JobClient class
The runJob() method creates a new instance of JobClient
and then calls submitJob() on it
Simple checks on the job:
Is there an output directory?
Are there any input splits?
Can I copy the JAR of the job to HDFS?
NOTE: the JAR of the job is replicated 10 times
MapReduce Workflows
o When the processing gets more complex:
o As a rule of thumb, think about adding more jobs, rather than adding complexity to jobs.
o For more complex problems, consider a higher-level language than MapReduce, such as Pig, Hive, Cascading, Cascalog, or Crunch.
o One immediate benefit is that it frees you from the translation into MapReduce jobs,
allowing you to concentrate on the analysis you are performing.
JobControl:
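A minimal sketch of chaining two dependent jobs with the classic API's JobControl; prepareConf() and aggregateConf() are hypothetical helpers that build the two JobConf objects:

import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.jobcontrol.Job;
import org.apache.hadoop.mapred.jobcontrol.JobControl;

public class Workflow {
  public static void main(String[] args) throws Exception {
    JobConf conf1 = prepareConf();      // hypothetical: configures the first job
    JobConf conf2 = aggregateConf();    // hypothetical: configures the second job

    Job first = new Job(conf1);
    Job second = new Job(conf2);
    second.addDependingJob(first);      // second runs only after first succeeds

    JobControl control = new JobControl("workflow");
    control.addJob(first);
    control.addJob(second);

    new Thread(control).start();        // JobControl implements Runnable
    while (!control.allFinished()) {
      Thread.sleep(1000);
    }
    control.stop();
  }
}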
Classic MapReduce
Failures
One of the major benefits of using Hadoop is its ability to handle failures and allow the job to complete.
Task failure:
When user code in the map or reduce task throws a runtime exception.
The error ultimately makes it into the user logs.
Hanging tasks are dealt with differently: they are killed once mapred.task.timeout expires (see the property below).
When the jobtracker is notified of a task attempt that has failed (by the tasktracker's
heartbeat call), it will reschedule execution of the task.
The jobtracker will try to avoid rescheduling the task on a tasktracker where it has previously
failed
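For reference, the timeout is a job property set in milliseconds; the value below is the usual default of 10 minutes:
<property>
  <name>mapred.task.timeout</name>
  <value>600000</value>
</property>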
Failures
Tasktracker failure :
The jobtracker will notice a tasktracker that has stopped sending heartbeats if it hasn't received
one for 10 minutes (configured via the mapred.tasktracker.expiry.interval property, in
milliseconds)
And remove it from its pool of tasktrackers to schedule tasks on.
Jobtracker failure
Failure of the jobtracker is the most serious failure mode.
Hadoop has no mechanism for dealing with jobtracker failure; it is a single point of failure,
so in this case all running jobs fail.
After restarting a jobtracker, any jobs that were running at the time it was stopped will need to
be resubmitted.
Partitioners
Partitioners are responsible for:
Dividing up the intermediate key space
Assigning intermediate key-value pairs to reducers
They specify the reduce task to which each intermediate key-value pair must be copied
Hash-based partitioner
Computes the hash of the key modulo the number of reducers r
This ensures a roughly even partitioning of the key space
o However, it ignores values: this can cause imbalance in the data processed by
each reducer
When dealing with complex keys, even the base partitioner may need customization
Combiners
Combiners are an (optional) optimization:
Allow local aggregation before the shuffle and sort phase
Each combiner operates in isolation
Essentially, combiners are used to save bandwidth
E.g.: word count program
Combiners can also be implemented inside the mapper using local data structures (see the sketch below)
E.g., an associative array keeps intermediate counts and their aggregates
The map task then only emits once all input records of its split are
processed
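A hedged sketch of the in-mapper variant with the classic API (a conventional combiner would instead be registered with conf.setCombinerClass(WordCountReducer.class)); class names are illustrative:

// In addition to the word-count imports above, requires java.util.HashMap and java.util.Map.
public static class InMapperCombiningMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, IntWritable> {

  private final Map<String, Integer> counts = new HashMap<String, Integer>();
  private OutputCollector<Text, IntWritable> out;       // remembered for close()

  public void map(LongWritable key, Text value,
                  OutputCollector<Text, IntWritable> output, Reporter reporter)
      throws IOException {
    out = output;
    for (String token : value.toString().split("\\s+")) {
      if (token.isEmpty()) continue;
      Integer current = counts.get(token);
      counts.put(token, current == null ? 1 : current + 1);   // local aggregation
    }
  }

  @Override
  public void close() throws IOException {
    // Emit once, after every record of this map task's split has been seen.
    for (Map.Entry<String, Integer> e : counts.entrySet()) {
      out.collect(new Text(e.getKey()), new IntWritable(e.getValue()));
    }
  }
}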
Mapper
Reducer
Tutorial: MRUnit
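A minimal MRUnit test sketch for the word-count mapper shown earlier (assumes MRUnit's classic-API MapDriver and JUnit 4; names are illustrative):

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.MapDriver;
import org.junit.Test;

public class WordCountMapperTest {
  @Test
  public void emitsOneCountPerWord() throws Exception {
    MapDriver<LongWritable, Text, Text, IntWritable> driver =
        MapDriver.newMapDriver(new WordCount.WordCountMapper());
    driver.withInput(new LongWritable(0), new Text("hello hello world"))
          .withOutput(new Text("hello"), new IntWritable(1))
          .withOutput(new Text("hello"), new IntWritable(1))
          .withOutput(new Text("world"), new IntWritable(1))
          .runTest();
  }
}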
MapReduce Types
Input / output to mappers and reducers
a. map: (k1, v1) → [(k2, v2)]
b. reduce: (k2, [v2]) → [(k3, v3)]
In Hadoop, a mapper is created as follows:
a. void map(K1 key, V1 value, OutputCollector<K2,V2> output, Reporter reporter)
Types:
a. K types implement WritableComparable
b. V types implement Writable
What is a Writable?
Hadoop defines its own classes for strings (Text), integers (IntWritable), etc.
All keys are instances of WritableComparable
o Why comparable? Keys must be sortable, because the framework sorts and groups them during the shuffle.
All values are instances of Writable
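A minimal sketch of a custom value type implementing Writable (a key type would implement WritableComparable and add compareTo()); field names are illustrative:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Writable;

public class PageStats implements Writable {
  private long hits;
  private double avgLatencyMs;

  public void write(DataOutput out) throws IOException {   // serialize the fields
    out.writeLong(hits);
    out.writeDouble(avgLatencyMs);
  }

  public void readFields(DataInput in) throws IOException { // deserialize in the same order
    hits = in.readLong();
    avgLatencyMs = in.readDouble();
  }
}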
Reading Data
Datasets are specified by InputFormats
An InputFormat defines the input data (e.g. a file, a directory)
An InputFormat is a factory for RecordReader objects that extract
key-value records from the input source
An InputFormat identifies partitions of the data that form an InputSplit
An InputSplit is a (reference to a) chunk of the input processed by
a single map
o Largest split is processed first
Each split is divided into records, and the map processes each
record (a key-value pair) in turn
Splits and records are logical; they are not physically bound to a file
Record Readers
Each InputFormat provides its own RecordReader implementation
LineRecordReader
Used by TextInputFormat; reads a line from a text file
KeyValueLineRecordReader
Used by KeyValueTextInputFormat
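For example, the input format chosen on the JobConf determines which RecordReader produces the mapper's records (path illustrative):
conf.setInputFormat(KeyValueTextInputFormat.class);    // key/value split on the first tab of each line
FileInputFormat.setInputPaths(conf, new Path("input"));
// With the default TextInputFormat, LineRecordReader supplies (byte offset, line) pairs instead.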
WritableComparator
Compares WritableComparable data
By default, deserializes the objects and calls their compareTo() method
Can provide a fast path that compares serialized bytes directly
Configured through:
JobConf.setOutputValueGroupingComparator()
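A minimal sketch of a raw comparator that takes the fast path for IntWritable keys by comparing the serialized bytes directly (registration shown for the classic API):

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.WritableComparator;

public class FastIntComparator extends WritableComparator {
  public FastIntComparator() {
    super(IntWritable.class);
  }

  @Override
  public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
    int a = readInt(b1, s1);    // read the big-endian int without deserializing
    int b = readInt(b2, s2);
    return a < b ? -1 : (a == b ? 0 : 1);
  }
}
// Registered as the sort comparator with: conf.setOutputKeyComparatorClass(FastIntComparator.class);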
Partitioner
int getPartition(key, value, numPartitions)
Outputs the partition number for a given key
One partition == all values sent to a single reduce task
HashPartitioner used by default
Uses key.hashCode() to compute the partition number
JobConf used to set Partitioner implementation
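A minimal custom partitioner sketch with the classic API, partitioning Text keys by hash (the same idea HashPartitioner uses):

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.Partitioner;

public class WordPartitioner implements Partitioner<Text, IntWritable> {
  public void configure(JobConf conf) { }               // no setup needed

  public int getPartition(Text key, IntWritable value, int numPartitions) {
    // Mask the sign bit so the modulo result is never negative.
    return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
  }
}
// Registered on the job with: conf.setPartitionerClass(WordPartitioner.class);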
The Reducer
void reduce(K2 key, Iterator<V2> values, OutputCollector<K3, V3> output, Reporter reporter)
Keys and values sent to one partition all go to the same reduce task
Calls are sorted by key
Early keys are reduced and output before late keys
Joins
MapReduce can perform joins between large datasets
Join:
Map-Side Joins
A map-side join between large inputs works by
performing the join before the data reaches the map
function.
The inputs to each map must be partitioned and
sorted in a particular way.
Each input dataset must be divided into the
same number of partitions, and it must be sorted by
the same key (the join key) in each source.
All the records for a particular key must reside in
the same partition.
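A hedged configuration sketch using the classic API's CompositeInputFormat (org.apache.hadoop.mapred.join); the join expression, input format, and paths are illustrative, and both inputs must already be partitioned and sorted identically on the join key:
conf.setInputFormat(CompositeInputFormat.class);
conf.set("mapred.join.expr", CompositeInputFormat.compose(
    "inner", KeyValueTextInputFormat.class,
    new Path("input/customers"), new Path("input/orders")));
// The map function then receives the join key plus a TupleWritable holding one value per source.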
Reduce-Side Joins
A reduce-side join is more general than a map-side join
the input datasets don't have to be structured in
any particular way
the mapper tags each record with its source and
uses the join key as the map output key, so that the
records with the same key are brought together in
the reducer.
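A minimal sketch of the tagging step with the classic API (class names are illustrative; assumes the join key is the first comma-separated field of each record):

public static class OrdersMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, Text> {
  public void map(LongWritable offset, Text line,
                  OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    String[] fields = line.toString().split(",", 2);
    if (fields.length < 2) return;                     // skip malformed records
    // key = join key; value = source tag + the rest of the record
    output.collect(new Text(fields[0]), new Text("ORDERS\t" + fields[1]));
  }
}
// A second mapper tags records from the other source (e.g. "CUSTOMERS");
// the reducer then joins all values that arrive under the same join key.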
HDFS Architecture
Figure: HDFS architecture. The NameNode stores metadata (file names and replica counts, e.g. /home/foo/data with 6 replicas) and serves clients' metadata and block operations; clients read from and write to DataNodes directly, and DataNodes spread across racks (Rack1, Rack2) replicate blocks among themselves.
Operating System
As Hadoop is written in Java, it is mostly portable
between different operating systems
Installation Steps
Install Java
Install ssh and sshd
$ gunzip hadoop-0.18.0.tar.gz
$ tar xvf hadoop-0.18.0.tar
Additional Configuration
conf/masters
contains the hostname of the SecondaryNameNode
It should be a fully-qualified domain name.
conf/slaves
contains the hostname of every machine in the cluster that
should start TaskTracker and DataNode daemons
Ex:
slave01
slave02
slave03
Advanced Configuration
Enable passwordless ssh:
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
Advanced Configuration
Various directories should be created on each
node
The NameNode requires the NameNode metadata
directory
$ mkdir -p /home/hadoop/dfs/name
Advanced Configuration (cont.)
bin/slaves.sh allows a command to be
executed on all nodes in the slaves file.
$ mkdir -p /tmp/hadoop
$ export HADOOP_CONF_DIR=${HADOOP_HOME}/conf
$ export HADOOP_SLAVES=${HADOOP_CONF_DIR}/slaves
$ ${HADOOP_HOME}/bin/slaves.sh "mkdir -p /tmp/hadoop"
$ ${HADOOP_HOME}/bin/slaves.sh "mkdir -p /home/hadoop/dfs/data"
Format HDFS
$ bin/hadoop namenode -format
Important Directories
Directory            Default location                   Suggested location
HADOOP_LOG_DIR       ${HADOOP_HOME}/logs                /var/log/hadoop
hadoop.tmp.dir       /tmp/hadoop-${user.name}           /tmp/hadoop
dfs.name.dir         ${hadoop.tmp.dir}/dfs/name         /home/hadoop/dfs/name
dfs.data.dir         ${hadoop.tmp.dir}/dfs/data         /home/hadoop/dfs/data
mapred.system.dir    ${hadoop.tmp.dir}/mapred/system    /hadoop/mapred/system
Recommended configuration
dfs.name.dir and dfs.data.dir should be moved out of hadoop.tmp.dir.
Adjust mapred.system.dir
Selecting Machines
Hadoop is designed to take advantage of
whatever hardware is available
Hadoop jobs written in Java can consume
between 1 and 2 GB of RAM per core
If you use Hadoop Streaming to write your jobs
in a scripting language such as Python, more
memory may be advisable.
Cluster Configurations
Small Clusters: 2-10 Nodes
Medium Clusters: 10-40 Nodes
Large Clusters: Multiple Racks
configuration in conf/hadoop-site.xml
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>head.server.node.com:9001</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://head.server.node.com:9000</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/home/hadoop/dfs/data</value>
<final>true</final>
</property>
<property>
<name>dfs.name.dir</name>
<value>/home/hadoop/dfs/name</value>
<final>true</final>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/tmp/hadoop</value>
<final>true</final>
</property>
<property>
<name>mapred.system.dir</name>
<value>/hadoop/mapred/system</value>
<final>true</final>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
NameNode backup
The cluster's hadoop-site.xml file should then
instruct the NameNode to write to the backup
directory as well:
<property>
<name>dfs.name.dir</name>
<value>/home/hadoop/dfs/name,/mnt/namenode-backup</value>
<final>true</final>
</property>
Backup NameNode
The backup machine can also serve as the SecondaryNameNode
this is not a failover NameNode process
It takes periodic snapshots of the NameNode's metadata
Nodes must be decommissioned on a schedule that permits
replication of the blocks being decommissioned.
The exclude files are specified in conf/hadoop-site.xml:
<property>
<name>dfs.hosts.exclude</name>
<value>/home/hadoop/excludes</value>
<final>true</final>
</property>
<property>
<name>mapred.hosts.exclude</name>
<value>/home/hadoop/excludes</value>
<final>true</final>
</property>
create an empty file with this name:
$ touch /home/hadoop/excludes
Replication Setting
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>mapred.child.java.opts</name>
<value>-Xmx512m</value>
</property>
Tutorial
Configure Hadoop Cluster in two nodes.
Tutorial-Installed Hadoop in Cluster.docx
Property                                Range           Description
io.file.buffer.size                     32768-131072
io.sort.factor                          50-200
io.sort.mb                              50-200
mapred.reduce.parallel.copies           20-50
tasktracker.http.threads                40-50
mapred.tasktracker.map.tasks.maximum