
Hadoop 3.x Installation with HA – Automatic Failover

Hosts Details:

IP Address       FQDN                Hostname   Role in Storage Layer                   Role in Processing Layer
192.168.56.181   h3n1.hadoop.com     h3n1       Namenode, NFS Client, Zookeeper, ZKFC   Resource Manager
192.168.56.182   h3n2.hadoop.com     h3n2       Namenode, NFS Client, Zookeeper, ZKFC   NA
192.168.56.183   h3n3.hadoop.com     h3n3       Namenode, Datanode, Zookeeper           Node Manager
192.168.56.184   h3n4.hadoop.com     h3n4       Namenode, Datanode                      Node Manager
192.168.56.185   h3n5.hadoop.com     h3n5       Datanode                                Node Manager
192.168.56.186   h3edge.hadoop.com   h3edge     Edge Node, NFS Server                   Eco System Tools

TAR balls to be downloaded for this installation:

TAR Ball Name                Download Location                                                   TAR Ball location in VM
hadoop-3.0.0-alpha4.tar.gz   https://archive.apache.org/dist/hadoop/core/hadoop-3.0.0-alpha4/   /var/www/html/hadoop_tools/

Note: zookeeper-3.4.6.tar.gz (used in the ZooKeeper setup step below) should also be placed under /var/www/html/hadoop_tools/.

Clone the VMs and change the IP addresses as above.

1. Setup Password-less SSH for hadoop cluster installation

NOTE: This step should be followed on all the masters (Active NN, Standby NN, RM, etc.).

rm -rf ~/.ssh/id_rsa*
ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
ls -ltr ~/.ssh
for i in 192.168.56.{181,182,183,184,185,186}; do sshpass -p welcome1 ssh-copy-id $i; done
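
To confirm that password-less SSH works, a quick check (assumption: the same host list as above):

for i in 192.168.56.{181,182,183,184,185,186}; do ssh -o BatchMode=yes $i hostname; done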

2. Add the below on any one of the master hosts

sudo vi /etc/clustershell/groups.d/local.cfg

nn: 192.168.56.181 192.168.56.182 192.168.56.183 192.168.56.184


jn: 192.168.56.181 192.168.56.182 192.168.56.183
dn: 192.168.56.183 192.168.56.184 192.168.56.185
zk: 192.168.56.181 192.168.56.182 192.168.56.183
rm: 192.168.56.181 192.168.56.182 192.168.56.183
hadoop: 192.168.56.181 192.168.56.182 192.168.56.183 192.168.56.184 192.168.56.185
all: 192.168.56.181 192.168.56.182 192.168.56.183 192.168.56.184 192.168.56.185 192.168.56.186

sudo sshpass -p "welcome1" scp /etc/clustershell/groups.d/local.cfg 192.168.56.182:/etc/clustershell/groups.d/local.cfg
sudo sshpass -p "welcome1" scp /etc/clustershell/groups.d/local.cfg 192.168.56.183:/etc/clustershell/groups.d/local.cfg
sudo sshpass -p "welcome1" scp /etc/clustershell/groups.d/local.cfg 192.168.56.184:/etc/clustershell/groups.d/local.cfg


Verify that clush can reach all the hosts:

clush -g all -b "date"

3. Configure NTPD Service.

clush -g all -b "sudo sed -i 's/^server /#server /g' /etc/ntp.conf"


clush -g all -x 192.168.56.181 -b "echo 'server 192.168.56.181 prefer' | sudo tee -a /etc/ntp.conf > /dev/null 2>&1"

If you don't have internet access to your hosts:

clush -w 192.168.56.181 -b "echo 'server 127.127.1.0' | sudo tee -a /etc/ntp.conf > /dev/null 2>&1"
clush -w 192.168.56.181 -b "echo 'fudge 127.127.1.0 stratum 10' | sudo tee -a /etc/ntp.conf > /dev/null 2>&1"

Restart NTPD Service & Sync Time:

clush -g all -b "sudo systemctl restart ntpd"


clush -g all -x 192.168.56.181 -b "/usr/sbin/ntpdate -d 192.168.56.181"
clush -g all -x 192.168.56.181 -b "/usr/sbin/ntpq -p"

clush -g all -b "date"

4. Download the Hadoop 3.x tarball

Download hadoop-3.0.0-alpha4.tar.gz from the internet and untar it as below.

http://mirror.fibergrid.in/apache/hadoop/common/

From the clush node:

clush -g all -b "sudo unlink /usr/local/hadoop" > /dev/null 2>&1;


clush -g all -b "sudo rm -rf /usr/local/hadoop-3.0.0-alpha4"
clush -g all -b "sudo tar -xvzf /var/www/html/hadoop_tools/hadoop-3.0.0-alpha4.tar.gz -C /usr/local/"
clush -g all -b "du -sch /usr/local/hadoop-3.0.0-alpha4"
clush -g all -b "sudo ln -s /usr/local/hadoop-3.0.0-alpha4 /usr/local/hadoop"
clush -g all -b "sudo chown -R hdpuser:hdpadmin /usr/local/hadoop*"
clush -g all -b "ls -ltr /usr/local | grep -i hadoop"

5. Setup HOME paths

sudo vi /etc/profile

Copy the below to the end of the file.

export JAVA_HOME=/usr/java/default
export ZOOKEEPER_HOME=/usr/local/zookeeper
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_HOME_WARN_SUPPRESS=1
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_ROOT_LOGGER="WARN,DRFA"


export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native
export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native:$JAVA_LIBRARY_PATH
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
export YARN_HOME=$HADOOP_HOME
export YARN_HOME_WARN_SUPPRESS=1
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME_WARN_SUPPRESS=1
export HADOOP_COMMON_HOME=$HADOOP_HOME

PATH=$PATH:$HOME/bin:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:${ZOOKEEPER_HOME}/bin:${JAVA_HOME}/bin
export PATH

Copy the /etc/profile file to all the other nodes from h3n1:

clush -g all -x 192.168.56.181 --copy /etc/profile --dest /tmp/


clush -g all -x 192.168.56.181 "sudo cp /tmp/profile /etc/"

clush -g all -b "source /etc/profile"

Add the above export commands to the env files as well (excluding PATH):

sudo vi /usr/local/hadoop/etc/hadoop/hadoop-env.sh
sudo vi /usr/local/hadoop/etc/hadoop/yarn-env.sh
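
A quick sanity check on h3n1 (a sketch; it assumes the profile and env edits above are in place):

source /etc/profile
hadoop version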

6. Change the XML files as below

sudo vi /usr/local/hadoop/etc/hadoop/core-site.xml

<property>
<name>fs.defaultFS</name>
<value>hdfs://mycluster</value>
</property>

<property>
<name>ha.zookeeper.quorum</name>
<value>h3n1.hadoop.com:2181,h3n2.hadoop.com:2181,h3n3.hadoop.com:2181</value>
</property>

<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://h3n1.hadoop.com:8485;h3n2.hadoop.com:8485;h3n3.hadoop.com:8485/mycluster</value>
</property>

<property>

<name>topology.script.file.name</name>
<value>/usr/local/hadoop/etc/hadoop/topology.sh</value>
</property>

<property>
<name>fs.trash.interval</name>
<value>360</value>
</property>

<property>
<name>fs.trash.checkpoint.interval</name>
<value>2</value>
</property>

sudo vi /usr/local/hadoop/etc/hadoop/hdfs-site.xml

<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
</property>

<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>h3n1,h3n2,h3n3,h3n4</value>
</property>

<property>
<name>dfs.namenode.rpc-address.mycluster.h3n1</name>
<value>h3n1.hadoop.com:9000</value>
</property>

<property>
<name>dfs.namenode.rpc-address.mycluster.h3n2</name>
<value>h3n2.hadoop.com:9000</value>
</property>

<property>
<name>dfs.namenode.rpc-address.mycluster.h3n3</name>
<value>h3n3.hadoop.com:9000</value>
</property>

<property>
<name>dfs.namenode.rpc-address.mycluster.h3n4</name>
<value>h3n4.hadoop.com:9000</value>
</property>

<property>
<name>dfs.namenode.http-address.mycluster.h3n1</name>
<value>h3n1.hadoop.com:9870</value>
</property>


<property>
<name>dfs.namenode.http-address.mycluster.h3n2</name>
<value>h3n2.hadoop.com:9870</value>
</property>

<property>
<name>dfs.namenode.http-address.mycluster.h3n3</name>
<value>h3n3.hadoop.com:9870</value>
</property>

<property>
<name>dfs.namenode.http-address.mycluster.h3n4</name>
<value>h3n4.hadoop.com:9870</value>
</property>

<property>
<name>dfs.replication</name>
<value>3</value>
</property>

<property>
<name>dfs.block.size</name>
<value>268435456</value>
</property>

<property>
<name>dfs.namenode.name.dir</name>
<value>file:/mnt/disk1/name</value>
</property>

<property>
<name>dfs.datanode.data.dir</name>
<value>file:/mnt/disk1/data,file:/mnt/disk2/data,file:/mnt/disk3/data</value>
</property>

<property>
<name>dfs.journalnode.edits.dir</name>
<value>/mnt/disk1/jnedits</value>
</property>

<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>

<property>
<name>dfs.client.failover.proxy.provider.mycluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>


<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence
shell(/bin/true)
</value>
</property>

<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/hdpuser/.ssh/id_rsa</value>
</property>

<property>
<name>dfs.namenode.checkpoint.dir</name>
<value>/mnt/disk1/snn</value>
</property>

<property>
<name>dfs.namenode.checkpoint.edits.dir</name>
<value>/mnt/disk1/snn</value>
</property>

<property>
<name>dfs.namenode.checkpoint.period</name>
<value>3600</value>
</property>

<property>
<name>dfs.ha.log-roll.period</name>
<value>600</value>
</property>

<property>
<name>dfs.namenode.acls.enabled</name>
<value>true</value>
</property>

sudo vi /usr/local/hadoop/etc/hadoop/workers

(Note: in Hadoop 3.x the workers file replaces the slaves file used in Hadoop 2.x.)

192.168.56.183
192.168.56.184
192.168.56.185

Create directories on the namenodes, journalnodes and datanodes. From the clush node:

clush -g nn -b "sudo mkdir -p /mnt/disk1/name"


clush -g jn -b "sudo mkdir -p /mnt/disk1/jnedits"
clush -g nn -b "sudo mkdir -p /mnt/disk1/snn"
clush -g dn -b "sudo mkdir -p /mnt/disk1/data"
clush -g dn -b "sudo mkdir -p /mnt/disk2/data"
clush -g dn -b "sudo mkdir -p /mnt/disk3/data"
clush -g all -b "sudo chown -R hdpuser:hdpadmin /mnt"

Copy all the XMLs to other nodes.

clush -g all -x 192.168.56.181 "sudo rm -rf /tmp/hadoop"


clush -g all -x 192.168.56.181 --copy /usr/local/hadoop/etc/hadoop --dest /tmp/
clush -g all -x 192.168.56.181 "sudo cp -r /tmp/hadoop /usr/local/hadoop/etc/"
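
Optionally, confirm the HA settings are picked up (a sketch; run on h3n1 after the copy above):

hdfs getconf -confKey dfs.nameservices
hdfs getconf -confKey dfs.ha.namenodes.mycluster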

7. Create log directories to store the logs.

clush -g all -b "sudo mkdir /usr/local/hadoop/logs"


clush -g all -b "sudo chmod 777 -R /usr/local/hadoop/logs"
clush -g all -b "sudo chown hdpuser:hdpadmin -R /usr/local/hadoop/logs"

8. Setting up ZooKeeper

Run the below from the master host:

clush -g zk -b "date"

clush -g zk -b "sudo unlink /usr/local/zookeeper" > /dev/null 2>&1;


clush -g zk -b "sudo rm -rf /usr/local/zookeeper-3.4.6"
clush -g zk -b "sudo tar -xvzf /var/www/html/hadoop_tools/zookeeper-3.4.6.tar.gz -C /usr/local/"
clush -g zk -b "du -sch /usr/local/zookeeper-3.4.6"
clush -g zk -b "sudo ln -s /usr/local/zookeeper-3.4.6 /usr/local/zookeeper"
clush -g zk -b "sudo chown -R hdpuser:hdpadmin /usr/local/zookeeper*"
clush -g zk -b "ls -ltr /usr/local/ | grep -i zookeeper"

Change the zookeeper configuration file

sudo cp /usr/local/zookeeper/conf/zoo_sample.cfg /usr/local/zookeeper/conf/zoo.cfg


sudo sed -i 's/^dataDir/#dataDir/g' /usr/local/zookeeper/conf/zoo.cfg
sudo vi /usr/local/zookeeper/conf/zoo.cfg

Comment out the dataDir property (the sed above does this) and add the below at the end of the file:

dataDir=/mnt/disk1/zkdata
server.1=h3n1.hadoop.com:2888:3888
server.2=h3n2.hadoop.com:2888:3888
server.3=h3n3.hadoop.com:2888:3888

Copy the ZooKeeper conf folder to all the other hosts:

clush -g zk -x 192.168.56.181 "sudo rm -rf /tmp/conf "


clush -g zk -x 192.168.56.181 --copy /usr/local/zookeeper/conf --dest /tmp/
clush -g zk -x 192.168.56.181 "sudo cp -r /tmp/conf /usr/local/zookeeper/"

clush -g zk -b "echo 'ZOO_LOG_DIR=/usr/local/zookeeper/logs' | sudo tee -a /usr/local/zookeeper/bin/zkEnv.sh > /dev/null"
clush -g zk -b "sudo mkdir -p /mnt/disk1/zkdata"
clush -g zk -b "sudo chown -R hdpuser:hdpadmin /mnt/disk1/zkdata"
clush -g zk -b "sudo chown -R hdpuser:hdpadmin /usr/local/zookeeper*"
clush -g zk -b "sudo touch /mnt/disk1/zkdata/myid"
clush -w 192.168.56.181 -b "echo 1 | sudo tee /mnt/disk1/zkdata/myid > /dev/null"
clush -w 192.168.56.182 -b "echo 2 | sudo tee /mnt/disk1/zkdata/myid > /dev/null"
clush -w 192.168.56.183 -b "echo 3 | sudo tee /mnt/disk1/zkdata/myid > /dev/null"
clush -g zk -b "cat /mnt/disk1/zkdata/myid"

The ZooKeeper myid file should show as below:

h3n1 - 1
h3n2 - 2
h3n3 - 3

Start ZooKeeper on all the nodes.

clush -g zk -b "/usr/local/zookeeper/bin/zkServer.sh start"

To check whether ZooKeeper is working fine:

clush -g zk -b "/usr/local/zookeeper/bin/zkServer.sh status"

or on each node, using its hostname:

zkCli.sh -server h3n1.hadoop.com:2181
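
A quick non-interactive check (a sketch, assuming the quorum is already up) that the ensemble answers:

/usr/local/zookeeper/bin/zkCli.sh -server h3n1.hadoop.com:2181 ls /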

clush -g all -b "jps | grep -v Jps; echo;"

9. Setup Rack Topology

Rack Awareness: Create the topology.sh file as below.

sudo vi /usr/local/hadoop/etc/hadoop/topology.sh

#==================================
#!/bin/bash
# Looks up each host argument in topology.data and prints its rack.
while [ $# -gt 0 ] ; do
  nodeArg=$1
  exec< /usr/local/hadoop/etc/hadoop/topology.data
  result=""
  while read line ; do
    ar=( $line )
    if [ "${ar[0]}" = "$nodeArg" ] ; then
      result="${ar[1]}"
    fi
  done
  shift
  if [ -z "$result" ] ; then
    # trailing space keeps multiple arguments separated in the output
    echo -n "/default "
  else
    echo -n "$result "
  fi
done
#==================================

sudo chmod 755 /usr/local/hadoop/etc/hadoop/topology.sh

Create topology.data file as below.

sudo vi /usr/local/hadoop/etc/hadoop/topology.data

192.168.56.181 /rack1
192.168.56.182 /rack2
192.168.56.183 /rack1
192.168.56.184 /rack2
192.168.56.185 /rack2
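
A quick local test of the script (a sketch; the expected output follows from the data file above):

/usr/local/hadoop/etc/hadoop/topology.sh 192.168.56.183 192.168.56.185
# should print: /rack1 /rack2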

Once HDFS is up (step 10 below), verify the rack mapping:

hdfs dfsadmin -printTopology

10. Start Hadoop Daemons

Format the ZKFC znode in ZooKeeper and start the ZKFC daemons (formatZK only needs to succeed once for the cluster):

clush -g nn -b "/usr/local/hadoop/bin/hdfs zkfc -formatZK -force"


clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon start zkfc"

Start Journal Nodes:

clush -g jn -b "/usr/local/hadoop/bin/hdfs --daemon start journalnode"

Only for first-time activities:

On the active NN:

hdfs namenode -format


hdfs --daemon start namenode

On all standby NNs:

hdfs namenode -bootstrapStandby


hdfs --daemon start namenode

clush -g all -b "jps | grep -v Jps; echo;"

Check the below folders:

clush -g nn -b "ls /mnt/disk1/name/current/"


clush -g jn -b "ls /mnt/disk1/jnedits/mycluster/"

Start DataNodes

clush -g dn -b "/usr/local/hadoop/bin/hdfs --daemon start datanode"

clush -g all -b "jps | grep -v Jps; echo;"

Check the status of each NameNode:

hdfs haadmin -getServiceState h3n1


hdfs haadmin -getServiceState h3n2
hdfs haadmin -getServiceState h3n3
hdfs haadmin -getServiceState h3n4

Fail over to another node.

hdfs haadmin -failover h3n1 h3n2


hdfs haadmin -failover h3n2 h3n3
hdfs haadmin -failover h3n3 h3n4
hdfs haadmin -failover h3n4 h3n1
hdfs haadmin -getServiceState h3n1
hdfs haadmin -getServiceState h3n2
hdfs haadmin -getServiceState h3n3
hdfs haadmin -getServiceState h3n4
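
The same status check as a one-liner (a convenience sketch):

for nn in h3n1 h3n2 h3n3 h3n4; do echo -n "$nn: "; hdfs haadmin -getServiceState $nn; done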

Check the below folders:

clush -g nn -b "ls /mnt/disk1/name/current/"


clush -g jn -b "ls /mnt/disk1/jnedits/mycluster/current/"
clush -g dn -b "ls /mnt/disk1/data/current/"
clush -g dn -b "ls /mnt/disk2/data/current/"
clush -g dn -b "ls /mnt/disk3/data/current/"

To save the namespace:

hdfs dfsadmin -safemode enter


hdfs dfsadmin -saveNamespace
hdfs dfsadmin -safemode leave

11. Start the Hadoop Storage Layer


Stop any services that were started earlier:

clush -g dn -b "/usr/local/hadoop/bin/hdfs --daemon stop datanode"


clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon stop namenode"
clush -g jn -b "/usr/local/hadoop/bin/hdfs --daemon stop journalnode"
clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon stop zkfc"
clush -g zk -b "/usr/local/zookeeper/bin/zkServer.sh stop"
clush -g all -b "jps | grep -v Jps; echo;"

To start the entire cluster:

clush -g zk -b "/usr/local/zookeeper/bin/zkServer.sh start"


clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon start zkfc"
clush -g jn -b "/usr/local/hadoop/bin/hdfs --daemon start journalnode"
clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon start namenode"
clush -g dn -b "/usr/local/hadoop/bin/hdfs --daemon start datanode"
clush -g all -b "jps | grep -v Jps; echo;"

To see the fsimage & edits files

clush -g nn -b "ls /mnt/disk1/name/current/"


clush -g jn -b "ls /mnt/disk1/jnedits/mycluster/current/"

seen_txid: This contains the transaction ID of the last checkpoint (merge of edits into an fsimage) or edit log roll
(finalization of the current edits_inprogress and creation of a new one). The file is not updated on every transaction,
only on a checkpoint or an edit log roll.

committed-txid: Tracks the last transaction ID committed by a NameNode.

last-promised-epoch: When a NN becomes active, it increments the last-promised-epoch. While writing edits to the edit
log, the NN sends this epoch to the JNs to confirm the latest active NN. Edits from a previous active NN are discarded.

last-writer-epoch: This contains the epoch number associated with the NN that last actually wrote a transaction.
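
To peek at these files in this setup (a sketch; paths per the directories configured above):

clush -g nn -b "cat /mnt/disk1/name/current/seen_txid"
clush -g jn -b "cat /mnt/disk1/jnedits/mycluster/current/committed-txid"
clush -g jn -b "cat /mnt/disk1/jnedits/mycluster/current/last-promised-epoch"
clush -g jn -b "cat /mnt/disk1/jnedits/mycluster/current/last-writer-epoch"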

Command to roll the edits manually:

hdfs dfsadmin -rollEdits

12. Some interesting points about the storage layer

Check the default Hadoop values in 3.x:

sudo jar -tf /usr/local/hadoop/share/hadoop/common/hadoop-common-3.0.0-alpha4.jar | grep core-


sudo jar -tf /usr/local/hadoop/share/hadoop/hdfs/hadoop-hdfs-3.0.0-alpha4.jar | grep hdfs-

sudo jar -xf /usr/local/hadoop/share/hadoop/common/hadoop-common-3.0.0-alpha4.jar core-default.xml
sudo jar -xf /usr/local/hadoop/share/hadoop/hdfs/hadoop-hdfs-3.0.0-alpha4.jar hdfs-default.xml
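
For example, to look up a single default from the extracted file (assumption: it was extracted into the current directory):

grep -A1 "<name>dfs.blocksize</name>" hdfs-default.xml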


13. To start YARN

clush -g rm -b "date"

On h3n1:

sudo vi /usr/local/hadoop/etc/hadoop/yarn-site.xml

<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>

<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>

<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>

<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>mycluster</value>
</property>

<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>h3n1,h3n2,h3n3</value>
</property>

<property>
<name>yarn.resourcemanager.hostname.h3n1</name>
<value>h3n1.hadoop.com</value>
</property>

<property>
<name>yarn.resourcemanager.hostname.h3n2</name>
<value>h3n2.hadoop.com</value>
</property>

<property>
<name>yarn.resourcemanager.hostname.h3n3</name>
<value>h3n3.hadoop.com</value>
</property>

<property>
<name>yarn.resourcemanager.webapp.address.h3n1</name>
<value>h3n1.hadoop.com:8088</value>

</property>

<property>
<name>yarn.resourcemanager.webapp.address.h3n2</name>
<value>h3n2.hadoop.com:8088</value>
</property>

<property>
<name>yarn.resourcemanager.webapp.address.h3n3</name>
<value>h3n3.hadoop.com:8088</value>
</property>

<property>
<name>yarn.resourcemanager.zk-address</name>
<value>h3n1.hadoop.com:2181,h3n2.hadoop.com:2181,h3n3.hadoop.com:2181</value>
</property>

<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>

<property>
<name>yarn.client.failover-proxy-provider</name>
<value>org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider</value>
</property>

<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>

<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>

<property>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>/apps/yarn/logs</value>
</property>

<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>1296000</value>
</property>

<property>
<name>yarn.nodemanager.env-whitelist</name>
<value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>

sudo cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml
(If the .template file is not present, as Hadoop 3.x ships mapred-site.xml directly, skip the copy and just edit the file.)
sudo vi /usr/local/hadoop/etc/hadoop/mapred-site.xml

Add the below between <configuration> and </configuration>:

<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>

<property>
<name>mapreduce.jobhistory.address</name>
<value>h3n1.hadoop.com:10020</value>
</property>

<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>h3n1.hadoop.com:19888</value>
</property>

Copy all the XMLs to other nodes.

clush -g all -x 192.168.56.181 "sudo rm -rf /tmp/hadoop"


clush -g all -x 192.168.56.181 --copy /usr/local/hadoop/etc/hadoop --dest /tmp/
clush -g all -x 192.168.56.181 "sudo cp -r /tmp/hadoop /usr/local/hadoop/etc/"

Start YARN daemons:

clush -g rm -b "/usr/local/hadoop/bin/yarn --daemon stop resourcemanager"


clush -g dn -b "/usr/local/hadoop/bin/yarn --daemon stop nodemanager"
clush -g rm -b "/usr/local/hadoop/bin/yarn --daemon start resourcemanager"
clush -g dn -b "/usr/local/hadoop/bin/yarn --daemon start nodemanager"
clush -g all -b "jps | grep -v Jps; echo;"

On h3n1:
mapred --daemon stop historyserver
mapred --daemon start historyserver

yarn rmadmin -getServiceState h3n1


yarn rmadmin -getServiceState h3n2
yarn rmadmin -getServiceState h3n3

To stop the entire cluster:

clush -w 192.168.56.181 -b "/usr/local/hadoop/bin/mapred --daemon stop historyserver"


clush -g rm -b "/usr/local/hadoop/bin/yarn --daemon stop resourcemanager"
clush -g dn -b "/usr/local/hadoop/bin/yarn --daemon stop nodemanager"


clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon stop namenode"


clush -g dn -b "/usr/local/hadoop/bin/hdfs --daemon stop datanode"
clush -g jn -b "/usr/local/hadoop/bin/hdfs --daemon stop journalnode"
clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon stop zkfc"
clush -g zk -b "/usr/local/zookeeper/bin/zkServer.sh stop"
clush -g all -b "jps | grep -v Jps; echo;"

To start the entire cluster:

clush -g zk -b "/usr/local/zookeeper/bin/zkServer.sh start"


clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon start zkfc"
clush -g jn -b "/usr/local/hadoop/bin/hdfs --daemon start journalnode"
clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon start namenode"
clush -g dn -b "/usr/local/hadoop/bin/hdfs --daemon start datanode"
clush -g rm -b "/usr/local/hadoop/bin/yarn --daemon start resourcemanager"
clush -g dn -b "/usr/local/hadoop/bin/yarn --daemon start nodemanager"
clush -w 192.168.56.181 -b "/usr/local/hadoop/bin/mapred --daemon start historyserver"
clush -g all -b "jps | grep -v Jps; echo;"

NameNode UI: http://192.168.56.181:9870
ResourceManager UI: http://192.168.56.181:8088

For log aggregation, create the remote application log directory (and /tmp) in HDFS:

hdfs dfs -mkdir -p /apps/yarn/logs


hdfs dfs -chmod -R 777 /apps
hdfs dfs -mkdir -p /tmp
hdfs dfs -chmod -R 777 /tmp
hdfs dfs -ls /

Check whether the example jar works:

yarn jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-alpha4.jar

Running a mapreduce program:
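
If /sample.txt does not yet exist in HDFS (an assumption; any small text file will do), stage one first:

echo "hello hadoop hello yarn" > /tmp/sample.txt
hdfs dfs -put -f /tmp/sample.txt /sample.txt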

hdfs dfs -rm -r /out


yarn jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-alpha4.jar wordcount /sample.txt /out

Check for MRAppMaster and YarnChild processes with the jps command below while the job is running. This shows how the RM
launches them to run the job.

clush -g all -b "jps | grep -v Jps; echo;"

Check YARN logs through the UI:

http://192.168.56.181:8088/


If you are accessing the VMs from Windows, add the host details to C:\Windows\System32\drivers\etc\hosts so the
hostnames resolve and the logs display.

Check old YARN job logs through the UI:

http://h3n1.hadoop.com:19888/

This is the Job History Server URL.

Check YARN logs through the command line:

yarn application -list -appStates ALL

yarn logs -applicationId application_1464914540546_0002

Check the tracking URL and view the logs in the browser.

To stop the entire cluster:

clush -w 192.168.56.181 -b "/usr/local/hadoop/bin/mapred --daemon stop historyserver"


clush -g rm -b "/usr/local/hadoop/bin/yarn --daemon stop resourcemanager"
clush -g dn -b "/usr/local/hadoop/bin/yarn --daemon stop nodemanager"
clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon stop namenode"
clush -g dn -b "/usr/local/hadoop/bin/hdfs --daemon stop datanode"
clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon stop zkfc"
clush -g jn -b "/usr/local/hadoop/bin/hdfs --daemon stop journalnode"
clush -g zk -b "/usr/local/zookeeper/bin/zkServer.sh stop"
clush -g all -b "jps | grep -v Jps; echo;"

To start the entire cluster:

clush -g zk -b "/usr/local/zookeeper/bin/zkServer.sh start"


clush -g jn -b "/usr/local/hadoop/bin/hdfs --daemon start journalnode"
clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon start zkfc"
clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon start namenode"
clush -g dn -b "/usr/local/hadoop/bin/hdfs --daemon start datanode"
clush -g rm -b "/usr/local/hadoop/bin/yarn --daemon start resourcemanager"
clush -g dn -b "/usr/local/hadoop/bin/yarn --daemon start nodemanager"
clush -w 192.168.56.181 -b "/usr/local/hadoop/bin/mapred --daemon start historyserver"
clush -g all -b "jps | grep -v Jps; echo;"

14. To delete the entire Hadoop installation:

clush -g all -b "sudo unlink /usr/local/hadoop" > /dev/null 2>&1;


clush -g all -b "sudo rm -rf /usr/local/hadoop*"
clush -g zk -b "sudo unlink /usr/local/zookeeper" > /dev/null 2>&1;
clush -g zk -b "sudo rm -rf /usr/local/zookeeper*"
clush -g nn -b "sudo umount -l /mnt/disk1/nfsedits" > /dev/null 2>&1;
clush -g all -b "sudo rm -rf /mnt/disk1/*"


clush -g all -b "sudo rm -rf /mnt/disk2/*"


clush -g all -b "sudo rm -rf /mnt/disk3/*"
clush -g all -b "sudo ls /mnt/*"
clush -g all -b "sudo sed -i '/JAVA_HOME/,\$d' /etc/profile"
clush -g nn -b "sudo sed -i '/nn:/,\$d' /etc/clustershell/groups.d/local.cfg"

For any questions, email narasimha.v.rao.b@gmail.com.

By: Venkata Narasimha Rao B, Contact: +91 9342707000
