
Hadoop 3.x Installation with HA – Automatic Failover

Hosts Details:

IP Address       FQDN                Hostname   Role in Storage Layer                   Role in Processing Layer
192.168.56.181   h3n1.hadoop.com     h3n1       Namenode, NFS Client, Zookeeper, ZKFC   Resource Manager
192.168.56.182   h3n2.hadoop.com     h3n2       Namenode, NFS Client, Zookeeper, ZKFC   NA
192.168.56.183   h3n3.hadoop.com     h3n3       Namenode, Datanode, Zookeeper           Node Manager
192.168.56.184   h3n4.hadoop.com     h3n4       Namenode, Datanode                      Node Manager
192.168.56.185   h3n5.hadoop.com     h3n5       Datanode                                Node Manager
192.168.56.186   h3edge.hadoop.com   h3edge     Edge Node, NFS Server                   Eco System Tools

TAR balls to be downloaded for this installation:

TAR Ball Name                Download Location                                                   TAR Ball location in VM
hadoop-3.0.0-alpha4.tar.gz   https://archive.apache.org/dist/hadoop/core/hadoop-3.0.0-alpha4/   /var/www/html/hadoop_tools/

Note: zookeeper-3.4.6.tar.gz (used in the ZooKeeper setup step below) should also be placed under /var/www/html/hadoop_tools/.

Clone the VMs and change the IP addresses as above.

1. Setup Password-less SSH for hadoop cluster installation

NOTE: This step should be followed on all the masters (Active NN, Standby NN, RM, etc.).

rm -rf ~/.ssh/id_rsa*
ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
ls -ltr ~/.ssh
for i in 192.168.56.{181,182,183,184,185,186}; do sshpass -p welcome1 ssh-copy-id $i; done
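
To confirm that password-less SSH works, a quick check (assumption: the same host list as above):

for i in 192.168.56.{181,182,183,184,185,186}; do ssh -o BatchMode=yes $i hostname; done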

2. Add the below on any one of the master hosts

sudo vi /etc/clustershell/groups.d/local.cfg

nn: 192.168.56.181 192.168.56.182 192.168.56.183 192.168.56.184


jn: 192.168.56.181 192.168.56.182 192.168.56.183
dn: 192.168.56.183 192.168.56.184 192.168.56.185
zk: 192.168.56.181 192.168.56.182 192.168.56.183
rm: 192.168.56.181 192.168.56.182 192.168.56.183
hadoop: 192.168.56.181 192.168.56.182 192.168.56.183 192.168.56.184 192.168.56.185
all: 192.168.56.181 192.168.56.182 192.168.56.183 192.168.56.184 192.168.56.185 192.168.56.186

sudo sshpass -p "welcome1" scp /etc/clustershell/groups.d/local.cfg 192.168.56.182:/etc/clustershell/groups.d/local.cfg
sudo sshpass -p "welcome1" scp /etc/clustershell/groups.d/local.cfg 192.168.56.183:/etc/clustershell/groups.d/local.cfg
sudo sshpass -p "welcome1" scp /etc/clustershell/groups.d/local.cfg 192.168.56.184:/etc/clustershell/groups.d/local.cfg


Verify that clush can reach all the hosts:

clush -g all -b "date"

3. Configure NTPD Service.

clush -g all -b "sudo sed -i 's/^server /#server /g' /etc/ntp.conf"


clush -g all -x 192.168.56.181 -b "echo 'server 192.168.56.181 prefer' | sudo tee -a /etc/ntp.conf > /dev/null 2>&1"

If you don't have internet access to your hosts:

clush -w 192.168.56.181 -b "echo 'server 127.127.1.0' | sudo tee -a /etc/ntp.conf > /dev/null 2>&1"
clush -w 192.168.56.181 -b "echo 'fudge 127.127.1.0 stratum 10' | sudo tee -a /etc/ntp.conf > /dev/null 2>&1"

Restart NTPD Service & Sync Time:

clush -g all -b "sudo systemctl restart ntpd"


clush -g all -x 192.168.56.181 -b "/usr/sbin/ntpdate -d 192.168.56.181"
clush -g all -x 192.168.56.181 -b "/usr/sbin/ntpq -p"

clush -g all -b "date"

4. Download the Hadoop 3.x tarball

Download hadoop-3.0.0-alpha4.tar.gz from the internet and untar it as below.

http://mirror.fibergrid.in/apache/hadoop/common/

From the clush node:

clush -g all -b "sudo unlink /usr/local/hadoop" > /dev/null 2>&1;


clush -g all -b "sudo rm -rf /usr/local/hadoop-3.0.0-alpha4"
clush -g all -b "sudo tar -xvzf /var/www/html/hadoop_tools/hadoop-3.0.0-alpha4.tar.gz -C /usr/local/"
clush -g all -b "du -sch /usr/local/hadoop-3.0.0-alpha4"
clush -g all -b "sudo ln -s /usr/local/hadoop-3.0.0-alpha4 /usr/local/hadoop"
clush -g all -b "sudo chown -R hdpuser:hdpadmin /usr/local/hadoop*"
clush -g all -b "ls -ltr /usr/local | grep -i hadoop"

5. Setup HOME paths

sudo vi /etc/profile

Copy the below to the end of the file.

export JAVA_HOME=/usr/java/default
export ZOOKEEPER_HOME=/usr/local/zookeeper
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_HOME_WARN_SUPPRESS=1
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_ROOT_LOGGER="WARN,DRFA"


export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native
export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native:$JAVA_LIBRARY_PATH
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
export YARN_HOME=$HADOOP_HOME
export YARN_HOME_WARN_SUPPRESS=1
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME_WARN_SUPPRESS=1
export HADOOP_COMMON_HOME=$HADOOP_HOME

PATH=$PATH:$HOME/bin:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:${ZOOKEEPER_HOME}/bin:${JAVA_HOME}/bin
export PATH

Copy the /etc/profile file to all the other nodes from h3n1:

clush -g all -x 192.168.56.181 --copy /etc/profile --dest /tmp/


clush -g all -x 192.168.56.181 "sudo cp /tmp/profile /etc/"

clush -g all -b "source /etc/profile"

Add the above export commands to the env files as well (excluding PATH):

sudo vi /usr/local/hadoop/etc/hadoop/hadoop-env.sh
sudo vi /usr/local/hadoop/etc/hadoop/yarn-env.sh
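
A quick sanity check on h3n1 (a sketch; it assumes the profile and env edits above are in place):

source /etc/profile
hadoop version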

6. Change the XML files as below

sudo vi /usr/local/hadoop/etc/hadoop/core-site.xml

<property>
<name>fs.defaultFS</name>
<value>hdfs://mycluster</value>
</property>

<property>
<name>ha.zookeeper.quorum</name>
<value>h3n1.hadoop.com:2181,h3n2.hadoop.com:2181,h3n3.hadoop.com:2181</value>
</property>

<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://h3n1.hadoop.com:8485;h3n2.hadoop.com:8485;h3n3.hadoop.com:8485/mycluster</value>
</property>

<property>

<name>topology.script.file.name</name>
<value>/usr/local/hadoop/etc/hadoop/topology.sh</value>
</property>

<property>
<name>fs.trash.interval</name>
<value>360</value>
</property>

<property>
<name>fs.trash.checkpoint.interval</name>
<value>2</value>
</property>

sudo vi /usr/local/hadoop/etc/hadoop/hdfs-site.xml

<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
</property>

<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>h3n1,h3n2,h3n3,h3n4</value>
</property>

<property>
<name>dfs.namenode.rpc-address.mycluster.h3n1</name>
<value>h3n1.hadoop.com:9000</value>
</property>

<property>
<name>dfs.namenode.rpc-address.mycluster.h3n2</name>
<value>h3n2.hadoop.com:9000</value>
</property>

<property>
<name>dfs.namenode.rpc-address.mycluster.h3n3</name>
<value>h3n3.hadoop.com:9000</value>
</property>

<property>
<name>dfs.namenode.rpc-address.mycluster.h3n4</name>
<value>h3n4.hadoop.com:9000</value>
</property>

<property>
<name>dfs.namenode.http-address.mycluster.h3n1</name>
<value>h3n1.hadoop.com:9870</value>
</property>


<property>
<name>dfs.namenode.http-address.mycluster.h3n2</name>
<value>h3n2.hadoop.com:9870</value>
</property>

<property>
<name>dfs.namenode.http-address.mycluster.h3n3</name>
<value>h3n3.hadoop.com:9870</value>
</property>

<property>
<name>dfs.namenode.http-address.mycluster.h3n4</name>
<value>h3n4.hadoop.com:9870</value>
</property>

<property>
<name>dfs.replication</name>
<value>3</value>
</property>

<property>
<name>dfs.block.size</name>
<value>268435456</value>
</property>

<property>
<name>dfs.namenode.name.dir</name>
<value>file:/mnt/disk1/name</value>
</property>

<property>
<name>dfs.datanode.data.dir</name>
<value>file:/mnt/disk1/data,file:/mnt/disk2/data,file:/mnt/disk3/data</value>
</property>

<property>
<name>dfs.journalnode.edits.dir</name>
<value>/mnt/disk1/jnedits</value>
</property>

<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>

<property>
<name>dfs.client.failover.proxy.provider.mycluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>


<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence
shell(/bin/true)
</value>
</property>

<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/hdpuser/.ssh/id_rsa</value>
</property>

<property>
<name>dfs.namenode.checkpoint.dir</name>
<value>/mnt/disk1/snn</value>
</property>

<property>
<name>dfs.namenode.checkpoint.edits.dir</name>
<value>/mnt/disk1/snn</value>
</property>

<property>
<name>dfs.namenode.checkpoint.period</name>
<value>3600</value>
</property>

<property>
<name>dfs.ha.log-roll.period</name>
<value>600</value>
</property>

<property>
<name>dfs.namenode.acls.enabled</name>
<value>true</value>
</property>

sudo vi /usr/local/hadoop/etc/hadoop/workers

(Note: in Hadoop 3.x the workers file replaces the slaves file used in Hadoop 2.x.)

192.168.56.183
192.168.56.184
192.168.56.185

Create directories on the namenodes, journalnodes and datanodes. From the clush node:

clush -g nn -b "sudo mkdir -p /mnt/disk1/name"


clush -g jn -b "sudo mkdir -p /mnt/disk1/jnedits"
clush -g nn -b "sudo mkdir -p /mnt/disk1/snn"
clush -g dn -b "sudo mkdir -p /mnt/disk1/data"
clush -g dn -b "sudo mkdir -p /mnt/disk2/data"
clush -g dn -b "sudo mkdir -p /mnt/disk3/data"
clush -g all -b "sudo chown -R hdpuser:hdpadmin /mnt"

Copy all the XMLs to other nodes.

clush -g all -x 192.168.56.181 "sudo rm -rf /tmp/hadoop"


clush -g all -x 192.168.56.181 --copy /usr/local/hadoop/etc/hadoop --dest /tmp/
clush -g all -x 192.168.56.181 "sudo cp -r /tmp/hadoop /usr/local/hadoop/etc/"
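
Optionally, confirm the HA settings are picked up (a sketch; run on h3n1 after the copy above):

hdfs getconf -confKey dfs.nameservices
hdfs getconf -confKey dfs.ha.namenodes.mycluster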

7. Create log directories to store the logs.

clush -g all -b "sudo mkdir /usr/local/hadoop/logs"


clush -g all -b "sudo chmod 777 -R /usr/local/hadoop/logs"
clush -g all -b "sudo chown hdpuser:hdpadmin -R /usr/local/hadoop/logs"

8. Setting up ZooKeeper

Run the below from the master host:

clush -g zk -b "date"

clush -g zk -b "sudo unlink /usr/local/zookeeper" > /dev/null 2>&1;


clush -g zk -b "sudo rm -rf /usr/local/zookeeper-3.4.6"
clush -g zk -b "sudo tar -xvzf /var/www/html/hadoop_tools/zookeeper-3.4.6.tar.gz -C /usr/local/"
clush -g zk -b "du -sch /usr/local/zookeeper-3.4.6"
clush -g zk -b "sudo ln -s /usr/local/zookeeper-3.4.6 /usr/local/zookeeper"
clush -g zk -b "sudo chown -R hdpuser:hdpadmin /usr/local/zookeeper*"
clush -g zk -b "ls -ltr /usr/local/ | grep -i zookeeper"

Change the zookeeper configuration file

sudo cp /usr/local/zookeeper/conf/zoo_sample.cfg /usr/local/zookeeper/conf/zoo.cfg


sudo sed -i 's/^dataDir/#dataDir/g' /usr/local/zookeeper/conf/zoo.cfg
sudo vi /usr/local/zookeeper/conf/zoo.cfg

Comment out the dataDir property (the sed above does this) and add the below at the end of the file:

dataDir=/mnt/disk1/zkdata
server.1=h3n1.hadoop.com:2888:3888
server.2=h3n2.hadoop.com:2888:3888
server.3=h3n3.hadoop.com:2888:3888

Copy the ZooKeeper conf folder to all the other hosts:

clush -g zk -x 192.168.56.181 "sudo rm -rf /tmp/conf "


clush -g zk -x 192.168.56.181 --copy /usr/local/zookeeper/conf --dest /tmp/
clush -g zk -x 192.168.56.181 "sudo cp -r /tmp/conf /usr/local/zookeeper/"

clush -g zk -b "echo 'ZOO_LOG_DIR=/usr/local/zookeeper/logs' | sudo tee -a /usr/local/zookeeper/bin/zkEnv.sh > /dev/null"
clush -g zk -b "sudo mkdir -p /mnt/disk1/zkdata"
clush -g zk -b "sudo chown -R hdpuser:hdpadmin /mnt/disk1/zkdata"
clush -g zk -b "sudo chown -R hdpuser:hdpadmin /usr/local/zookeeper*"
clush -g zk -b "sudo touch /mnt/disk1/zkdata/myid"
clush -w 192.168.56.181 -b "echo 1 | sudo tee /mnt/disk1/zkdata/myid > /dev/null"
clush -w 192.168.56.182 -b "echo 2 | sudo tee /mnt/disk1/zkdata/myid > /dev/null"
clush -w 192.168.56.183 -b "echo 3 | sudo tee /mnt/disk1/zkdata/myid > /dev/null"
clush -g zk -b "cat /mnt/disk1/zkdata/myid"

The ZooKeeper myid file should show as below:

h3n1 - 1
h3n2 - 2
h3n3 - 3

Start ZooKeeper on all the nodes.

clush -g zk -b "/usr/local/zookeeper/bin/zkServer.sh start"

To check whether ZooKeeper is working fine:

clush -g zk -b "/usr/local/zookeeper/bin/zkServer.sh status"

or on each node, using its hostname:

zkCli.sh -server h3n1.hadoop.com:2181
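
A quick non-interactive check (a sketch, assuming the quorum is already up) that the ensemble answers:

/usr/local/zookeeper/bin/zkCli.sh -server h3n1.hadoop.com:2181 ls /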

clush -g all -b "jps | grep -v Jps; echo;"

9. Setup Rack Topology

Rack Awareness: Create the topology.sh file as below.

sudo vi /usr/local/hadoop/etc/hadoop/topology.sh

#==================================
#!/bin/bash
# Looks up each host argument in topology.data and prints its rack.
while [ $# -gt 0 ] ; do
  nodeArg=$1
  exec< /usr/local/hadoop/etc/hadoop/topology.data
  result=""
  while read line ; do
    ar=( $line )
    if [ "${ar[0]}" = "$nodeArg" ] ; then
      result="${ar[1]}"
    fi
  done
  shift
  if [ -z "$result" ] ; then
    # trailing space keeps multiple arguments separated in the output
    echo -n "/default "
  else
    echo -n "$result "
  fi
done
#==================================

sudo chmod 755 /usr/local/hadoop/etc/hadoop/topology.sh

Create topology.data file as below.

sudo vi /usr/local/hadoop/etc/hadoop/topology.data

192.168.56.181 /rack1
192.168.56.182 /rack2
192.168.56.183 /rack1
192.168.56.184 /rack2
192.168.56.185 /rack2
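
A quick local test of the script (a sketch; the expected output follows from the data file above):

/usr/local/hadoop/etc/hadoop/topology.sh 192.168.56.183 192.168.56.185
# should print: /rack1 /rack2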

Once HDFS is up (step 10 below), verify the rack mapping:

hdfs dfsadmin -printTopology

10. Start Hadoop Daemons

Format the ZKFC znode in ZooKeeper and start the ZKFC daemons (formatZK only needs to succeed once for the cluster):

clush -g nn -b "/usr/local/hadoop/bin/hdfs zkfc -formatZK -force"


clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon start zkfc"

Start Journal Nodes:

clush -g jn -b "/usr/local/hadoop/bin/hdfs --daemon start journalnode"

Only for first-time activities:

On the active NN:

hdfs namenode -format


hdfs --daemon start namenode

On all standby NNs:

hdfs namenode -bootstrapStandby


hdfs --daemon start namenode

clush -g all -b "jps | grep -v Jps; echo;"

Check the below folders:

clush -g nn -b "ls /mnt/disk1/name/current/"


clush -g jn -b "ls /mnt/disk1/jnedits/mycluster/"

Start DataNodes

clush -g dn -b "/usr/local/hadoop/bin/hdfs --daemon start datanode"

clush -g all -b "jps | grep -v Jps; echo;"

Check the status of each NameNode:

hdfs haadmin -getServiceState h3n1


hdfs haadmin -getServiceState h3n2
hdfs haadmin -getServiceState h3n3
hdfs haadmin -getServiceState h3n4

Fail over to another node.

hdfs haadmin -failover h3n1 h3n2


hdfs haadmin -failover h3n2 h3n3
hdfs haadmin -failover h3n3 h3n4
hdfs haadmin -failover h3n4 h3n1
hdfs haadmin -getServiceState h3n1
hdfs haadmin -getServiceState h3n2
hdfs haadmin -getServiceState h3n3
hdfs haadmin -getServiceState h3n4
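
The same status check as a one-liner (a convenience sketch):

for nn in h3n1 h3n2 h3n3 h3n4; do echo -n "$nn: "; hdfs haadmin -getServiceState $nn; done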

Check the below folders:

clush -g nn -b "ls /mnt/disk1/name/current/"


clush -g jn -b "ls /mnt/disk1/jnedits/mycluster/current/"
clush -g dn -b "ls /mnt/disk1/data/current/"
clush -g dn -b "ls /mnt/disk2/data/current/"
clush -g dn -b "ls /mnt/disk3/data/current/"

To save the namespace:

hdfs dfsadmin -safemode enter


hdfs dfsadmin -saveNamespace
hdfs dfsadmin -safemode leave

11. Start the Hadoop Storage Layer


Stop any services that were started earlier:

clush -g dn -b "/usr/local/hadoop/bin/hdfs --daemon stop datanode"


clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon stop namenode"
clush -g jn -b "/usr/local/hadoop/bin/hdfs --daemon stop journalnode"
clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon stop zkfc"
clush -g zk -b "/usr/local/zookeeper/bin/zkServer.sh stop"
clush -g all -b "jps | grep -v Jps; echo;"

To start the entire cluster:

clush -g zk -b "/usr/local/zookeeper/bin/zkServer.sh start"


clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon start zkfc"
clush -g jn -b "/usr/local/hadoop/bin/hdfs --daemon start journalnode"
clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon start namenode"
clush -g dn -b "/usr/local/hadoop/bin/hdfs --daemon start datanode"
clush -g all -b "jps | grep -v Jps; echo;"

To see the fsimage & edits files

clush -g nn -b "ls /mnt/disk1/name/current/"


clush -g jn -b "ls /mnt/disk1/jnedits/mycluster/current/"

seen_txid: This contains the transaction ID of the last checkpoint (merge of edits into an fsimage) or edit log roll
(finalization of the current edits_inprogress and creation of a new one). The file is not updated on every transaction,
only on a checkpoint or an edit log roll.

committed-txid: Tracks the last transaction ID committed by a NameNode.

last-promised-epoch: When a NN becomes active, it increments the last-promised-epoch. While writing edits to the edit
log, the NN sends this epoch to the JNs to confirm the latest active NN. Edits from a previous active NN are discarded.

last-writer-epoch: This contains the epoch number associated with the NN that last actually wrote a transaction.
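
To peek at these files in this setup (a sketch; paths per the directories configured above):

clush -g nn -b "cat /mnt/disk1/name/current/seen_txid"
clush -g jn -b "cat /mnt/disk1/jnedits/mycluster/current/committed-txid"
clush -g jn -b "cat /mnt/disk1/jnedits/mycluster/current/last-promised-epoch"
clush -g jn -b "cat /mnt/disk1/jnedits/mycluster/current/last-writer-epoch"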

Command to roll the edits manually:

hdfs dfsadmin -rollEdits

12. Some interesting points about the storage layer

Check the default Hadoop values in 3.x:

sudo jar -tf /usr/local/hadoop/share/hadoop/common/hadoop-common-3.0.0-alpha4.jar | grep core-


sudo jar -tf /usr/local/hadoop/share/hadoop/hdfs/hadoop-hdfs-3.0.0-alpha4.jar | grep hdfs-

sudo jar -xf /usr/local/hadoop/share/hadoop/common/hadoop-common-3.0.0-alpha4.jar core-default.xml
sudo jar -xf /usr/local/hadoop/share/hadoop/hdfs/hadoop-hdfs-3.0.0-alpha4.jar hdfs-default.xml
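
For example, to look up a single default from the extracted file (assumption: it was extracted into the current directory):

grep -A1 "<name>dfs.blocksize</name>" hdfs-default.xml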


13. To start YARN

clush -g rm -b "date"

On h3n1:

sudo vi /usr/local/hadoop/etc/hadoop/yarn-site.xml

<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>

<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>

<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>

<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>mycluster</value>
</property>

<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>h3n1,h3n2,h3n3</value>
</property>

<property>
<name>yarn.resourcemanager.hostname.h3n1</name>
<value>h3n1.hadoop.com</value>
</property>

<property>
<name>yarn.resourcemanager.hostname.h3n2</name>
<value>h3n2.hadoop.com</value>
</property>

<property>
<name>yarn.resourcemanager.hostname.h3n3</name>
<value>h3n3.hadoop.com</value>
</property>

<property>
<name>yarn.resourcemanager.webapp.address.h3n1</name>
<value>h3n1.hadoop.com:8088</value>

</property>

<property>
<name>yarn.resourcemanager.webapp.address.h3n2</name>
<value>h3n2.hadoop.com:8088</value>
</property>

<property>
<name>yarn.resourcemanager.webapp.address.h3n3</name>
<value>h3n3.hadoop.com:8088</value>
</property>

<property>
<name>yarn.resourcemanager.zk-address</name>
<value>h3n1.hadoop.com:2181,h3n2.hadoop.com:2181,h3n3.hadoop.com:2181</value>
</property>

<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>

<property>
<name>yarn.client.failover-proxy-provider</name>
<value>org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider</value>
</property>

<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>

<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>

<property>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>/apps/yarn/logs</value>
</property>

<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>1296000</value>
</property>

<property>
<name>yarn.nodemanager.env-whitelist</name>
<value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>

sudo cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml
(If the .template file is not present, as Hadoop 3.x ships mapred-site.xml directly, skip the copy and just edit the file.)
sudo vi /usr/local/hadoop/etc/hadoop/mapred-site.xml

Add the below between <configuration> and </configuration>:

<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>

<property>
<name>mapreduce.jobhistory.address</name>
<value>h3n1.hadoop.com:10020</value>
</property>

<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>h3n1.hadoop.com:19888</value>
</property>

Copy all the XMLs to other nodes.

clush -g all -x 192.168.56.181 "sudo rm -rf /tmp/hadoop"


clush -g all -x 192.168.56.181 --copy /usr/local/hadoop/etc/hadoop --dest /tmp/
clush -g all -x 192.168.56.181 "sudo cp -r /tmp/hadoop /usr/local/hadoop/etc/"

Start YARN daemons:

clush -g rm -b "/usr/local/hadoop/bin/yarn --daemon stop resourcemanager"


clush -g dn -b "/usr/local/hadoop/bin/yarn --daemon stop nodemanager"
clush -g rm -b "/usr/local/hadoop/bin/yarn --daemon start resourcemanager"
clush -g dn -b "/usr/local/hadoop/bin/yarn --daemon start nodemanager"
clush -g all -b "jps | grep -v Jps; echo;"

On h3n1:
mapred --daemon stop historyserver
mapred --daemon start historyserver

yarn rmadmin -getServiceState h3n1


yarn rmadmin -getServiceState h3n2
yarn rmadmin -getServiceState h3n3

To stop the entire cluster:

clush -w 192.168.56.181 -b "/usr/local/hadoop/bin/mapred --daemon stop historyserver"


clush -g rm -b "/usr/local/hadoop/bin/yarn --daemon stop resourcemanager"
clush -g dn -b "/usr/local/hadoop/bin/yarn --daemon stop nodemanager"


clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon stop namenode"


clush -g dn -b "/usr/local/hadoop/bin/hdfs --daemon stop datanode"
clush -g jn -b "/usr/local/hadoop/bin/hdfs --daemon stop journalnode"
clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon stop zkfc"
clush -g zk -b "/usr/local/zookeeper/bin/zkServer.sh stop"
clush -g all -b "jps | grep -v Jps; echo;"

To start the entire cluster:

clush -g zk -b "/usr/local/zookeeper/bin/zkServer.sh start"


clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon start zkfc"
clush -g jn -b "/usr/local/hadoop/bin/hdfs --daemon start journalnode"
clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon start namenode"
clush -g dn -b "/usr/local/hadoop/bin/hdfs --daemon start datanode"
clush -g rm -b "/usr/local/hadoop/bin/yarn --daemon start resourcemanager"
clush -g dn -b "/usr/local/hadoop/bin/yarn --daemon start nodemanager"
clush -w 192.168.56.181 -b "/usr/local/hadoop/bin/mapred --daemon start historyserver"
clush -g all -b "jps | grep -v Jps; echo;"

NameNode UI: http://192.168.56.181:9870
ResourceManager UI: http://192.168.56.181:8088

For log aggregation, create the remote application log directory (and /tmp) in HDFS:

hdfs dfs -mkdir -p /apps/yarn/logs


hdfs dfs -chmod -R 777 /apps
hdfs dfs -mkdir -p /tmp
hdfs dfs -chmod -R 777 /tmp
hdfs dfs -ls /

Check whether the example jar works:

yarn jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-alpha4.jar

Running a mapreduce program:
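
If /sample.txt does not yet exist in HDFS (an assumption; any small text file will do), stage one first:

echo "hello hadoop hello yarn" > /tmp/sample.txt
hdfs dfs -put -f /tmp/sample.txt /sample.txt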

hdfs dfs -rm -r /out


yarn jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-alpha4.jar wordcount /sample.txt /out

Check for MRAppMaster and YarnChild processes with the jps command below while the job is running. This shows how the RM
launches them to run the job.

clush -g all -b "jps | grep -v Jps; echo;"

Check YARN logs through the UI:

http://192.168.56.181:8088/


If you are accessing the VMs from Windows, add the host details to C:\Windows\System32\drivers\etc\hosts so the
hostnames resolve and the logs display.

Check old YARN job logs through the UI:

http://h3n1.hadoop.com:19888/

This is the Job History Server URL.

Check YARN logs through the command line:

yarn application -list -appStates ALL

yarn logs -applicationId application_1464914540546_0002

Check the tracking URL and view the logs in the browser.

To stop the entire cluster:

clush -w 192.168.56.181 -b "/usr/local/hadoop/bin/mapred --daemon stop historyserver"


clush -g rm -b "/usr/local/hadoop/bin/yarn --daemon stop resourcemanager"
clush -g dn -b "/usr/local/hadoop/bin/yarn --daemon stop nodemanager"
clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon stop namenode"
clush -g dn -b "/usr/local/hadoop/bin/hdfs --daemon stop datanode"
clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon stop zkfc"
clush -g jn -b "/usr/local/hadoop/bin/hdfs --daemon stop journalnode"
clush -g zk -b "/usr/local/zookeeper/bin/zkServer.sh stop"
clush -g all -b "jps | grep -v Jps; echo;"

To start the entire cluster:

clush -g zk -b "/usr/local/zookeeper/bin/zkServer.sh start"


clush -g jn -b "/usr/local/hadoop/bin/hdfs --daemon start journalnode"
clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon start zkfc"
clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon start namenode"
clush -g dn -b "/usr/local/hadoop/bin/hdfs --daemon start datanode"
clush -g rm -b "/usr/local/hadoop/bin/yarn --daemon start resourcemanager"
clush -g dn -b "/usr/local/hadoop/bin/yarn --daemon start nodemanager"
clush -w 192.168.56.181 -b "/usr/local/hadoop/bin/mapred --daemon start historyserver"
clush -g all -b "jps | grep -v Jps; echo;"

14. To delete the entire Hadoop installation:

clush -g all -b "sudo unlink /usr/local/hadoop" > /dev/null 2>&1;


clush -g all -b "sudo rm -rf /usr/local/hadoop*"
clush -g zk -b "sudo unlink /usr/local/zookeeper" > /dev/null 2>&1;
clush -g zk -b "sudo rm -rf /usr/local/zookeeper*"
clush -g nn -b "sudo umount -l /mnt/disk1/nfsedits" > /dev/null 2>&1;
clush -g all -b "sudo rm -rf /mnt/disk1/*"


clush -g all -b "sudo rm -rf /mnt/disk2/*"


clush -g all -b "sudo rm -rf /mnt/disk3/*"
clush -g all -b "sudo ls /mnt/*"
clush -g all -b "sudo sed -i '/JAVA_HOME/,\$d' /etc/profile"
clush -g nn -b "sudo sed -i '/nn:/,\$d' /etc/clustershell/groups.d/local.cfg"

For any questions, email narasimha.v.rao.b@gmail.com.

By: Venkata Narasimha Rao B, Contact: +91 9342707000
