Quick Apache Hadoop Admin Command Reference Examples
by KARTHIKEYAN SADHASIVAM on FEBRUARY 18, 2015

If you are working on Hadoop, you'll realize there are several shell commands available to manage your Hadoop cluster.
This article provides a quick, handy reference to the Hadoop administration commands.
If you are new to big data, read the introduction to Hadoop article to understand the basics.

1. Hadoop Namenode Commands


Command                           Description
hadoop namenode -format           Format HDFS filesystem from Namenode
hadoop namenode -upgrade          Upgrade the NameNode
start-dfs.sh                      Start HDFS Daemons
stop-dfs.sh                       Stop HDFS Daemons
start-mapred.sh                   Start MapReduce Daemons
stop-mapred.sh                    Stop MapReduce Daemons
hadoop namenode -recover -force   Recover namenode metadata after a cluster failure (may lose data)

2. Hadoop fsck Commands


Command                                          Description
hadoop fsck /                                    Filesystem check on HDFS
hadoop fsck / -files                             Display files during check
hadoop fsck / -files -blocks                     Display files and blocks during check
hadoop fsck / -files -blocks -locations          Display files, blocks and their locations during check
hadoop fsck / -files -blocks -locations -racks   Display network topology for data-node locations
hadoop fsck -delete                              Delete corrupted files
hadoop fsck -move                                Move corrupted files to /lost+found directory
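
The filesystem check above is easy to wrap in a monitoring script. The following is only a minimal sketch: the function name hdfs_is_healthy is made up for illustration, and it assumes the fsck report ends with a summary line containing "is HEALTHY" when no problems were found.

```shell
# Hypothetical helper (not part of Hadoop): returns 0 when fsck reports
# the given HDFS path as HEALTHY, non-zero otherwise.
# Assumption: the fsck summary line looks like
#   The filesystem under path '/' is HEALTHY
hdfs_is_healthy() {
    # Default to the HDFS root when no path is given.
    hadoop fsck "${1:-/}" 2>/dev/null | grep -q "is HEALTHY"
}

# Possible usage:
#   hdfs_is_healthy /user/hduser || echo "HDFS needs attention"
```

Only the exit status is used here, so the function fits into an if statement or a cron job without any output parsing by the caller.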

3. Hadoop Job Commands


Command                                        Description
hadoop job -submit <jobfile>                   Submit the job
hadoop job -status <jobid>                     Print job status completion percentage
hadoop job -list all                           List all jobs
hadoop job -list-active-trackers               List all available TaskTrackers
hadoop job -set-priority <job-id> <priority>   Set priority for a job. Valid priorities:
                                               VERY_HIGH, HIGH, NORMAL, LOW, VERY_LOW
hadoop job -kill-task <task-id>                Kill a task
hadoop job -history                            Display job history including job details, failed and killed jobs

4. Hadoop dfsadmin Commands

Command                                      Description
hadoop dfsadmin -report                      Report filesystem info and statistics
hadoop dfsadmin -metasave file.txt           Save Namenode's primary data structures to file.txt
hadoop dfsadmin -setQuota 10 /quotatest      Limit /quotatest to 10 names (files and directories)
hadoop dfsadmin -clrQuota /quotatest         Clear the name quota on /quotatest
hadoop dfsadmin -refreshNodes                Read hosts and exclude files to update the datanodes that
                                             are allowed to connect to the Namenode. Mostly used to
                                             commission or decommission nodes
hadoop fs -count -q /mydir                   Check quota usage on directory /mydir
hadoop dfsadmin -setSpaceQuota 100M /mydir   Set the space quota to 100M on the HDFS directory /mydir
hadoop dfsadmin -clrSpaceQuota /mydir        Clear the space quota on an HDFS directory
hadoop dfsadmin -saveNamespace               Back up metadata (fsimage & edits). Put cluster in
                                             safe mode before this command.
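
The quota commands are usually run together: set a name quota, set a space quota, then confirm both. The following sketch combines the three steps. The function name apply_quotas and its argument order are assumptions made for this example, not part of Hadoop.

```shell
# Hypothetical helper (not part of Hadoop).
# Arguments: $1 = HDFS directory, $2 = name quota, $3 = space quota (e.g. 100M)
apply_quotas() {
    # Limit the number of file and directory names under the directory.
    hadoop dfsadmin -setQuota "$2" "$1" || return 1
    # Limit the space used under the directory.
    hadoop dfsadmin -setSpaceQuota "$3" "$1" || return 1
    # Print the quotas and current usage so the result can be checked.
    hadoop fs -count -q "$1"
}

# Possible usage:
#   apply_quotas /mydir 10 100M
```

The function stops at the first failing step, so the final count is only printed when both quotas were accepted.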

5. Hadoop Safe Mode (Maintenance Mode) Commands


The following dfsadmin commands make the cluster enter or leave safe mode, which
is also called maintenance mode. In this mode, the Namenode does not accept any
changes to the namespace, and it does not replicate or delete blocks.

Command                           Description
hadoop dfsadmin -safemode enter   Enter safe mode
hadoop dfsadmin -safemode leave   Leave safe mode
hadoop dfsadmin -safemode get     Get the safe mode status
hadoop dfsadmin -safemode wait    Wait until HDFS finishes data block replication
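
Safe mode is typically combined with the metadata backup described in section 4, which should only run while the cluster is in safe mode. The following is a rough sketch of that sequence. The function name backup_namespace is made up for this example, and leaving safe mode automatically afterwards is a design choice that may not suit every cluster.

```shell
# Hypothetical helper (not part of Hadoop): enter safe mode, save the
# namespace, then leave safe mode again.
backup_namespace() {
    # Stop here if the cluster cannot be put into safe mode.
    hadoop dfsadmin -safemode enter || return 1
    # Remember whether the save worked, but do not stop yet.
    status=0
    hadoop dfsadmin -saveNamespace || status=$?
    # Always leave safe mode, even when the save failed.
    hadoop dfsadmin -safemode leave
    return "$status"
}

# Possible usage:
#   backup_namespace && echo "metadata saved"
```

Leaving safe mode even after a failed save keeps the cluster from staying read-only by accident; the function still reports the failure through its exit status.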

6. Hadoop Configuration Files


File                Description
hadoop-env.sh       Sets ENV variables for Hadoop
core-site.xml       Parameters for entire Hadoop cluster
hdfs-site.xml       Parameters for HDFS and its clients
mapred-site.xml     Parameters for MapReduce and its clients
masters             Host machines for secondary Namenode
slaves              List of slave hosts

7. Hadoop mradmin Commands


Command                                       Description
hadoop mradmin -safemode get                  Check Job tracker status
hadoop mradmin -refreshQueues                 Reload mapreduce configuration
hadoop mradmin -refreshNodes                  Reload active TaskTrackers
hadoop mradmin -refreshServiceAcl             Force Jobtracker to reload service ACL
hadoop mradmin -refreshUserToGroupsMappings   Force jobtracker to reload user group mappings

8. Hadoop Balancer Commands


Command                                                    Description
start-balancer.sh                                          Balance the cluster
hadoop dfsadmin -setBalancerBandwidth <bandwidthinbytes>   Adjust bandwidth used by the balancer
hadoop balancer -threshold 20                              Balance until each datanode's disk usage is within
                                                           20% of the cluster average
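
Because -setBalancerBandwidth expects a value in bytes per second, it is easy to be off by a few zeros. The sketch below converts megabytes to bytes first. The function name set_balancer_bandwidth_mb is made up for this example, and one megabyte is assumed to be 1024 * 1024 bytes.

```shell
# Hypothetical helper (not part of Hadoop).
# Argument: $1 = balancer bandwidth in megabytes per second
set_balancer_bandwidth_mb() {
    # Convert MB to bytes (1 MB taken as 1024 * 1024 bytes) before
    # passing the value to dfsadmin.
    hadoop dfsadmin -setBalancerBandwidth $(( $1 * 1024 * 1024 ))
}

# Possible usage (10 MB/s becomes 10485760 bytes/s):
#   set_balancer_bandwidth_mb 10
```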

9. Hadoop Filesystem Commands


Command                         Description
hadoop fs -mkdir mydir          Create a directory (mydir) in HDFS
hadoop fs -ls                   List files and directories in HDFS
hadoop fs -cat myfile           View the contents of a file
hadoop fs -du                   Check disk space usage in HDFS
hadoop fs -expunge              Empty trash on HDFS
hadoop fs -chgrp hadoop file1   Change group membership of a file
hadoop fs -chown huser file1    Change file ownership
hadoop fs -rm file1             Delete a file in HDFS
hadoop fs -touchz file2         Create an empty file
hadoop fs -stat file1           Check the status of a file
hadoop fs -test -e file1        Check if file exists on HDFS
hadoop fs -test -z file1        Check if file is empty on HDFS
hadoop fs -test -d file1        Check if file1 is a directory on HDFS
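
The -test options print nothing; they report the result through the exit status (0 when the check is true), which makes them convenient in scripts. The following sketch combines the three checks. The function name hdfs_path_kind and the words it prints are made up for this example.

```shell
# Hypothetical helper (not part of Hadoop): describe an HDFS path using
# the exit status of "hadoop fs -test".
hdfs_path_kind() {
    if ! hadoop fs -test -e "$1"; then
        echo "missing"        # path does not exist
    elif hadoop fs -test -d "$1"; then
        echo "directory"      # path is a directory
    elif hadoop fs -test -z "$1"; then
        echo "empty file"     # path is a zero-length file
    else
        echo "file"           # path is a non-empty file
    fi
}

# Possible usage:
#   [ "$(hdfs_path_kind file1)" = "missing" ] && hadoop fs -touchz file1
```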

10. Additional Hadoop Filesystem Commands


Command                                           Description
hadoop fs -copyFromLocal <source> <destination>   Copy from local filesystem to HDFS
hadoop fs -copyFromLocal file1 data               Example: copies file1 from the local FS to the data directory in HDFS
hadoop fs -copyToLocal <source> <destination>     Copy from HDFS to local filesystem
hadoop fs -copyToLocal data/file1 /var/tmp        Example: copies file1 from the HDFS data directory to /var/tmp on the local FS
hadoop fs -put <source> <destination>             Copy from the local filesystem to HDFS
hadoop fs -get <source> <destination>             Copy from HDFS to the local filesystem
hadoop distcp hdfs://192.168.0.8:8020/input       Copy data from one cluster to another using the cluster URL
  hdfs://192.168.0.8:8020/output
hadoop fs -moveFromLocal /data/datafile           Move a data file from the local directory to HDFS
  /user/hduser/data
hadoop fs -setrep -w 3 file1                      Set the replication factor for file1 to 3
hadoop fs -getmerge mydir bigfile                 Merge the files in the mydir directory and download them as one big file
