HDFS Command Syntax

HDFS Command Syntax Overview:
hadoop fs -[cmd] [args] : general form of a filesystem command, e.g. hadoop fs -ls
hadoop version : check that Hadoop is installed properly

Help:
help [cmd] : show usage for the given command, or for all commands when none is given

Inspect files:
ls/lsr : list files in a directory (lsr recurses into subdirectories)
cat : print a file on stdout
tail [-f] : output the last part of a file (-f keeps following appended data)
test : check a path (-e exists, -z zero length, -d is a directory); the result is the exit code
touchz : create a new empty file of size 0
du/dus : show space utilization (dus prints a summary)
count : number of directories, files, and bytes under a path
setrep : change the replication factor of a file (-R applies it recursively to a directory)
stat : information about the specified path

Create/remove files:
mkdir : create a directory
mv : move (rename) one or more files
cp : copy one or more files
rm/rmr : remove files (rmr removes recursively)

Copy/put files between a local machine and the Hadoop cluster:
copyFromLocal : copy a local file to HDFS
copyToLocal : copy a file on HDFS to the local disk
get : copy files to the local file system
put : copy files from the local file system
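
As a rough illustration of how the file commands above combine, here is a minimal sketch of a round trip: upload a file, inspect it, download it, and clean up. The helper hfs, the DRY_RUN switch, and both paths are assumptions made for this example, not part of Hadoop. By default the script only prints the commands; set DRY_RUN=0 to run them against a real cluster.

```shell
#!/bin/sh
# Hypothetical helper: print the command by default, run it when DRY_RUN=0.
hfs() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "hadoop fs $*"
  else
    hadoop fs "$@"
  fi
}

# Hypothetical paths, chosen only for illustration.
LOCAL_FILE="report.txt"
HDFS_DIR="/user/demo/input"

hfs -mkdir "$HDFS_DIR"                      # create the target directory (Hadoop 2+ needs -p for missing parents)
hfs -put "$LOCAL_FILE" "$HDFS_DIR"          # upload the local file
hfs -ls "$HDFS_DIR"                         # confirm that it arrived
hfs -cat "$HDFS_DIR/$LOCAL_FILE"            # print its contents
hfs -get "$HDFS_DIR/$LOCAL_FILE" copy.txt   # download it under a new name
hfs -rm "$HDFS_DIR/$LOCAL_FILE"             # remove the HDFS copy
```

Keeping the dry-run default makes it easy to review the exact commands before they touch any data.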

Hadoop Namenode Commands:
hadoop namenode -format : Format the HDFS filesystem from the Namenode
hadoop namenode -upgrade : Upgrade the Namenode
start-dfs.sh : Start HDFS daemons
stop-dfs.sh : Stop HDFS daemons
start-mapred.sh : Start MapReduce daemons
stop-mapred.sh : Stop MapReduce daemons
hadoop namenode -recover -force : Recover Namenode metadata after a cluster failure (may lose data)

Hadoop Configuration Files:
core-site.xml : Parameters for the entire Hadoop cluster
hdfs-site.xml : Parameters for HDFS and its clients
mapred-site.xml : Parameters for MapReduce and its clients
yarn-site.xml : Parameters for the NodeManager and ResourceManager
masters : Host machines for the secondary Namenode
slaves : List of slave hosts
hadoop-env.sh : Sets environment variables for Hadoop
Example settings (Windows cmd syntax):
set JAVA_HOME=%JAVA_HOME%
set HADOOP_PREFIX=D:\Hadoop
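
To give a feel for what one of these files contains, the sketch below writes a minimal core-site.xml into a scratch directory. The property name fs.defaultFS is the Hadoop 2+ name for the default filesystem (Hadoop 1.x used fs.default.name); the value hdfs://localhost:9000 and the scratch location are assumptions for illustration only, so adjust them for a real cluster.

```shell
#!/bin/sh
# Write the example into a temporary directory, not a live Hadoop conf dir.
conf_dir="$(mktemp -d)"

cat > "$conf_dir/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- Default filesystem URI; host and port are assumed values. -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF

echo "wrote $conf_dir/core-site.xml"
```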

Hadoop Job Commands:
hadoop job -submit : Submit the job
hadoop job -status : Print job status and completion percentage
hadoop job -list all : List all jobs
hadoop job -list-active-trackers : List all available TaskTrackers
hadoop job -set-priority : Set priority for a job. Valid priorities: VERY_HIGH, HIGH, NORMAL, LOW, VERY_LOW
hadoop job -kill-task : Kill a task
hadoop job -history : Display job history including job details, failed and killed jobs

Hadoop mradmin Commands:
hadoop mradmin -safemode get : Check JobTracker status
hadoop mradmin -refreshQueues : Reload MapReduce configuration
hadoop mradmin -refreshNodes : Reload active TaskTrackers
hadoop mradmin -refreshServiceAcl : Force the JobTracker to reload service ACLs
hadoop mradmin -refreshUserToGroupsMappings : Force the JobTracker to reload user group mappings

Hadoop fsck Commands:
hadoop fsck / : Filesystem check on HDFS
hadoop fsck / -files : Display files during check
hadoop fsck / -files -blocks : Display files and blocks during check
hadoop fsck / -files -blocks -locations : Display files, blocks and their locations
hadoop fsck / -files -blocks -locations -racks : Display network topology for data-node locations
hadoop fsck / -delete : Delete corrupted files
hadoop fsck / -move : Move corrupted files to the /lost+found directory

Hadoop Balancer Commands:
start-balancer.sh : Balance the cluster
hadoop dfsadmin -setBalancerBandwidth : Adjust bandwidth used by the balancer
hadoop balancer -threshold 20 : Balance until each DataNode's disk utilization is within 20% of the cluster average

Hadoop Safe Mode (Maintenance Mode) Commands:
The following dfsadmin commands make the cluster enter or leave safe mode, also called maintenance mode. In this mode the Namenode does not accept any changes to the namespace, and it does not replicate or delete blocks.
hadoop dfsadmin -safemode enter : Enter safe mode
hadoop dfsadmin -safemode leave : Leave safe mode
hadoop dfsadmin -safemode get : Get the current safe mode status
hadoop dfsadmin -safemode wait : Block until the Namenode leaves safe mode
hadoop dfsadmin -report : Report total usage on the cluster
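
The safe mode commands are typically used as a bracket around maintenance work. The following is a minimal sketch of that pattern, assuming a hypothetical wrapper named dfsadmin and a hypothetical DRY_RUN switch. It prints the commands by default, and runs them only when DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical wrapper: print the command by default, run it when DRY_RUN=0.
dfsadmin() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "hadoop dfsadmin $*"
  else
    hadoop dfsadmin "$@"
  fi
}

maintenance_window() {
  dfsadmin -safemode enter   # Namenode stops accepting namespace changes
  dfsadmin -safemode get     # confirm that safe mode is on
  # ... maintenance work would go here (placeholder) ...
  dfsadmin -safemode leave   # resume normal operation
  dfsadmin -report           # review cluster usage afterwards
}

maintenance_window
```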

Launching Hadoop Jobs:
hadoop jar <jar> [mainClass] args... : Launch a job from a jar file
hadoop jar <jar> com.twitter.scalding.Tool [mainClass] args : Launch a Scalding job
mapred job -kill <job-id> : Kill a MapReduce job

Commonly Used Administration Commands:
Format the namenode: hadoop namenode -format
Start the secondary namenode: hadoop secondarynamenode
Run namenode: hadoop namenode
Run data node: hadoop datanode
Cluster balancing: hadoop balancer
Run MapReduce job tracker node: hadoop jobtracker
Run MapReduce task tracker node: hadoop tasktracker

Start/stop YARN (ResourceManager and NodeManager) and DFS (Namenode and DataNode) from the sbin directory: start-yarn, stop-yarn, start-dfs, stop-dfs
Start/stop all daemons from the sbin directory: start-all, stop-all

Check that all 5 daemons (Namenode, Secondary Namenode, JobTracker, DataNode, TaskTracker) are up:
jps
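
Reading the jps listing by eye works, but the check can also be scripted. The sketch below assumes the usual jps line format of a process ID followed by a class name, and the Hadoop 1.x daemon class names listed in EXPECTED; the function name missing_daemons and the sample input are made up for illustration.

```shell
#!/bin/sh
# Daemon class names as jps reports them on a Hadoop 1.x node (assumed set).
EXPECTED="NameNode SecondaryNameNode DataNode JobTracker TaskTracker"

# Read jps-style output ("<pid> <ClassName>" per line) on stdin and
# print the name of every expected daemon that is not listed.
missing_daemons() {
  running="$(awk '{print $2}')"
  for d in $EXPECTED; do
    # -x matches whole lines, so NameNode does not match SecondaryNameNode.
    echo "$running" | grep -qx "$d" || echo "$d"
  done
}

# Sample input standing in for real jps output.
printf '%s\n' "101 NameNode" "102 DataNode" "103 Jps" | missing_daemons
```

On a live node the same function would be used as jps | missing_daemons; empty output means every expected daemon is running.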
