
HDFS (Hadoop Distributed File System)

HDFS Architecture
Components of the Architecture:

NameNode: The master server; it mainly stores the metadata of the file system and information about all the DataNodes.

Secondary NameNode: Provides a checkpoint in HDFS by periodically merging the fsImage and EditLogs.
DataNode: i) DataNodes perform read-write operations on the file system, as per client requests. ii) They also perform operations such as block creation, deletion, and replication as instructed by the NameNode.

Block: The file segments where user data is actually stored are called blocks. The default block size is 64 MB in Hadoop 1.x (128 MB in Hadoop 2.x). A file's block layout can be inspected as shown below.
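As a quick check, the fsck tool can list the blocks and replica locations of a file; a minimal sketch, assuming a file /user/hadoop/file1 already exists in HDFS:

$> ~/hadoop-1.0.3/bin/hadoop fsck /user/hadoop/file1 -files -blocks -locations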

Working with Hadoop – Starting HDFS

Introduction:

HDFS (Hadoop Distributed File System) is a distributed, fault-tolerant file system that can hold and process data which is really Big for us. We discussed the concept of Big as it appears to us earlier.

We will be starting the Hadoop system and HDFS. The commands will start the NameNode, Secondary NameNode, DataNode, JobTracker, and TaskTracker; the jps tool is then used to verify them. We have already discussed the basic architecture of HDFS.

Pre-Requisites:

Hadoop should be properly installed on the system, either as a SingleNode Cluster or a MultiNode Cluster.

Sequence of Operations to Start the Hadoop Distributed File System (HDFS)

Step 1: To start HDFS, we use:

$> ~/hadoop-1.0.3/bin/start-all.sh

[Use the appropriate Hadoop version as per your installation. Hadoop 2.x users should instead use: i) start-dfs.sh ii) start-yarn.sh]

Step 2: To check the status of the started services, the command used is:

$> jps

It lists six Java processes (Jps is the monitoring tool itself, not a Hadoop service):

NameNode, JobTracker, TaskTracker, SecondaryNameNode, DataNode, Jps.
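For reference, a typical jps listing looks like the following; the process IDs are illustrative and will differ on every machine:

$> jps
2114 NameNode
2243 DataNode
2375 SecondaryNameNode
2459 JobTracker
2598 TaskTracker
2687 Jps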
Step 3: Ensure that the NameNode is NOT in safemode for proper operations to be performed on HDFS. We use the following command to turn safemode off for HDFS:

$> ~/hadoop-1.0.3/bin/hadoop dfsadmin -safemode leave
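To simply check the current safemode status without changing it, dfsadmin also offers a get option:

$> ~/hadoop-1.0.3/bin/hadoop dfsadmin -safemode get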

Step 4: To check the HDFS file system from the browser, open the following URL:

http://localhost:50070/dfshealth.jsp
The browser then displays the NameNode's HDFS health page.

Working with Hadoop Distributed File System – Using FS Shell Commands
Introduction:

The FileSystem (FS) shell provides all the basic commands needed to operate on files and data between HDFS and the local file system. It is invoked via bin/hadoop fs. All FS shell commands take path URIs as arguments.

These shell commands require Hadoop to have been started normally, and the NameNode's safemode to be turned off.

File System (FS) Shell Commands:

The following presents the syntax of the most important file system commands.

1. cat command:

Usage: hadoop fs -cat URI [URI …]

Copies source paths to stdout.

Example:

hadoop fs -cat /user/hadoop/file1 /user/hadoop/file2

Exit Code:
Returns 0 on success and -1 on error.

2. chgrp command:

Usage: hadoop fs -chgrp [-R] GROUP URI [URI …]

Change group association of files. With -R, make the change recursively
through the directory structure. The user must be the owner of files, or
else a super-user.
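A brief example; the group name hadoopgroup is only an assumed placeholder for a group that exists on your cluster:

 hadoop fs -chgrp -R hadoopgroup /user/hadoop/dir1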

3. chmod command:

Usage: hadoop fs -chmod [-R] <MODE[,MODE]... | OCTALMODE> URI [URI …]

Change the permissions of files. With -R, make the change recursively
through the directory structure. The user must be the owner of the file,
or else a super-user.
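For instance, granting the owner full access and everyone else read/execute permissions on a directory tree (the path is illustrative):

 hadoop fs -chmod -R 755 /user/hadoop/dir1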

4. chown command:

Usage: hadoop fs -chown [-R] [OWNER][:[GROUP]] URI [URI …]


Change the owner of files. With -R, make the change recursively through the directory structure. The user must be a super-user.
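A sketch, assuming a user hduser and a group hadoop exist on the cluster:

 hadoop fs -chown -R hduser:hadoop /user/hadoop/dir1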

5. copyFromLocal command:

Usage: hadoop fs -copyFromLocal <localsrc> URI

Similar to the put command, except that the source is restricted to a local file reference. This copies file(s) from the local file system to HDFS.
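For example, copying localfile from the current local directory into HDFS (names are illustrative):

 hadoop fs -copyFromLocal localfile /user/hadoop/hadoopfile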

6. copyToLocal

Usage: hadoop fs -copyToLocal [-ignorecrc] [-crc] URI <localdst>

Similar to the get command, except that the destination is restricted to a local file reference. Copies file(s) from HDFS to the local file system.
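For example (names are again illustrative):

 hadoop fs -copyToLocal /user/hadoop/hadoopfile localfile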

7. cp

Usage: hadoop fs -cp URI [URI …] <dest>

Copy files from source to destination. This command allows multiple sources as well, in which case the destination must be a directory.
Example:

1. hadoop fs -cp /user/hadoop/file1 /user/hadoop/file2
2. hadoop fs -cp /user/hadoop/file1 /user/hadoop/file2 /user/hadoop/dir

Exit Code: Returns 0 on success and -1 on error.

8. get

Usage: hadoop fs -get [-ignorecrc] [-crc] <src> <localdst>


Copy files to the local file system. Files that fail the CRC check may be
copied with the -ignorecrc option. Files and CRCs may be copied using
the -crc option.

Example:

 hadoop fs -get /user/hadoop/file localfile

Exit Code: Returns 0 on success and -1 on error.

9. mkdir

Usage: hadoop fs -mkdir <paths>

Takes path URIs as arguments and creates directories. The behavior is much like Unix mkdir -p, creating parent directories along the path.

Example:

 hadoop fs -mkdir /user/hadoop/dir1 /user/hadoop/dir2

Exit Code:

Returns 0 on success and -1 on error.

10. mv

Usage: hadoop fs -mv URI [URI …] <dest>

Moves files from source to destination. This command allows multiple sources as well, in which case the destination needs to be a directory. Moving files across file systems is not permitted.
Example:

 hadoop fs -mv /user/hadoop/file1 /user/hadoop/file2


Exit Code:

Returns 0 on success and -1 on error.

11. put

Usage: hadoop fs -put <localsrc> ... <dst>

Copy single src, or multiple srcs, from the local file system to the destination file system. Also reads input from stdin and writes to the destination file system.

 hadoop fs -put localfile /user/hadoop/hadoopfile


 hadoop fs -put localfile1 localfile2 /user/hadoop/hadoopdir
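Since put also reads from stdin, a dash can be used as the source; the target file name here is just an illustration:

 echo "sample data" | hadoop fs -put - /user/hadoop/stdinfile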

Exit Code:

Returns 0 on success and -1 on error.

12. rm

Usage: hadoop fs -rm URI [URI …]

Delete files specified as args. Only deletes files and empty directories; refer to rmr for recursive deletes.
Example:

 hadoop fs -rm hdfs://nn.example.com/file /user/hadoop/emptydir
Exit Code:

Returns 0 on success and -1 on error.

13. rmr

Usage: hadoop fs -rmr URI [URI …]

Recursive version of delete.

Example:

 hadoop fs -rmr /user/hadoop/dir

Exit Code:

Returns 0 on success and -1 on error.

Conclusion:

The above provides a list of the most important commands to be used from the HDFS shell to work with files and directories.
