
Hadoop Installation on Ubuntu (Linux)

Live USB Installation


The following installation guide is adapted from

https://www.vultr.com/docs/install-and-configure-apache-hadoop-on-ubuntu-20-04

Author: Thomas Rakwach

1. Install Java

Install the latest version of Java.

$ sudo apt install default-jdk default-jre -y
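
If apt cannot locate these packages, the package index may be stale. Refreshing it first (an extra
step, not part of the original guide) usually resolves this.

$ sudo apt update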

Verify the installed version of Java.

$ java -version

2. Create Hadoop User and Configure Password-less SSH

Add a new user hadoop.

$ sudo adduser hadoop

Add the hadoop user to the sudo group.

$ sudo usermod -aG sudo hadoop

Switch to the created user.

$ sudo su - hadoop

Install the OpenSSH server and client.

$ sudo apt install openssh-server openssh-client -y



When you get a prompt, respond with:

keep the local version currently installed

If the installation returned you to your original account, switch back to the hadoop user.

$ sudo su - hadoop

Generate public and private key pairs.

$ ssh-keygen -t rsa

Add the generated public key from id_rsa.pub to authorized_keys.

$ sudo cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

Change the permissions of the authorized_keys file.

$ sudo chmod 640 ~/.ssh/authorized_keys

Verify that password-less SSH is working.

$ ssh localhost
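
The very first connection typically asks you to confirm the host fingerprint; answer yes. Once the
login succeeds, leave the SSH session to return to your previous shell.

$ exit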

3. Install Apache Hadoop

Log in as the hadoop user.

$ sudo su - hadoop

Download the latest stable version of Hadoop. To find the latest version, visit the official Apache
Hadoop download page.

$ sudo wget https://downloads.apache.org/hadoop/common/hadoop-3.3.1/hadoop-3.3.1.tar.gz
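
Optionally, verify the download against the checksum file published alongside the archive (an
assumed extra step; the .sha512 file name follows the usual Apache convention). Compare the two
hashes manually.

$ wget https://downloads.apache.org/hadoop/common/hadoop-3.3.1/hadoop-3.3.1.tar.gz.sha512
$ sha512sum hadoop-3.3.1.tar.gz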

Extract the downloaded file.

$ tar -xvzf hadoop-3.3.1.tar.gz



Move the extracted directory to the /usr/local/ directory.

$ sudo mv hadoop-3.3.1 /usr/local/hadoop

Create a directory to store the Hadoop logs.

$ sudo mkdir /usr/local/hadoop/logs

Change the ownership of the hadoop directory.

$ sudo chown -R hadoop:hadoop /usr/local/hadoop

4. Configure Hadoop

Edit the ~/.bashrc file to configure the Hadoop environment variables.

$ sudo nano ~/.bashrc

Add the following lines to the file. Save and close the file.

export HADOOP_HOME=/usr/local/hadoop
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"

Activate the environment variables.

$ source ~/.bashrc
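
As an optional sanity check, confirm that the variables are in effect and that the Hadoop binaries
are on your PATH.

$ echo $HADOOP_HOME
$ which hadoop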

5. Configure Java Environment Variables

Hadoop has many components that enable it to perform its core functions. To configure these
components, such as YARN, HDFS, MapReduce, and Hadoop-related project settings, you need to
define Java environment variables in the hadoop-env.sh configuration file.

Find the Java path.

$ which javac

Find the OpenJDK directory.

$ readlink -f /usr/bin/javac
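
The JAVA_HOME value added below is the readlink output with the trailing /bin/javac removed. If
your OpenJDK version or architecture differs, adjust the path accordingly; one way to derive it
directly (a sketch assuming GNU sed) is:

$ readlink -f /usr/bin/javac | sed 's:/bin/javac::'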

Edit the hadoop-env.sh file.

$ sudo nano $HADOOP_HOME/etc/hadoop/hadoop-env.sh

Add the following lines to the file. Then save and close the file.

export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
export HADOOP_CLASSPATH+=" $HADOOP_HOME/lib/*.jar"

Browse to the hadoop lib directory.

$ cd /usr/local/hadoop/lib

Download the Javax activation file.

$ sudo wget https://jcenter.bintray.com/javax/activation/javax.activation-api/1.2.0/javax.activation-api-1.2.0.jar

Verify the Hadoop version.

$ hadoop version

Edit the core-site.xml configuration file to specify the URL for your NameNode.

$ sudo nano $HADOOP_HOME/etc/hadoop/core-site.xml

Add the following lines. Save and close the file.

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://0.0.0.0:9000</value>
    <description>The default file system URI</description>
  </property>
</configuration>



Create a directory for storing node metadata and change the ownership to hadoop.

$ sudo mkdir -p /home/hadoop/hdfs/{namenode,datanode}

$ sudo chown -R hadoop:hadoop /home/hadoop/hdfs

Edit the hdfs-site.xml configuration file to define the location for storing node metadata and the
fsimage file.

$ sudo nano $HADOOP_HOME/etc/hadoop/hdfs-site.xml

Add the following lines. Save and close the file.

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>

  <property>
    <name>dfs.name.dir</name>
    <value>file:///home/hadoop/hdfs/namenode</value>
  </property>

  <property>
    <name>dfs.data.dir</name>
    <value>file:///home/hadoop/hdfs/datanode</value>
  </property>
</configuration>

Edit the mapred-site.xml configuration file to define MapReduce values.

$ sudo nano $HADOOP_HOME/etc/hadoop/mapred-site.xml

Add the following lines. Save and close the file.

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

Edit the yarn-site.xml configuration file and define YARN-related settings.

$ sudo nano $HADOOP_HOME/etc/hadoop/yarn-site.xml

Add the following lines. Save and close the file.



<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>

Log in as the hadoop user.

$ sudo su - hadoop

Validate the Hadoop configuration and format the HDFS NameNode.

$ hdfs namenode -format

6. Start the Apache Hadoop Cluster

Start the NameNode and DataNode.

$ start-dfs.sh

Start the YARN resource and node managers.

$ start-yarn.sh

Verify all the running components. On a working single-node setup, the listing typically shows the
NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager daemons, along with Jps
itself.

$ jps

7. Access Apache Hadoop Web Interface

You can access the Hadoop NameNode web interface in your browser at http://server-IP:9870 (the
YARN ResourceManager interface is typically served on port 8088). For example:

http://192.0.2.11:9870
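
On a headless server, a quick reachability check from the terminal (assuming curl is installed) is:

$ curl -I http://localhost:9870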
