
Hadoop Installation

Sunday, December 13, 2020 http://etlhive.com 1


Steps
VM Setup
– Download VM + Serial Key
– Download Ubuntu Image
• http://www.ubuntu.com/download/desktop
– Install VM
– Create VM using Ubuntu Image
Hadoop Setup



VM Creation
Go to VMware and create a new virtual machine using the Typical settings option.



VM Creation
Select the Ubuntu ISO file when prompted.



VM Creation
(VM creation wizard screenshots; images not preserved in this export.)


VM Creation
Allocate a 20 GB hard disk and 1 GB of RAM (via the Customize Hardware option).





Login to Ubuntu
Once the setup is complete, log on to Ubuntu with the credentials created during setup.

Now open the terminal. Shortcut: Ctrl+Alt+T



Hadoop Installation
1. Install Java
sudo apt-get install openjdk-6-jdk
2. Install ssh
sudo apt-get install openssh-server
3. Download hadoop
wget http://mirror.nexcess.net/apache/hadoop/common/hadoop-1.2.1/hadoop-1.2.1.tar.gz

4. Extract Hadoop
tar -xvf hadoop-1.2.1.tar.gz
5. Get the IP address
ifconfig
Note the inet address reported, e.g. 192.168.211.128 (yours will differ).
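The extract step above can be sketched end to end. The block below uses a small stand-in archive so it runs anywhere without the real download; in the actual install the tarball is hadoop-1.2.1.tar.gz from the mirror above, and GNU tar's `-x` auto-detects the gzip layer (`-z` makes it explicit).

```shell
# Demonstrates the extract step with a stand-in archive
# (substitute hadoop-1.2.1.tar.gz in the real install).
workdir=$(mktemp -d)
cd "$workdir"

# Create a miniature tarball standing in for the Hadoop download.
mkdir -p hadoop-1.2.1/conf
echo "placeholder" > hadoop-1.2.1/conf/core-site.xml
tar -czf demo.tar.gz hadoop-1.2.1
rm -rf hadoop-1.2.1

# List the archive contents before extracting (a useful sanity check).
tar -tzf demo.tar.gz

# Extract, as in step 4.
tar -xzf demo.tar.gz
ls hadoop-1.2.1/conf
```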
Hadoop Installation
6. Edit /etc/hosts file
sudo gedit /etc/hosts
Map the machine's IP address to localhost, as in this example:
192.168.211.128 localhost
7. Edit 3 config files
1. core-site
2. mapred-site
3. hdfs-site
8. Edit core-site.xml for FS default name
sudo gedit hadoop-1.2.1/conf/core-site.xml
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:8020</value>
</property>
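The hosts-file edit in step 6 can also be done without gedit. A minimal sketch, run against a scratch copy so it is safe to try; the real edit targets /etc/hosts and needs sudo, and 192.168.211.128 is the example address from step 5.

```shell
# Work on a scratch copy of the hosts file; the real edit
# targets /etc/hosts and requires sudo.
hosts=$(mktemp)
printf '127.0.0.1\tubuntu-vm\n' > "$hosts"

# Map the VM's IP (example value from step 5) to localhost,
# matching the /etc/hosts example above.
echo '192.168.211.128 localhost' >> "$hosts"

cat "$hosts"
```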



Hadoop Installation
9. Edit hdfs-site.xml for Replication Factor
sudo gedit hadoop-1.2.1/conf/hdfs-site.xml
<property>
<name>dfs.replication</name>
<value>1</value>
</property>

<property>
<name>dfs.permissions</name>
<value>false</value>
</property>



Hadoop Installation
10. Edit mapred-site.xml for the MapReduce daemons (JobTracker and TaskTrackers)


sudo gedit hadoop-1.2.1/conf/mapred-site.xml
<property>
<name>mapred.job.tracker</name>
<value>localhost:8021</value>
</property>
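The three configuration edits (steps 8-10) can be scripted with heredocs instead of gedit. A sketch writing minimal versions of the files into a scratch directory; the property names, values, and ports are the ones shown above. Note that the actual files wrap the properties in a `<configuration>` root element, which the slide snippets omit.

```shell
# Write minimal versions of the three config files (steps 8-10)
# into a scratch directory; the real target is hadoop-1.2.1/conf.
conf=$(mktemp -d)

cat > "$conf/core-site.xml" <<'EOF'
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:8020</value>
  </property>
</configuration>
EOF

cat > "$conf/hdfs-site.xml" <<'EOF'
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>
EOF

cat > "$conf/mapred-site.xml" <<'EOF'
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:8021</value>
  </property>
</configuration>
EOF

ls "$conf"
```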



Hadoop Installation
11. Edit hadoop-env.sh
sudo gedit hadoop-1.2.1/conf/hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-amd64
12. Generate a public/private key pair
ssh-keygen
13. Go to the $HOME/.ssh folder and authorize the key:
cat id_rsa.pub >> authorized_keys

14. Edit $HOME/.bashrc


sudo gedit .bashrc
export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-amd64
export HADOOP_PREFIX=/home/hr/hadoop-1.2.1
export PATH=$PATH:$HADOOP_PREFIX/bin
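Steps 12-13 set up passwordless SSH to localhost, which the start scripts rely on. A sketch using a scratch directory so it cannot clobber a real ~/.ssh (for the real setup, drop -f/-N and let plain `ssh-keygen` prompt, writing to $HOME/.ssh); it assumes the OpenSSH client tools are installed.

```shell
# Passwordless-SSH key setup (steps 12-13), demonstrated in a
# scratch directory instead of the real ~/.ssh.
sshdir=$(mktemp -d)
chmod 700 "$sshdir"

# Generate a key pair non-interactively with an empty passphrase;
# plain `ssh-keygen` does the same with prompts.
ssh-keygen -t rsa -N '' -f "$sshdir/id_rsa" -q

# Authorize the public key for login, as in step 13.
cat "$sshdir/id_rsa.pub" >> "$sshdir/authorized_keys"
chmod 600 "$sshdir/authorized_keys"

ls "$sshdir"
```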



Hadoop Installation
15. Close the terminal and open it again

16. Format the NameNode
hadoop namenode -format
17. Start DFS
start-dfs.sh
18. Start MapReduce
start-mapred.sh
19. Run jps (Java Process Status)
jps



Hadoop Installation
20. Check the output of the jps command.
Make sure the following processes are running:
JobTracker
NameNode
SecondaryNameNode
TaskTracker
Jps
DataNode
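The process list above can be checked mechanically. The sample `jps` output below is illustrative (process IDs are made up and will differ); the loop flags any expected daemon that is missing.

```shell
# Illustrative `jps` output from a healthy single-node setup;
# the process IDs are invented for the example.
jps_output='2481 NameNode
2675 DataNode
2890 SecondaryNameNode
2978 JobTracker
3172 TaskTracker
3301 Jps'

# Flag any expected daemon missing from the output.
missing=0
for daemon in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
  if printf '%s\n' "$jps_output" | grep -qw "$daemon"; then
    echo "$daemon: running"
  else
    echo "$daemon: MISSING"
    missing=1
  fi
done
```

On a live cluster, replace the sample variable with `jps_output=$(jps)`.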

21. Verify the installation in the browser:


http://localhost:50070/dfshealth.jsp
http://localhost:50060/tasktracker.jsp
http://localhost:50030/jobtracker.jsp



Recap

Thank You

