1. Downloading and Installing Hadoop; Understanding Different Hadoop Modes, Startup Scripts, and Configuration Files
Aim:
To install Hadoop and understand its modes of operation, startup scripts, and configuration
files.
Description:
Hadoop can operate in three modes: standalone (local), pseudo-distributed (single node), and fully distributed (multi-node cluster). Startup scripts manage the Hadoop daemons, while configuration files define system
behavior.
Procedure:
1. Download Hadoop:
o Visit the Apache Hadoop website, download the stable release, and extract the archive.
2. Set up Environment Variables:
o Configure HADOOP_HOME and update PATH in .bashrc (sample entries are sketched after this list).
3. Understand Modes:
o Modify configuration files (core-site.xml, hdfs-site.xml) for pseudo-distributed or fully distributed mode (a minimal pseudo-distributed sketch follows this list).
4. Start Services:
o start-dfs.sh
o start-yarn.sh
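For steps 2 and 3, a minimal single-node (pseudo-distributed) sketch follows. The install path /usr/local/hadoop, the port 9000, and the replication factor of 1 are assumptions to be adapted to the local setup.
Lines appended to ~/.bashrc (step 2):
export HADOOP_HOME=/usr/local/hadoop   # assumed install path
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
core-site.xml (step 3), setting the default filesystem:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
hdfs-site.xml (step 3), using replication factor 1 on a single node:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
Before the first start, the NameNode is formatted once with hdfs namenode -format.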
Result:
Hadoop installed and modes configured successfully.
2. Hadoop Implementation of File Management Tasks
Aim:
To perform file management tasks such as adding, retrieving, and deleting files in Hadoop
Distributed File System (HDFS).
Description:
HDFS allows distributed storage and management of files. Common tasks include adding
files to HDFS, retrieving files, and deleting them.
Procedure:
1. Start Hadoop Services:
o start-dfs.sh
2. File Operations:
o Add file:
o hdfs dfs -put localfile.txt /hdfs_directory/
o Retrieve file:
o hdfs dfs -get /hdfs_directory/file.txt localdir/
o Delete file:
o hdfs dfs -rm /hdfs_directory/file.txt
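Each operation can be confirmed by listing the target directory (the directory name is the same placeholder used above):
hdfs dfs -ls /hdfs_directory/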
Result:
File management tasks were successfully executed in HDFS.
3. Implementation of Matrix Multiplication with Hadoop MapReduce
Aim:
To implement matrix multiplication using Hadoop's MapReduce programming model.
Description:
Matrix multiplication is expressed as a MapReduce job: each matrix element is emitted as key-value pairs keyed by the output cell it contributes to, and reducers multiply the matching elements and sum the products.
Procedure:
1. Write Mapper and Reducer Classes:
o Mapper processes matrix elements and emits them keyed by output cell.
o Reducer combines the intermediate products to generate the final matrix (a sketch follows this list).
2. Run the MapReduce Job:
o hadoop jar MatrixMultiply.jar input output
3. View Output:
o hdfs dfs -cat /output/*
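A minimal sketch of the mapper and reducer for step 1 follows, using the one-step algorithm. It assumes input lines of the form A,i,k,value or B,k,j,value, and that a driver class (not shown) registers these classes, sets Text key/value types, and stores the dimensions m (rows of A) and p (columns of B) in the job configuration; these details are assumptions rather than part of the original procedure.
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class MatrixMultiply {
  // Emits every matrix element once per output cell (i, j) it contributes to.
  public static class MatMapper extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void map(LongWritable key, Text value, Context ctx)
        throws IOException, InterruptedException {
      Configuration conf = ctx.getConfiguration();
      int m = conf.getInt("m", 1);               // rows of A (assumed conf key)
      int p = conf.getInt("p", 1);               // columns of B (assumed conf key)
      String[] t = value.toString().split(",");  // e.g. "A,0,1,5.0"
      if (t[0].equals("A")) {                    // A[i][k] feeds cells (i, 0..p-1)
        for (int j = 0; j < p; j++)
          ctx.write(new Text(t[1] + "," + j), new Text("A," + t[2] + "," + t[3]));
      } else {                                   // B[k][j] feeds cells (0..m-1, j)
        for (int i = 0; i < m; i++)
          ctx.write(new Text(i + "," + t[2]), new Text("B," + t[1] + "," + t[3]));
      }
    }
  }

  // For one output cell, pairs A[i][k] with B[k][j] on the shared index k
  // and sums the products.
  public static class MatReducer extends Reducer<Text, Text, Text, Text> {
    @Override
    protected void reduce(Text key, Iterable<Text> values, Context ctx)
        throws IOException, InterruptedException {
      Map<Integer, Double> a = new HashMap<>();
      Map<Integer, Double> b = new HashMap<>();
      for (Text v : values) {
        String[] t = v.toString().split(",");
        if (t[0].equals("A")) a.put(Integer.parseInt(t[1]), Double.parseDouble(t[2]));
        else                  b.put(Integer.parseInt(t[1]), Double.parseDouble(t[2]));
      }
      double sum = 0;
      for (Map.Entry<Integer, Double> e : a.entrySet())
        sum += e.getValue() * b.getOrDefault(e.getKey(), 0.0);
      ctx.write(key, new Text(Double.toString(sum)));
    }
  }
}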
Result:
Matrix multiplication was successfully executed.
4. Run a Basic Word Count MapReduce Program
Aim:
To implement a Word Count program using Hadoop MapReduce.
Description:
Word Count identifies the frequency of words in a text file. It demonstrates the MapReduce
paradigm of splitting, mapping, shuffling, and reducing.
Procedure:
1. Write the Word Count Program:
o Mapper emits each word as a key with the value 1.
o Reducer sums the values for each key (a sketch follows this list).
2. Run the Job:
o hadoop jar WordCount.jar input output
3. View Results:
o hdfs dfs -cat /output/*
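A minimal sketch of the program for step 1 follows; it mirrors the classic Hadoop example, and the class and variable names are illustrative rather than prescribed by the procedure.
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {
  // Mapper: emit (word, 1) for every token in the input line.
  public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();
    @Override
    protected void map(LongWritable key, Text value, Context ctx)
        throws IOException, InterruptedException {
      StringTokenizer it = new StringTokenizer(value.toString());
      while (it.hasMoreTokens()) {
        word.set(it.nextToken());
        ctx.write(word, ONE);
      }
    }
  }

  // Reducer: sum the 1s emitted for each word.
  public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable v : values) sum += v.get();
      ctx.write(key, new IntWritable(sum));
    }
  }

  // Driver: wires the classes together for "hadoop jar WordCount.jar input output".
  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenMapper.class);
    job.setCombinerClass(SumReducer.class);
    job.setReducerClass(SumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}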
Result:
Word frequencies were successfully calculated using MapReduce.
5. Installation of Hive Along with Practice Examples
Aim:
To install Apache Hive and practice querying structured data in Hadoop.
Description:
Hive provides a SQL-like interface to query data stored in HDFS. It simplifies querying large
datasets.
Procedure:
1. Install Hive:
o Download Hive and set HIVE_HOME environment variable.
2. Start the Hive Shell:
o hive
3. Create and Query a Table:
o CREATE TABLE sample(id INT, name STRING);
o LOAD DATA LOCAL INPATH 'data.txt' INTO TABLE sample;
o SELECT * FROM sample;
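The format of data.txt is not specified above; if it holds comma-separated id,name pairs (an assumption), the table definition needs a matching delimiter, and a couple of further queries can then be practised on it:
CREATE TABLE sample(id INT, name STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
SELECT COUNT(*) FROM sample;
SELECT name FROM sample WHERE id > 10;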
Result:
Hive installed and queries executed successfully.
6. Installation of HBase and Thrift with Practice Examples
Aim:
To install HBase and Thrift to manage NoSQL data and perform basic operations.
Description:
HBase is a column-oriented NoSQL database for real-time read/write access to large datasets, and Thrift provides a language-independent interface through which external clients communicate with HBase.
Procedure:
1. Install HBase:
o Download HBase and configure HBASE_HOME.
2. Start HBase Services:
o start-hbase.sh
3. Perform Operations in the HBase Shell:
o create 'table1', 'cf1'
o put 'table1', 'row1', 'cf1:col1', 'value1'
o scan 'table1'
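The shell steps above exercise HBase only. To bring up the Thrift interface named in the aim, the bundled Thrift gateway can be started and stopped with the standard daemon script (it listens on port 9090 by default), after which Thrift-based clients can connect to it:
hbase-daemon.sh start thrift
hbase-daemon.sh stop thrift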
Result:
HBase and Thrift installed successfully, and operations executed.
7. Practice Importing and Exporting Data
Aim:
To import and export data between databases using Cassandra, Hadoop, Java, Pig, Hive, and
HBase.
Description:
Tools such as Sqoop, Hive, and Pig move data between HDFS and external databases, while Cassandra manages NoSQL data efficiently.
Procedure:
1. Install Required Software:
o Ensure Hadoop, Sqoop, Hive, Pig, and Cassandra are installed.
2. Import/Export Data Using Sqoop:
o Import:
o sqoop import --connect jdbc:mysql://localhost/db --table table1 --target-dir /hdfs_dir
o Export:
o sqoop export --connect jdbc:mysql://localhost/db --table table1 --export-dir /hdfs_dir
3. Verify Data:
o Use respective database shells (Hive, Cassandra, etc.) to confirm.
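For example (using the placeholder directory and table names from step 2), the imported files can be listed and read back from HDFS, and the exported rows counted in the MySQL shell:
hdfs dfs -ls /hdfs_dir
hdfs dfs -cat /hdfs_dir/part-*
SELECT COUNT(*) FROM table1;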
Result:
Data successfully imported and exported using the mentioned tools.