
Data Analytics IT 404 - Mod 6

Ojus Thomas Lee


CE Kidangoor



Syllabus

Module VI



Internal Evaluation

Assignments: minimum of 2

Assignments may take the form of presentations, projects, or problem sets.

All assignments and presentations should be prepared in LaTeX.
Tests: minimum of 2



Course Outcomes (CO)

To understand data analysis techniques

To understand the concepts behind descriptive and predictive analytics of data
To become familiar with Big Data and its sources
To become familiar with data analysis using R programming
To understand the different visualization techniques used in data analysis



Big Data Tools

Big Data is an essential part of almost every organization.

To get significant results from Big Data analytics, a set of tools is
needed at each phase of data processing and analysis.
A few factors should be considered when choosing the set of tools, such as:
the size of the datasets,
the pricing of the tool,
the kind of analysis to be done.

Source: https://data-flair.training/blogs/top-big-data-tools/
Big Data Tools

Apache Hadoop
Hadoop is an open-source framework from Apache that runs on
commodity hardware.
It is used to store, process, and analyze Big Data.
Apache Spark
Spark supports both real-time and batch processing.
It also supports in-memory computation, which can make it up to 100 times
faster than Hadoop's disk-based MapReduce.
Apache Storm
Apache Storm is an open-source, distributed, real-time, and
fault-tolerant processing system. It efficiently processes unbounded
streams of data.
The processing speed of Storm is very high. It is easily scalable and
also fault-tolerant.

Source: https://data-flair.training/blogs/top-big-data-tools/
Big Data Tools

Apache Cassandra
Apache Cassandra is a distributed database that provides high
availability and scalability without compromising performance.
Cassandra works quite efficiently under heavy loads.
It does not follow a master-slave architecture, so all nodes have the same
role.
Apache Cassandra provides ACID-style guarantees (atomicity, isolation,
and durability) at the row level, with tunable consistency.
MongoDB
MongoDB is an open-source, cross-platform NoSQL database that is
widely used as a data analytics tool.
Apache Flink
Apache Flink is an open-source distributed processing framework for
bounded and unbounded data streams.

Source: https://data-flair.training/blogs/top-big-data-tools/
Big Data Tools

Kafka
Apache Kafka is an open-source platform that was created at LinkedIn
in 2011.
Apache Kafka is a distributed event streaming platform
that provides high throughput.
It can handle trillions of events a day.
It is highly scalable and also provides great fault tolerance.
R Programming
R is an open-source programming language and one of the most
comprehensive statistical analysis languages.
It helps generate the results of data analysis in graphical as well as
textual form.

Source: https://data-flair.training/blogs/top-big-data-tools/
Big Data Tools

Apache Hadoop



Big Data Tools

HDFS
HDFS is the filesystem of Hadoop, designed for storing very large files
on a cluster of commodity hardware.
It is designed around storing a small number of large files
rather than a huge number of small files. (A minimal client-side sketch
of writing to HDFS follows below.)
HDFS node types:
Name Node
Data Node

Source: https://data-flair.training/blogs/top-big-data-tools/
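A minimal sketch of the idea above, using the Hadoop Java FileSystem API. The NameNode address hdfs://namenode:9000 and the file path are placeholder assumptions, not part of the course material.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder address: point this at the NameNode of your cluster.
        conf.set("fs.defaultFS", "hdfs://namenode:9000");

        // The client asks the NameNode for metadata; the actual bytes
        // go to DataNodes chosen by the NameNode.
        FileSystem fs = FileSystem.get(conf);

        Path file = new Path("/user/analytics/sample.txt");
        try (FSDataOutputStream out = fs.create(file, true)) {
            out.writeUTF("hello hdfs");
        }

        System.out.println("File exists: " + fs.exists(file));
        fs.close();
    }
}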
Hadoop

[Figure: Hadoop Distributed File System (HDFS) architecture]

Source: https://data-flair.training/blogs/top-big-data-tools/
Hadoop - HDFS

Name Node
The NameNode works as the master in a Hadoop cluster and guides the
DataNodes (slaves).
The NameNode mainly stores the metadata, i.e., the data about
the data.
Metadata includes the transaction logs that keep track of activity
in the Hadoop cluster.
Metadata also includes file names, sizes, and location information
(block numbers, block IDs) of the DataNodes, which the NameNode
uses to find the closest DataNode for faster communication.
The NameNode instructs the DataNodes to perform operations such as
delete, create, and replicate. (A short Java sketch of querying this
block metadata follows below.)

Source: https://www.geeksforgeeks.org/hadoop-architecture/
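To make the metadata role concrete, this hedged sketch asks the NameNode (through the same FileSystem API) which DataNodes hold the blocks of a file; the path is again a placeholder.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockLocationExample {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        FileStatus status = fs.getFileStatus(new Path("/user/analytics/sample.txt"));

        // The NameNode answers this metadata query: which DataNodes hold
        // which block of the file.
        BlockLocation[] blocks =
                fs.getFileBlockLocations(status, 0, status.getLen());
        for (BlockLocation b : blocks) {
            System.out.println("offset=" + b.getOffset()
                    + " length=" + b.getLength()
                    + " hosts=" + String.join(",", b.getHosts()));
        }
    }
}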
Hadoop - HDFS

Data Nodes
DataNodes work as slaves; DataNodes are mainly used for storing
the data in a Hadoop cluster.
The number of DataNodes can range from 1 to 500 or even more.
The more DataNodes there are, the more data the Hadoop cluster can
store.
It is therefore advised that each DataNode have a high storage capacity,
so it can hold a large number of file blocks.

Source: https://www.geeksforgeeks.org/hadoop-architecture/
Hadoop - HDFS
File Blocks in HDFS

Data in HDFS is always stored in terms of blocks.
A single file is divided into multiple blocks of 128 MB each (the
default size, which can be changed through configuration).

Source: https://www.geeksforgeeks.org/hadoop-architecture/
Hadoop - HDFS

Replication in HDFS
Replication ensures the availability of the data.
Replication means making copies of a block; the number of copies kept
of each block is called its replication factor.
Just as HDFS stores data as file blocks, Hadoop is also configured to
keep copies of those file blocks. (A short sketch of reading and setting
the block size and replication factor follows below.)

Source: https://www.geeksforgeeks.org/hadoop-architecture/
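A small sketch, assuming a standard Hadoop client setup, of how the default block size and replication factor can be configured and how replication can be changed per file. The property names dfs.blocksize and dfs.replication are the standard HDFS configuration keys; the file path is a placeholder.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicationExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Cluster-wide defaults (normally set in hdfs-site.xml):
        // 128 MB blocks, 3 replicas per block.
        conf.set("dfs.blocksize", "134217728");
        conf.set("dfs.replication", "3");

        FileSystem fs = FileSystem.get(conf);
        Path file = new Path("/user/analytics/sample.txt");

        // Replication can also be changed per file after it is written.
        fs.setReplication(file, (short) 2);

        FileStatus status = fs.getFileStatus(file);
        System.out.println("block size = " + status.getBlockSize()
                + ", replication = " + status.getReplication());
    }
}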
Hadoop - HDFS

Rack Awareness
A rack is simply a physical collection of nodes in the Hadoop cluster
(perhaps 30 to 40).
A large Hadoop cluster consists of many racks.
With the help of this rack information, the NameNode chooses the closest
DataNode, achieving maximum performance for read/write operations
and reducing network traffic.

Source: https://www.geeksforgeeks.org/hadoop-architecture/
Hadoop - MapReduce

Map Reduce
MapReduce can be thought of as an algorithmic pattern for processing data.
Its major feature is that it performs distributed processing in parallel
across a Hadoop cluster, which is what makes Hadoop so fast.
When you are dealing with Big Data, serial processing is no longer
practical.
MapReduce has two main tasks, executed phase by phase:
Map Task
Reduce Task

Source: https://www.geeksforgeeks.org/hadoop-architecture/
Hadoop - MapReduce

[Figure: MapReduce overview]

Source: https://www.geeksforgeeks.org/hadoop-architecture/
Hadoop - MapReduce

Map Reduce - Map Task:

RecordReader: The purpose of the RecordReader is to break the input into
records. It is responsible for providing key-value pairs to the Map() function.
Map: A map is a user-defined function whose job is to process the tuples
obtained from the RecordReader. (A minimal word-count Mapper sketch
follows below.)

Source: https://www.geeksforgeeks.org/hadoop-architecture/
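A minimal Mapper sketch in the spirit of the classic word-count example (the choice of word count is illustrative, not taken from the slides). The RecordReader supplies each input line as a (byte offset, line) pair.

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// The RecordReader hands each line to map() as a (byte offset, line) pair.
public class WordCountMapper
        extends Mapper<LongWritable, Text, Text, IntWritable> {

    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Emit (word, 1) for every token in the input line.
        for (String token : value.toString().split("\\s+")) {
            if (!token.isEmpty()) {
                word.set(token);
                context.write(word, ONE);
            }
        }
    }
}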
Hadoop - MapReduce

Map Reduce - Map Task:

Combiner: The Combiner is used for grouping the data in the map
workflow. It acts like a local reducer: the intermediate key-value pairs
generated by the Map are combined with the help of the Combiner.
Partitioner: The Partitioner is responsible for fetching the key-value
pairs generated in the Mapper phase. It produces the shards
corresponding to each reducer. (A sketch of wiring a Combiner and a
Partitioner into a job follows below.)

Source: https://www.geeksforgeeks.org/hadoop-architecture/
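A hedged sketch of how a combiner and a custom partitioner are wired into a job. The hash-based partitioner mirrors the behaviour of Hadoop's default HashPartitioner; the WordCountReducer named here is the reducer sketched a little further below.

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

// Routes each intermediate key to one of the reducers (shards).
public class WordPartitioner extends Partitioner<Text, IntWritable> {
    @Override
    public int getPartition(Text key, IntWritable value, int numReduceTasks) {
        // Same idea as the default HashPartitioner: hash of the key,
        // masked to stay non-negative, modulo the number of reducers.
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }
}

// Inside the driver, the combiner (a "local reducer") and the
// partitioner are registered on the Job object:
//
//   job.setCombinerClass(WordCountReducer.class);
//   job.setPartitionerClass(WordPartitioner.class);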
Hadoop - MapReduce

Map Reduce - Reduce Task:

Shuffle and Sort: The process in which the intermediate key-value pairs
generated by the Mapper are transferred to the Reduce task is known as
shuffling. During shuffling the system sorts the data by key.
Reduce: The main job of Reduce is to gather the tuples generated by Map
and then perform sorting and aggregation on those key-value pairs
according to their key.
OutputFormat: Once all operations are complete, the key-value pairs are
written to the output file by the RecordWriter, one record per line, with
the key and value separated by a tab (the default in TextOutputFormat).
(A minimal word-count Reducer sketch follows below.)

Source: https://www.geeksforgeeks.org/hadoop-architecture/
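A minimal Reducer sketch completing the word-count example: after shuffle and sort, reduce() sees each word together with all of its counts.

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// After shuffle and sort, reduce() receives one key with all of its values.
public class WordCountReducer
        extends Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable v : values) {
            sum += v.get();          // aggregate the counts for this word
        }
        // The OutputFormat / RecordWriter writes "word<TAB>count" per line
        // (tab-separated by default in TextOutputFormat).
        context.write(key, new IntWritable(sum));
    }
}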
Hadoop - MapReduce

[Figure: MapReduce workflow]

Source: https://www.geeksforgeeks.org/hadoop-architecture/
Hadoop - MapReduce

[Figure: MapReduce example]

Source: https://www.edureka.co/blog/mapreduce-tutorial/
Hadoop - MapReduce

Developing and Executing a Hadoop MapReduce Program

Develop a Hadoop MapReduce program by writing Java code in an
Integrated Development Environment (IDE) such as Eclipse.
A typical MapReduce program consists of three Java files: one each for
the driver code,
the map code, and
the reduce code.
The Java code is compiled and packaged as a Java Archive (JAR) file.
The JAR file is then executed against the specified HDFS input files.
(A minimal driver sketch follows below.)

Source: https://www.edureka.co/blog/mapreduce-tutorial/
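A minimal driver sketch tying together the mapper, combiner, partitioner, and reducer sketched earlier (all class names are the hypothetical ones used in those sketches).

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCountDriver.class);

        job.setMapperClass(WordCountMapper.class);
        job.setCombinerClass(WordCountReducer.class);   // local reducer
        job.setPartitionerClass(WordPartitioner.class);
        job.setReducerClass(WordCountReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // HDFS input and output paths are passed on the command line.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Packaged as a JAR, such a program would typically be launched with a command of the form "hadoop jar wordcount.jar WordCountDriver /input /output", where the JAR name and the HDFS paths are placeholders.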
Hadoop - MapReduce

Developing and Executing a Hadoop MapReduce Program

For users who prefer a programming language other than Java,
there are other options.
One option is the Hadoop Streaming API, which allows the user to write
and run Hadoop jobs with no direct knowledge of Java.
Users can write their mappers and reducers in languages such as Python,
C, or Ruby.

Source: https://www.edureka.co/blog/mapreduce-tutorial/
Hadoop - MapReduce

Developing and Executing a Hadoop MapReduce Program

A second alternative is Hadoop Pipes,
a mechanism that uses compiled C++ code for the map and
reduce functionality.

Source: https://www.edureka.co/blog/mapreduce-tutorial/
Hadoop - YARN

YARN stands for Yet Another Resource Negotiator.

YARN is responsible for the resource management part of Hadoop.
The YARN execution model is more generic than MapReduce:
YARN can also execute applications that do not follow the MapReduce
model.
In place of the JobTracker and TaskTracker, the ApplicationMaster
comes into the picture.
YARN is more isolated and scalable.
A YARN-based cluster has a NameNode, DataNodes, a Secondary NameNode,
a ResourceManager, and NodeManagers.

Source: https://www.educba.com/mapreduce-vs-yarn/
Hadoop - YARN

YARN allows the data stored in HDFS (Hadoop Distributed File System)
to be processed by various data processing engines, such as:
batch processing,
stream processing,
interactive processing,
graph processing, and many more.

Source: https://www.educba.com/mapreduce-vs-yarn/
Hadoop - YARN

[Figure: Hadoop YARN]

Source: https://www.edureka.co/blog/mapreduce-tutorial/
Hadoop - HBASE

HBASE
HBase is an open-source, sorted-map data store built on top of Hadoop.
It is column-oriented and horizontally scalable.
It is based on Google's Bigtable.
It has a set of tables which keep data in key-value format.
It is the part of the Hadoop ecosystem that provides random, real-time
read/write access to data in the Hadoop File System. (A sketch using the
HBase Java client follows below.)

Source: https://www.javatpoint.com/what-is-hbase
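A hedged sketch of the random, real-time read/write access pattern using the HBase Java client API. The table name 'users', the column family 'info', and the row contents are made-up examples and assume the table already exists.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseClientExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             // Assumes a table 'users' with column family 'info' already exists.
             Table table = conn.getTable(TableName.valueOf("users"))) {

            // Write: row key -> (column family, qualifier, value)
            Put put = new Put(Bytes.toBytes("user1001"));
            put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("city"),
                          Bytes.toBytes("Kottayam"));
            table.put(put);

            // Random real-time read of the same row
            Result row = table.get(new Get(Bytes.toBytes("user1001")));
            byte[] city = row.getValue(Bytes.toBytes("info"), Bytes.toBytes("city"));
            System.out.println("city = " + Bytes.toString(city));
        }
    }
}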
Hadoop - HBASE

Why HBase?
An RDBMS gets exponentially slower as the data grows large.
It expects data to be highly structured, i.e. able to fit a well-defined
schema.
Any change in schema might require downtime.
For sparse datasets, there is too much overhead in maintaining NULL values.

Source: https://www.javatpoint.com/what-is-hbase
Hadoop - HBASE

HBase Features
Apache HBase has a completely distributed architecture.
It can easily work on extremely large-scale data.
HBase offers high security and easy management, along with high write
throughput.
It can be used for both structured and semi-structured data.
Moreover, MapReduce jobs can be backed by HBase tables.

Source: https://www.javatpoint.com/what-is-hbase
Hadoop - HBASE

HBase Architecture
HBase is architecturally a column-oriented key-value data store.
It is a natural fit for deployment as a top layer on HDFS because it
works extremely well with the kind of data that Hadoop processes.

Source: https://www.javatpoint.com/what-is-hbase
Hadoop - HBASE

[Figure: HBase architecture]

Source: https://www.tutorialspoint.com/hbase/hbase_overview.htm
Hadoop - HBASE

In HBase, tables are split into regions, which are served by the region
servers.
Regions are vertically divided by column families into "stores".
Stores are saved as files in HDFS.
HBase has three major components:
the client library,
a master server, and
region servers.
Region servers can be added or removed as required.

Source: https://www.tutorialspoint.com/hbase/hbase_overview.htm
Hadoop - HBASE

The master server:

Assigns regions to the region servers, taking the help of Apache
ZooKeeper for this task.
Handles load balancing of the regions across region servers: it unloads
busy servers and shifts their regions to less occupied servers.
Maintains the state of the cluster by negotiating the load balancing.
Is responsible for schema changes and other metadata operations such
as creation of tables and column families.

Source: https://www.tutorialspoint.com/hbase/hbase_overview.htm
Hadoop - HBASE

Regions are simply tables that have been split up and spread across the
region servers.
The region servers have regions that:
communicate with the client and handle data-related operations;
handle read and write requests for all the regions under them;
decide the size of a region by following the region-size thresholds.

Source: https://www.tutorialspoint.com/hbase/hbase_overview.htm
Hadoop - HBASE

[Figure: Inside a region server]

Source: https://www.tutorialspoint.com/hbase/hbase_overview.htm
Hadoop - HBASE

Inside a Region Server

A store contains a MemStore and HFiles.
The MemStore works like a cache memory:
anything that is written to HBase is stored here initially.
Later, the data is transferred and saved in HFiles as blocks, and the
MemStore is flushed.

Source: https://www.tutorialspoint.com/hbase/hbase_overview.htm
Hadoop - PIG

Pig is a high-level data-flow platform for executing MapReduce
programs on Hadoop.
It was developed by Yahoo.
The language of Pig is Pig Latin.

Source: https://www.javatpoint.com/pig
Hadoop - PIG

Pig scripts are internally converted to MapReduce jobs and executed on
data stored in HDFS.
Pig can handle any type of data, i.e., structured, semi-structured, or
unstructured, and stores the corresponding results in the Hadoop
Distributed File System.
Every task that can be achieved using Pig can also be achieved by
writing MapReduce programs in Java.

Source: https://www.javatpoint.com/pig
Hadoop - PIG

[Figure: Architecture of Pig]

Source: https://data-flair.training/blogs/pig-architecture/
Hadoop - HIVE

Hive is a data warehouse system used to analyze structured data.
It is built on top of Hadoop and was developed by Facebook.
Hive provides the functionality of reading, writing, and managing
large datasets residing in distributed storage.
It runs SQL-like queries called HQL (Hive Query Language), which are
internally converted to MapReduce jobs.
Using Hive, we can skip the traditional approach of writing complex
MapReduce programs.
Hive supports Data Definition Language (DDL), Data Manipulation
Language (DML), and User Defined Functions (UDF).

Source: https://www.javatpoint.com/what-is-hive
Hadoop - HIVE

Features of Hive
Hive is fast and scalable.
It provides SQL-like queries (i.e., HQL) that are implicitly transformed
into MapReduce or Spark jobs.
It is capable of analyzing large datasets stored in HDFS.
It allows different storage types such as plain text, RCFile, and HBase.
It uses indexing to accelerate queries.
It can operate on compressed data stored in the Hadoop ecosystem.
(A sketch of submitting an HQL query through Hive's JDBC interface
follows below.)

Source: https://www.javatpoint.com/what-is-hive
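A small sketch of submitting an HQL query to HiveServer2 over Hive's JDBC interface. The host name, credentials, and the 'users' table are assumptions; only the driver class and the jdbc:hive2:// URL scheme are standard.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveQueryExample {
    public static void main(String[] args) throws Exception {
        // Hive JDBC driver; HiveServer2 listens on port 10000 by default.
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:hive2://hiveserver:10000/default", "user", "");
             Statement stmt = conn.createStatement()) {

            // An HQL query like this is compiled by Hive into MapReduce
            // (or Spark) jobs and executed over data stored in HDFS.
            ResultSet rs = stmt.executeQuery(
                    "SELECT city, COUNT(*) FROM users GROUP BY city");
            while (rs.next()) {
                System.out.println(rs.getString(1) + "\t" + rs.getLong(2));
            }
        }
    }
}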
Hadoop - HIVE

[Figure: Pig and Hive comparison]

Source: https://www.javatpoint.com/what-is-hive
Hadoop - HIVE

[Figure: Hive architecture]

Source: https://www.javatpoint.com/what-is-hive
Hadoop - MAHOUT

Apache Mahout is an open-source project primarily used for creating
scalable machine learning algorithms.
It implements popular machine learning techniques such as:
recommendation,
classification,
clustering.

Source: https://www.tutorialspoint.com/mahout/mahout_introduction.htm
Hadoop - MAHOUT

Features of Mahout
The algorithms of Mahout are written on top of Hadoop, so Mahout works
well in a distributed environment.
Mahout uses the Apache Hadoop library to scale effectively in the cloud.
Mahout offers the coder a ready-to-use framework for doing data-mining
tasks on large volumes of data.
Mahout lets applications analyze large sets of data effectively and
quickly.
Includes several MapReduce-enabled clustering implementations such as
k-means, fuzzy k-means, Canopy, Dirichlet, and Mean-Shift.
Supports distributed Naive Bayes and Complementary Naive Bayes
classification implementations. (A small recommender sketch using
Mahout's Taste API follows below.)

Source: https://www.tutorialspoint.com/mahout/mahout_introduction.htm
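A short recommender sketch using Mahout's (legacy) Taste API, illustrating the recommendation technique listed above. The ratings.csv file of userID,itemID,rating triples and the user ID 42 are assumptions for illustration.

import java.io.File;
import java.util.List;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

public class MahoutRecommenderExample {
    public static void main(String[] args) throws Exception {
        // ratings.csv holds "userID,itemID,rating" lines (hypothetical file).
        DataModel model = new FileDataModel(new File("ratings.csv"));

        UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
        UserNeighborhood neighborhood =
                new NearestNUserNeighborhood(10, similarity, model);
        Recommender recommender =
                new GenericUserBasedRecommender(model, neighborhood, similarity);

        // Top-3 recommendations for user 42
        List<RecommendedItem> items = recommender.recommend(42, 3);
        for (RecommendedItem item : items) {
            System.out.println(item.getItemID() + " -> " + item.getValue());
        }
    }
}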
Hadoop - MAHOUT

[Figure: Mahout architecture]

Source: https://www.tutorialspoint.com/mahout/mahout_introduction.htm
