
Madhav Institute of Technology and Science

Deemed to be University

CLOUD COMPUTING AND VIRTUALIZATION

Name: Nivesh Garg


Submitted to: Dr. Smita Parte
Class: 6th Semester (CSE)
Enrollment number: 0901CS211078
MAPREDUCE
Proficiency Presentation
What is MapReduce?
MapReduce is a Java-based, distributed execution framework within the Apache Hadoop ecosystem. It takes away the complexity of distributed programming by exposing two processing steps that developers implement: 1) Map and 2) Reduce. In the Map step, data is split between parallel processing tasks, and transformation logic can be applied to each chunk of data. Once the Map step completes, the Reduce phase takes over to aggregate the data from the Map output. In general, MapReduce uses the Hadoop Distributed File System (HDFS) for both input and output.
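
To make the two steps concrete, below is the classic word-count example written against the Hadoop Java API. This is a minimal sketch following the standard Hadoop tutorial; the class names TokenizerMapper and IntSumReducer are illustrative and not part of these slides.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class WordCount {

  // Map step: emit a <word, 1> pair for every word in the input split.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {

    private static final IntWritable one = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reduce step: sum the counts for each word after the shuffle.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {

    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }
}

The framework calls map() once per input record and reduce() once per distinct key; all the data movement in between is handled by Hadoop.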
How does MapReduce work?
A MapReduce job is usually composed of three steps, even though it is often generalized as just the combination of the Map and Reduce operations. The three MapReduce operations are:

1. Map
2. Shuffle, combine and partition
3. Reduce
1. Map: The input data is first split into smaller blocks. The Hadoop framework then decides how many mappers to use, based on the size of the data to be processed and the memory available on each mapper server. Each block is then assigned to a mapper for processing. Each worker node applies the map function to its local data and writes the output to temporary storage. The primary (master) node ensures that only a single copy of any redundant input data is processed.

2. Shuffle, combine and partition: Worker nodes redistribute data based on the output keys produced by the map function, so that all data belonging to one key is located on the same worker node. As an optional step, a combiner (essentially a local reducer) can run on each mapper server to pre-aggregate that mapper's output, which shrinks the data footprint and makes shuffling and sorting cheaper. Partitioning, which is not optional, is the process that decides how the data is presented to the reducers and assigns each key to a particular reducer.

3. Reduce: A reducer cannot start while a mapper is still in progress. Worker nodes process each group of <key, value> pairs in parallel to produce new <key, value> pairs as output. All map output pairs that have the same key are assigned to a single reducer, which then aggregates the values for that key. Unlike the map function, which is mandatory to filter and sort the initial data, the reduce function is optional.
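
To illustrate the partition step, the sketch below shows a custom Partitioner written against the Hadoop Java API. The class name WordPartitioner is hypothetical; the logic simply mirrors the behaviour of Hadoop's default HashPartitioner.

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

// Decides which reducer receives each map-output key.
public class WordPartitioner extends Partitioner<Text, IntWritable> {
  @Override
  public int getPartition(Text key, IntWritable value, int numReduceTasks) {
    // Mask the sign bit so the partition index is never negative,
    // then spread keys evenly across the available reducers.
    return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
  }
}

Because every pair with the same key gets the same partition number, a single reducer is guaranteed to see all the values for that key.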
Components of MapReduce Architecture:
Client: The MapReduce client is the one who brings the job to MapReduce for processing. There can be multiple clients that continuously send jobs for processing to the Hadoop MapReduce Master.

Job: The MapReduce job is the actual work that the client wants done, which is comprised of many smaller tasks that the client wants to process or execute.

Hadoop MapReduce Master: It divides the particular job into subsequent job-parts.

Job-Parts: The tasks or sub-jobs that are obtained after dividing the main job. The results of all the job-parts are combined to produce the final output.

Input Data: The data set that is fed to MapReduce for processing.

Output Data: The final result obtained after processing.
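
These components come together in the driver program that the client runs. The sketch below, which assumes the TokenizerMapper and IntSumReducer classes from the earlier word-count example, shows a client building a Job, pointing it at the input and output data in HDFS, and submitting it to the framework.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountDriver {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count"); // the Job the client submits

    job.setJarByClass(WordCountDriver.class);
    job.setMapperClass(WordCount.TokenizerMapper.class);
    job.setCombinerClass(WordCount.IntSumReducer.class); // optional combine step
    job.setReducerClass(WordCount.IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);

    FileInputFormat.addInputPath(job, new Path(args[0]));   // Input Data
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // Output Data

    // The master divides the submitted job into map and reduce
    // job-parts and schedules them on the worker nodes.
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}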
Advantages of MapReduce
Scalability
Flexibility
Security and authentication
Faster processing of data
Very simple programming model
Availability and resilient nature

MapReduce, as a programming model and framework, offers several key
features that make it useful for processing large-scale data:

Scalability: MapReduce is designed to handle massive datasets by distributing the processing across
multiple nodes in a cluster. This distributed nature allows it to scale horizontally, accommodating
increasing data volumes without requiring significant changes to the underlying infrastructure.

Fault Tolerance: MapReduce provides built-in fault tolerance mechanisms to ensure that
computations continue in the event of node failures. It achieves this through data replication and
task re-execution, allowing jobs to recover from failures without data loss or interruption.

Parallel Processing: MapReduce divides data processing tasks into smaller units, which can be
executed independently and in parallel across multiple nodes. This parallel processing capability
enables efficient utilization of cluster resources and accelerates data processing tasks.

Ease of Programming: MapReduce abstracts away the complexities of distributed computing,
allowing developers to focus on writing simple map and reduce functions to process data. The
framework handles data distribution, task scheduling, and fault tolerance transparently, making it
easier to develop and debug distributed applications.

Data Locality: MapReduce leverages data locality to minimize data movement across the cluster. By
processing data where it resides, MapReduce reduces network overhead and improves overall
performance. This locality-aware processing is crucial for optimizing performance in distributed
environments.

Flexibility: While MapReduce's primary programming model involves the map and reduce phases, it
can be extended and customized to support various data processing tasks. Developers can define
custom input/output formats, partitioning strategies, and combiner functions to tailor MapReduce
jobs to specific requirements.
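
As a brief illustration of these customization points, the snippet below is a sketch that reuses the hypothetical WordPartitioner and the word-count classes from the earlier examples, swapping in a different input format, partitioning strategy, and combiner on a Job before it is submitted.

import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat;

public class JobCustomization {
  // Illustrative customization hooks applied to an existing Job.
  static void customize(Job job) {
    job.setInputFormatClass(KeyValueTextInputFormat.class); // custom input format
    job.setPartitionerClass(WordPartitioner.class);         // custom partitioning strategy
    job.setCombinerClass(WordCount.IntSumReducer.class);    // combiner to shrink map output
    job.setNumReduceTasks(4);                               // tune reducer parallelism
  }
}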
THANK YOU FOR LISTENING!
