Big Data Analytics Syllabus

This document outlines the course objectives, outcomes, skills, and activities for a course on Big Data Analytics. The course aims to provide an overview of big data storage, retrieval, and processing technologies. Students will learn to use frameworks like Hadoop, Hive, and Spark to efficiently store, process, and analyze big data. They will develop MapReduce applications and learn to solve data intensive problems using Pig and Spark. Upon completing the course, students will be able to build scalable distributed systems with Hadoop, write MapReduce applications, and design applications using Hive, Pig and Spark for big data use cases.

16IT445 BIG DATA ANALYTICS

Course Description and Objectives:


This course gives an overview of Big Data, i.e., the storage, retrieval and processing of big data.
The focus will be on the “technologies”, i.e., the tools/algorithms that are available for
the storage and processing of Big Data, and on a variety of “analytics”.
Course Outcomes:
1. Understand Big Data and its analytics in the real world.
2. Use Big Data frameworks such as Hadoop and NoSQL to efficiently store and
process Big Data and generate analytics.
3. Design algorithms to solve data-intensive problems using the MapReduce paradigm.
4. Design and implement Big Data analytics using Pig and Spark to solve
data-intensive problems and generate analytics.
5. Analyse Big Data using Hive.

The student will be able to:

• Understand the theoretical issues involved in Big Data system design, such as the
curse of dimensionality.
• Become familiar with the major approaches in Big Data Analytics.

Skills:
Upon completion of this course, students will be able to:
o Build and maintain reliable, scalable, distributed systems with
Apache Hadoop.
o Write MapReduce-based applications.
o Design and build Big Data applications using Hive and Pig.
o Apply tips and tricks for Big Data use cases and solutions.
Activities:
• Install Hadoop and develop applications on Hadoop
• Develop MapReduce applications
• Develop applications using Hive/Pig/Spark
Unit-I

Introduction to big data: Data, Characteristics of data and Types of digital data, Sources
of data, Working with unstructured data, Evolution and Definition of big data,
Characteristics and Need of big data, Challenges of big data

Big data analytics: Overview of business intelligence, Data science and Analytics, Meaning
and Characteristics of big data analytics, Need of big data analytics, Classification of
analytics, Challenges to big data analytics, Importance of big data analytics, Basic
terminologies in big data environment

Unit-II
Introduction to Hadoop: Introducing Hadoop, Need of Hadoop, Limitations of RDBMS,
RDBMS versus Hadoop, Distributed Computing Challenges, History of Hadoop, Hadoop
Overview, Use Case of Hadoop, Hadoop Distributors, HDFS (Hadoop Distributed File
System), Processing Data with Hadoop, Managing Resources and Applications with Hadoop
YARN (Yet Another Resource Negotiator), Interacting with Hadoop Ecosystem.

Unit-III

Introduction to MapReduce Programming: Introduction, Mapper, Reducer, Combiner,
Partitioner, Searching, Sorting, Compression, Real-time applications using MapReduce.
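
Illustrative sketch (not part of the prescribed syllabus): the roles of the Mapper, Combiner, Partitioner and Reducer listed above can be pictured with a small word-count job simulated in plain Python. This is only an in-memory model under stated assumptions; it does not use the Hadoop API, and the function names (mapper, combiner, partitioner, reducer, run_job) and the byte-sum hash are hypothetical choices made for illustration.

```python
from collections import defaultdict


def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def combiner(pairs):
    """Combine phase: pre-aggregate one mapper's output before the shuffle."""
    local = defaultdict(int)
    for key, value in pairs:
        local[key] += value
    return list(local.items())


def partitioner(key, num_reducers):
    """Pick the reducer for a key (assumed: a simple deterministic byte-sum hash)."""
    return sum(key.encode("utf-8")) % num_reducers


def reducer(key, values):
    """Reduce phase: sum all partial counts received for one key."""
    return key, sum(values)


def run_job(lines, num_reducers=2):
    """Simulate map -> combine -> partition/shuffle -> sort -> reduce in memory."""
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for line in lines:  # each line stands in for one input split
        for key, value in combiner(mapper(line)):
            partitions[partitioner(key, num_reducers)][key].append(value)

    result = {}
    for part in partitions:
        for key in sorted(part):  # keys reach each reducer in sorted order
            word, total = reducer(key, part[key])
            result[word] = total
    return result


counts = run_job(["big data big ideas", "data beats ideas"])
print(counts)
```

In this model the combiner reduces the number of pairs sent to the shuffle, and the partitioner guarantees that every occurrence of a given word reaches the same reducer, which is what makes the final per-word sum correct.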

Unit-IV

Introduction to Hive: Introduction to Hive, Hive Architecture, Hive Data Types, Hive File
Format, Hive Query Language (HQL), User-Defined Function (UDF) in Hive.

Introduction to Pig: Introduction to Pig, The Anatomy of Pig, Pig on Hadoop, Pig
Philosophy, Use Case for Pig: ETL Processing, Pig Latin Overview, Data Types in Pig,
Running Pig, Execution Modes of Pig, HDFS Commands, Relational Operators, Piggy
Bank, Word Count Example using Pig, Pig at Yahoo!, Pig versus Hive

Unit-V

Spark: Introduction to data analytics with Spark, Programming with RDDs, Working with
key/value pairs.

Text Books
1. Big Data Analytics, Seema Acharya, Subhashini Chellappan, Wiley.
2. Learning Spark: Lightning-Fast Big Data Analysis, Holden Karau, Andy Konwinski,
Patrick Wendell, Matei Zaharia, O'Reilly Media, Inc.
Reference Books:
1. Boris Lublinsky, Kevin T. Smith, Alexey Yakubovich, “Professional Hadoop
Solutions”, Wiley, ISBN: 9788126551071, 2015.
2. Chris Eaton, Dirk deRoos et al., “Understanding Big Data”, McGraw Hill, 2012.
3. Tom White, “Hadoop: The Definitive Guide”, O'Reilly, 2012.
4. Vignesh Prajapati, “Big Data Analytics with R and Hadoop”, Packt Publishing, 2013.
