Big Data Analytics Syllabus
Skills:
Upon completion of this course, students will be able to do the following:
o Students will be able to build and maintain reliable, scalable, distributed systems with
Apache Hadoop.
o Students will be able to write MapReduce-based applications.
o Students will be able to design and build big data applications using Hive and Pig.
o Students will learn tips and tricks for big data use cases and solutions.
Activities:
Install Hadoop and develop applications on Hadoop
Develop MapReduce applications
Develop applications using Hive/Pig/Spark
Unit-I
Introduction to big data: Data, Characteristics of data and Types of digital data, Sources
of data, Working with unstructured data, Evolution and Definition of big data,
Characteristics and Need of big data, Challenges of big data
Big data analytics: Overview of business intelligence, Data science and Analytics, Meaning
and Characteristics of big data analytics, Need of big data analytics, Classification of
analytics, Challenges to big data analytics, Importance of big data analytics, Basic
terminologies in big data environment
Unit-II
Introduction to Hadoop: Introducing Hadoop, Need of Hadoop, Limitations of RDBMS,
RDBMS versus Hadoop, Distributed Computing Challenges, History of Hadoop, Hadoop
Overview, Use Case of Hadoop, Hadoop Distributors, HDFS (Hadoop Distributed File
System), Processing Data with Hadoop, Managing Resources and Applications with Hadoop
YARN (Yet Another Resource Negotiator), Interacting with Hadoop Ecosystem.
Unit-III
Unit-IV
Introduction to Hive: Introduction to Hive, Hive Architecture, Hive Data Types, Hive File
Format, Hive Query Language (HQL), User-Defined Function (UDF) in Hive.
Introduction to Pig: Introduction to Pig, The Anatomy of Pig, Pig on Hadoop, Pig
Philosophy, Use Case for Pig: ETL Processing, Pig Latin Overview, Data Types in Pig,
Running Pig, Execution Modes of Pig, HDFS Commands, Relational Operators, Piggy
Bank, Word Count Example using Pig, Pig at Yahoo!, Pig versus Hive
Unit-V
Spark: Introduction to data analytics with Spark, Programming with RDDs, Working with
key/value pairs.
Text Books
1. Big Data Analytics, Seema Acharya, Subhashini Chellappan, Wiley
2. Learning Spark: Lightning-Fast Big Data Analysis, Holden Karau, Andy Konwinski,
Patrick Wendell, Matei Zaharia, O'Reilly Media, Inc.
Reference Books:
1. Boris Lublinsky, Kevin T. Smith, Alexey Yakubovich, “Professional Hadoop
Solutions”, Wiley, ISBN: 9788126551071, 2015.
2. Chris Eaton, Dirk deRoos et al., “Understanding Big Data”, McGraw Hill, 2012.
3. Tom White, “Hadoop: The Definitive Guide”, O'Reilly, 2012.
4. Vignesh Prajapati, “Big Data Analytics with R and Hadoop”, Packt Publishing, 2013.