Data Science Masters Program - Curriculum-Updated 2019
edureka!
Discover Learning
About Edureka
Edureka is a leading e-learning platform providing live instructor-led interactive online training.
We cater to professionals and students across the globe in categories like Big Data & Hadoop,
Business Analytics, NoSQL Databases, Java & Mobile Technologies, System Engineering, Project
Management and Programming.
We have an easy and affordable learning solution that is accessible to millions of learners. With
our students spread across countries like the US, India, UK, Canada, Singapore, Australia, Middle
East, Brazil and many others, we have built a community of over 1 million learners across the
globe.
Index
1 Python Statistics for Data Science Course
2 R Statistics for Data Science Course
3 Data Science Certification Training
4 Python Certification Training for Data Science
5 Apache Spark and Scala Certification Training
6 Deep Learning with TensorFlow 2.0 Certification Training
7 Tableau Training & Certification
8 Data Science Master Program Capstone Project
Python Statistics for Data Science Course
Module 1: Understanding the Data
Course Curriculum
At the end of this module, you should be able to understand various data types and variable
types, list the uses of variable types, explain population and sample, discuss sampling
techniques, and understand data representation.
Topics
Learning Objectives
At the end of this module, you should be able to understand the rules of probability, learn about
dependent and independent events, implement conditional, marginal, and joint probability
using Bayes' Theorem, discuss probability distributions, and explain the Central Limit Theorem.
Topics
• Uses of probability
• Need of probability
• Bayesian Inference
• Density Concepts
• Normal Distribution Curve
Hands-on/Demo
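The Bayesian Inference topic above can be illustrated with a small calculation. This is a minimal sketch, not course material: the function name `bayes_posterior` and the screening-test numbers are hypothetical and chosen only to show how a prior, a likelihood, and a false-positive rate combine into a posterior probability.

```python
def bayes_posterior(prior, likelihood, false_positive_rate):
    """Return P(H | E) from P(H), P(E | H), and P(E | not H)."""
    # Marginal probability of the evidence, P(E), via the law of total probability.
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    # Bayes' theorem: P(H | E) = P(E | H) * P(H) / P(E).
    return likelihood * prior / evidence


# Hypothetical numbers: 1% prevalence, 95% sensitivity, 5% false-positive rate.
posterior = bayes_posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(f"P(condition | positive test) = {posterior:.3f}")
```

Even with a fairly accurate test, the posterior stays well below 50% here because the prior is so small, which is the usual motivation for reasoning with conditional probabilities.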
At the end of this module, you should be able to understand the concept of point
estimation using confidence margin, demonstrate the use of level of confidence and
confidence margin, draw meaningful inferences using margin of error, and explore
hypothesis testing and its different levels.
Topics
• Point Estimation
• Confidence Margin
• Hypothesis Testing
• Levels of Hypothesis Testing
Hands-on/Demo
• Calculating and generalizing point estimates using R
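The relationship between a point estimate and its confidence margin can be sketched as follows. The demo above is listed as using R; Python is used here purely for illustration. The sample values and the helper name `confidence_interval` are hypothetical, and the sketch assumes a normal approximation (for a sample this small, a t-distribution would usually be more appropriate).

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def confidence_interval(sample, confidence=0.95):
    """Return (point_estimate, margin_of_error) for the mean of `sample`."""
    n = len(sample)
    point_estimate = mean(sample)
    # Two-sided critical value from the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    # Margin of error = critical value * standard error of the mean.
    margin = z * stdev(sample) / sqrt(n)
    return point_estimate, margin


# Hypothetical sample of heights in centimetres.
heights_cm = [170.0, 165.0, 180.0, 175.0, 160.0, 172.0, 168.0, 177.0]
estimate, margin = confidence_interval(heights_cm)
print(f"mean = {estimate:.2f} cm, 95% margin of error = {margin:.2f} cm")
```

Raising the confidence level widens the margin, which is the trade-off between certainty and precision that the module describes.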
Learning Objectives
At the end of this module, you should be able to understand parametric and non-parametric
testing, learn various types of parametric testing and explain A/B testing
Topics
• Parametric Test
• Parametric Test Types
• Non-Parametric Test
• A/B testing
Hands-on/Demo
• Perform P test and T tests in R
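As an illustration of a parametric test in an A/B setting, the sketch below computes a two-sample t statistic. It is written in Python rather than R, it uses the Welch (unequal-variance) form as an assumption, and it stops at the test statistic without computing a p-value. The function name and the two samples are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t_statistic(group_a, group_b):
    """Return the Welch two-sample t statistic for two independent samples."""
    # Standard error of the difference in means, using each group's sample variance.
    standard_error = sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error


# Hypothetical measurements for variant A and variant B.
variant_a = [1.0, 2.0, 3.0, 4.0, 5.0]
variant_b = [2.0, 3.0, 4.0, 5.0, 6.0]
t_value = welch_t_statistic(variant_a, variant_b)
print(f"t = {t_value:.2f}")
```

A statistic close to zero suggests the two group means are similar relative to their spread; deciding significance still requires comparing it against a t-distribution with the appropriate degrees of freedom.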
Learning Objectives
At the end of this module, you should be able to understand the concept of association and
dependence, explain causation and correlation, learn the concept of covariance, discuss
Simpson’s paradox, and illustrate clustering techniques.
Topics
Hands-on/Demo
Learning Objectives
At the end of this module, you should be able to understand the concept of Linear Regression,
explain Logistic Regression, implement WOE, differentiate between heteroscedasticity and
homoscedasticity, and learn the concept of residual analysis.
Topics
Hands-on/Demo
Course Curriculum
Learning Objectives
Get an introduction to Data Science in this module and see how Data Science helps to analyze
large and unstructured data with different tools.
Topics
Learning Objectives
In this module, you will learn about different statistical techniques and terminologies used in data
analysis.
Topics
Learning Objectives
Discuss the different sources available to extract data, arrange the data in structured form,
analyze the data, and represent the data in a graphical format.
Topics
Hands-on/Demo
Get an introduction to Machine Learning as part of this module. You will discuss the various
categories of Machine Learning and implement Supervised Learning Algorithms.
Topics
Learning Objectives
In this module, you should learn the Supervised Learning Techniques and the implementation of
various techniques, such as Decision Trees, Random Forest Classifier, etc.
Topics
Hands-on/Demo
Learning Objectives
Learn about Unsupervised Learning and the various types of clustering that can be used to
analyze the data.
Topics
Hands-on/Demo
Learning Objectives
In this module, you should learn about association rules and different types of Recommender
Engines.
Topics
Hands-on/Demo
Learning Objectives
Discuss Unsupervised Machine Learning Techniques and the implementation of different
algorithms, for example, TF-IDF and Cosine Similarity in this module.
Topics
• The concepts of text-mining
• Use cases
• Text Mining Algorithms
• Quantifying text
• TF-IDF
• Beyond TF-IDF
Hands-on/Demo
• Implementing Bag of Words approach in R
• Implementing Sentiment Analysis on Twitter Data using R
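The TF-IDF and Cosine Similarity ideas named in this module can be sketched in a few lines. The demos above are listed in R; Python is used here only for illustration. This is a simplified variant (whitespace tokenization, raw term frequency, unsmoothed logarithmic IDF), and the documents and function names are hypothetical.

```python
import math
from collections import Counter


def tf_idf_vectors(documents):
    """Return (vocabulary, one TF-IDF vector per document)."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append([
            (counts[term] / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term in vocabulary
        ])
    return vocabulary, vectors


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical mini-corpus.
docs = [
    "the service was great",
    "the service was terrible",
    "great food and great coffee",
]
vocab, vectors = tf_idf_vectors(docs)
print(cosine_similarity(vectors[0], vectors[1]))
print(cosine_similarity(vectors[1], vectors[2]))
```

Documents that share no terms get a similarity of zero, while documents that share several weighted terms score higher.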
Learning Objectives
In this module, you should learn about Time Series data, the different components of Time Series
data, and Time Series modeling with Exponential Smoothing models and the ARIMA model for Time
Series Forecasting.
Topics
• What is Time Series data?
• Time Series variables
• Different components of Time Series data
• Visualize the data to identify Time Series Components
• Implement ARIMA model for forecasting
• Exponential smoothing models
Hands-on/Demo
• Visualizing and formatting Time Series data
• Plotting decomposed Time Series data plot
• Applying ARIMA and ETS model for Time Series Forecasting
• Forecasting for given Time period
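Of the exponential smoothing models mentioned above, the simplest one can be sketched directly. This is a minimal illustration in Python, assuming simple (single) exponential smoothing with a hypothetical smoothing factor `alpha`; it is not the ETS or ARIMA implementation used in the demos.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed series; the last value is the one-step-ahead forecast."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in the interval (0, 1]")
    smoothed = [series[0]]  # initialise with the first observation
    for value in series[1:]:
        # New level = weighted average of the latest observation and the previous level.
        smoothed.append(alpha * value + (1.0 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical observations.
observations = [10, 12, 14, 16]
levels = simple_exponential_smoothing(observations, alpha=0.5)
print(levels)
```

A larger `alpha` makes the smoothed level track recent observations more closely, while a smaller one produces a smoother, slower-reacting series.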
Learning Objectives
Get introduced to the concepts of Reinforcement Learning and Deep Learning in this module.
These concepts are explained with the help of use cases. You will discuss Artificial Neural
Networks, the building blocks of Artificial Neural Networks, and a few Artificial Neural Network
terms.
Topics
• Reinforcement Learning
• Reinforcement Learning Process Flow
• Reinforcement Learning Use Cases
• Deep Learning
• Biological Neural Networks
• Understand Artificial Neural Networks
• Building an Artificial Neural Network
• How ANN works
• Important Terminologies of ANNs
You will get a brief idea of what Python is and touch on the basics.
Topics
• Overview of Python
• The Companies using Python
• Different Applications where Python is used
• Discuss Python Scripts on UNIX/Windows
• Values, Types, Variables
• Operands and Expressions
• Conditional Statements
• Loops
• Command Line Arguments
• Writing to the screen
Hands-on/Demo
Learning Objectives
Learn different types of sequence structures, related operations, and their usage. Also learn
diverse ways of opening, reading, and writing to files.
Topics
In this module, you will learn how to create generic Python scripts, how to address
errors/exceptions in code, and finally how to extract/filter content using regex.
Topics
• Functions
• Function Parameters
• Global Variables
• Variable Scope and Returning Values
• Lambda Functions
• Object-Oriented Concepts
• Standard Libraries
• The Import Statements
• Module Search Path
• Package Installation Ways
Hands-on/Demo
• Functions - Syntax, Arguments, Keyword Arguments, Return Values
• Lambda - Features, Syntax, Options, Compared with the Functions
• Sorting - Sequences, Dictionaries, Limitations of Sorting
• Errors and Exceptions - Types of Issues, Remediation
• Packages and Module - Modules, Import Options, sys Path
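Several of the items above (functions with default arguments, lambda-based sorting, exception handling, and regex filtering) fit into one short script. This is an illustrative sketch only; the record format and the names `parse_scores` and `safe_ratio` are hypothetical.

```python
import re


def parse_scores(lines):
    """Extract (name, score) pairs from lines such as 'alice: 82'."""
    pattern = re.compile(r"^\s*(\w+)\s*:\s*(\d+)\s*$")
    records = []
    for line in lines:
        match = pattern.match(line)
        if match is None:
            # Skip lines that do not fit the expected 'name: number' format.
            continue
        records.append((match.group(1), int(match.group(2))))
    return records


def safe_ratio(numerator, denominator, default=0.0):
    """Divide two numbers, returning `default` instead of raising on zero."""
    try:
        return numerator / denominator
    except ZeroDivisionError:
        return default


raw = ["alice: 82", "bob: 95", "not a record", "carol: 77"]
records = parse_scores(raw)
# Sort by score (second tuple element), highest first, using a lambda as the key.
ranked = sorted(records, key=lambda record: record[1], reverse=True)
print(ranked)
print(safe_ratio(sum(score for _, score in records), len(records)))
```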
Learning Objectives
This Module helps you get familiar with basics of statistics, different types of measures and
probability distributions, and the supporting libraries in Python that assist in these
operations. Also, you will learn in detail about data visualization.
Topics
• NumPy - arrays
• Operations on arrays
• Indexing, slicing, and iterating
• Reading and writing arrays on files
• Pandas - data structures & index operations
• Reading and Writing data from Excel/CSV formats into Pandas
• matplotlib library
• Grids, axes, plots
• Markers, colors, fonts and styling
• Types of plots - bar graphs, pie charts, histograms
• Contour plots
Hands-on/Demo
• NumPy library- Creating NumPy array, operations performed on NumPy array
• Pandas library- Creating series and dataframes, Importing and exporting data
• Matplotlib - Using scatterplot, histogram, bar graph, and pie chart to show information, Styling of Plot
Learning Objectives
Through this module, you will learn about data manipulation in detail.
Topics
• Basic Functionalities of a data object
• Merging of Data objects
• Concatenation of data objects
• Types of Joins on data objects
• Exploring a Dataset
• Analyzing a dataset
Hands-on/Demo
• Pandas functions and attributes: ndim, axes, values, head(), tail(), sum(), std(), iteritems(),
iterrows(), itertuples()
• GroupBy operations
• Aggregation
• Concatenation
• Merging
• Joining
Learning Objectives
In this module, you will learn the concept of Machine Learning and its types.
Topics
• Python Revision (NumPy, Pandas, scikit-learn, matplotlib)
Hands-on/Demo
• Machine Learning Process Flow
• Machine Learning Categories
• Linear regression
• Gradient descent
• Linear Regression – Boston Dataset
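Linear regression fitted by gradient descent can be sketched without any libraries. The example below is an assumption-laden illustration: it uses a tiny synthetic dataset generated from y = 2x + 1 rather than the Boston dataset named above, and the learning rate and epoch count are hypothetical values that happen to converge for this data.

```python
def fit_line(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = slope * x + intercept by batch gradient descent on mean squared error."""
    slope, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Residuals of the current line on every training point.
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error.
        grad_slope = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_intercept = 2.0 / n * sum(errors)
        # Step against the gradient.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Synthetic data generated from y = 2x + 1 (no noise).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
print(f"slope = {slope:.4f}, intercept = {intercept:.4f}")
```

If the learning rate is set too high the updates diverge, and if it is too low many more epochs are needed, which is the practical trade-off behind the Gradient descent topic.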
Topics
• What is Classification and what are its use cases?
• What is Decision Tree?
• Algorithm for Decision Tree Induction
• Creating a Perfect Decision Tree
• Confusion Matrix
• What is Random Forest?
Hands-on/Demo
• Implementation of Logistic regression
• Decision tree
• Random forest
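The Confusion Matrix topic above is easiest to see with concrete counts. This sketch assumes binary labels with 1 as the positive class; the label lists and function names are hypothetical and independent of any particular classifier.

```python
def confusion_matrix(actual, predicted):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Derive precision, recall, and F1 score from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical ground truth and classifier output.
actual = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 1, 0, 1, 0]
counts = confusion_matrix(actual, predicted)
precision, recall, f1 = precision_recall_f1(*counts[:3])
print(counts, precision, recall, f1)
```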
Learning Objectives
In this module, you will learn about the impact of dimensions within data. You will be taught to
perform factor analysis using PCA and compress dimensions. Also, you will be developing LDA
model.
Topics
• Introduction to Dimensionality
• Why Dimensionality Reduction
• PCA
• Factor Analysis
• Scaling dimensional model
• LDA
Hands-on/Demo
• PCA
• Scaling
Learning Objectives
In this module, you will learn Supervised Learning Techniques and their implementation, for
example, Decision Trees, Random Forest Classifier etc.
Topics
• What is Naïve Bayes?
• How does Naïve Bayes work?
• Implementing Naïve Bayes Classifier
• What is Support Vector Machine?
• How does Support Vector Machine work?
• Hyperparameter Optimization
• Grid Search vs Random Search
In this module, you will learn about Unsupervised Learning and the various types of clustering
that can be used to analyze the data.
Topics
• What is Clustering & its Use Cases?
• What is K-means Clustering?
• How does K-means algorithm work?
• How to do optimal clustering
• What is C-means Clustering?
• What is Hierarchical Clustering?
• How does Hierarchical Clustering work?
Hands-on/Demo
• Implementing K-means Clustering
• Implementing Hierarchical Clustering
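The K-means loop (assign each point to its nearest centroid, then move each centroid to the mean of its points) can be sketched in plain Python. This is a simplified illustration: it initialises centroids with the first k points instead of a random or k-means++ start, runs a fixed number of iterations, and uses hypothetical 2-D points.

```python
def k_means(points, k, iterations=10):
    """Return (centroids, clusters) after a fixed number of K-means iterations."""
    # Deterministic initialisation (an assumption of this sketch): the first k points.
    centroids = [points[i] for i in range(k)]
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for point in points:
            distances = [
                sum((p - c) ** 2 for p, c in zip(point, centroid))
                for centroid in centroids
            ]
            clusters[distances.index(min(distances))].append(point)
        # Update step: move each centroid to the mean of its assigned points.
        for i, cluster in enumerate(clusters):
            if cluster:
                centroids[i] = tuple(
                    sum(coords) / len(cluster) for coords in zip(*cluster)
                )
    return centroids, clusters


# Hypothetical points forming two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = k_means(points, k=2)
print(centroids)
```

Because the result depends on the starting centroids, practical implementations usually repeat the procedure from several initialisations and keep the best run.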
In this module, you will learn Association rules and their extension towards recommendation
engines with Apriori algorithm.
Topics
• What are Association Rules?
• Association Rule Parameters
• Calculating Association Rule Parameters
• Recommendation Engines
• How do Recommendation Engines work?
• Collaborative Filtering
• Content-Based Filtering
Hands-on/Demo
• Apriori Algorithm
• Market Basket Analysis
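The association rule parameters listed above are commonly taken to be support, confidence, and lift, and they can be computed directly from a list of transactions. This sketch computes those three quantities only; it is not the full Apriori algorithm, and the baskets and function names are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t)) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


def lift(transactions, antecedent, consequent):
    """Confidence of the rule relative to the baseline support of the consequent."""
    return confidence(transactions, antecedent, consequent) / support(
        transactions, consequent
    )


# Hypothetical market baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]
print(support(baskets, {"bread", "milk"}))
print(confidence(baskets, {"bread"}, {"milk"}))
print(lift(baskets, {"bread"}, {"milk"}))
```

A lift below 1, as in this toy data, indicates that buying bread makes milk slightly less likely than its overall frequency would suggest.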
In this module, you will learn about developing a smart learning algorithm such that the learning
becomes more and more accurate as time passes by. You will be able to define an optimal
solution for an agent based on agent-environment interaction.
Topics
• What is Reinforcement Learning
• Why Reinforcement Learning
• Elements of Reinforcement Learning
• Exploration vs Exploitation dilemma
• Epsilon Greedy Algorithm
• Markov Decision Process (MDP)
• Q values and V values
• Q-Learning
• α values
Hands-on/Demo
• Calculating Reward
• Discounted Reward
• Calculating Optimal quantities
• Implementing Q Learning
• Setting up an Optimal Action
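The discounted reward, the epsilon-greedy choice, and the Q-Learning update can each be written as a small function. This is a minimal sketch under stated assumptions: the Q-table is a nested dictionary, `alpha` is read as the learning rate and `gamma` as the discount factor, and the states, actions, and reward values are hypothetical.

```python
import random


def discounted_return(rewards, gamma):
    """Sum of rewards where the reward at step t is weighted by gamma ** t."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


def epsilon_greedy(q_values, epsilon, rng=random):
    """Explore with probability epsilon, otherwise pick the highest-valued action."""
    if rng.random() < epsilon:
        return rng.choice(list(q_values))
    return max(q_values, key=q_values.get)


def q_update(q_table, state, action, reward, next_state, alpha, gamma):
    """Apply one Q-Learning update and return the new Q(state, action)."""
    best_next = max(q_table[next_state].values())
    td_target = reward + gamma * best_next
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table[state][action]


# Hypothetical two-state, two-action table.
q_table = {
    "s0": {"left": 0.0, "right": 0.0},
    "s1": {"left": 0.0, "right": 2.0},
}
new_value = q_update(q_table, "s0", "right", reward=1.0, next_state="s1",
                     alpha=0.5, gamma=0.9)
print(new_value)
print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))
```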
In this module, you will learn about Time Series Analysis to forecast dependent variables
based on time. You will be taught different models for time series modeling so that you
can analyze real time-dependent data for forecasting.
Topics
• What is Time Series Analysis?
• Importance of TSA
• Components of TSA
• White Noise
• AR model
• MA model
• ARMA model
• ARIMA model
• Stationarity
• ACF & PACF
Hands-on/Demo
• Checking Stationarity
• Converting non-stationary data to stationary
• Implementing Dickey-Fuller Test
• Plot ACF and PACF
• Generating the ARIMA plot
• TSA Forecasting
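Two of the steps above, differencing a series to remove trend and computing autocorrelation values for an ACF plot, can be sketched without libraries. This illustration does not implement the Dickey-Fuller test or an ARIMA fit; the series values and function names are hypothetical, and the autocorrelation uses the common biased estimator as an assumption.

```python
def difference(series, lag=1):
    """Return the lag-differenced series, a common way to remove a trend."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def autocorrelation(series, lag):
    """Sample autocorrelation at the given lag (undefined for a constant series)."""
    n = len(series)
    mean = sum(series) / n
    denominator = sum((x - mean) ** 2 for x in series)
    numerator = sum(
        (series[i] - mean) * (series[i - lag] - mean) for i in range(lag, n)
    )
    return numerator / denominator


# Hypothetical series with an upward trend and an alternating pattern.
series = [3.0, 5.0, 4.0, 6.0, 5.0, 7.0, 6.0, 8.0]
print(difference(series))
print([round(autocorrelation(series, lag), 3) for lag in range(3)])
```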
In this module, you will learn about selecting one model over another. Also, you will learn about
Boosting and its importance in Machine Learning. You will learn how to convert weaker
algorithms into stronger ones.
Topics
• Cross-Validation
• AdaBoost
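Cross-Validation rests on splitting the data into folds so that every sample is used for testing exactly once. The sketch below shows only that index bookkeeping, assuming unshuffled, contiguous folds; the function name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Return a list of (train_indices, test_indices) pairs for k-fold splitting."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        folds.append((train_idx, test_idx))
        start += size
    return folds


folds = k_fold_indices(n_samples=10, k=3)
for train_idx, test_idx in folds:
    print(test_idx)
```

A model would be trained on each `train_idx` and scored on the matching `test_idx`, and the k scores averaged to compare candidate models.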
Learning Objectives
Understand Big Data and its components such as HDFS. You will learn about the Hadoop Cluster
Architecture, get an introduction to Spark, and understand the difference between batch
processing and real-time processing.
Topics
Learning Objectives
Learn the basics of Scala that are required for programming Spark applications. You will also learn
about the basic constructs of Scala such as variable types, control structures, collections such as
Array, ArrayBuffer, Map, Lists, and many more.
Topics
• What is Scala?
• Scala in other Frameworks
• Basic Scala Operations
• Control Structures in Scala
• Collections in Scala- Array
• Why Scala for Spark?
• Introduction to Scala REPL
• Variable Types in Scala
• Foreach loop, Functions and Procedures
• ArrayBuffer, Map, Tuples, Lists, and more
Hands-On
Learning Objectives
In this module, you will learn about object-oriented programming and functional programming
techniques in Scala.
Topics
• Functional Programming
• Anonymous Functions
• Getters and Setters
• Properties with only Getters
• Singletons
• Overriding Methods
• Higher Order Functions
• Class in Scala
• Custom Getters and Setters
• Auxiliary Constructor and Primary Constructor
• Extending a Class
• Traits as Interfaces and Layered Traits
Hands On
• OOPs Concepts
• Functional Programming
Learning Objectives
Understand Apache Spark and learn how to develop Spark applications. At the end, you will learn
how to perform data ingestion using Sqoop.
Topics
Hands On
Learning Objectives
Gain insight into Spark RDDs and RDD-related manipulations for implementing
business logic (Transformations, Actions, and Functions performed on RDDs).
Topics
Hands On/Demo
Learning Objectives
In this module, you will learn about Spark SQL, which is used to process structured data with SQL
queries, along with DataFrames and Datasets in Spark SQL and the different kinds of SQL operations
performed on DataFrames. You will also learn about Spark and Hive integration.
Topics
Hands On/Demo
Learning Objectives
Learn why machine learning is needed, different Machine Learning techniques/algorithms, and
Spark MLlib.
Topics
Learning Objectives
Implement various algorithms supported by MLlib such as Linear Regression, Decision Tree,
Random Forest and many more.
Topics
Hands-On
• Random Forest
Learning Objectives
Understand Kafka and its architecture. Also, learn about Kafka Clusters and how to configure
different types of Kafka Cluster. Get introduced to Apache Flume, its architecture, and how it is
integrated with Apache Kafka for event processing. At the end, learn how to ingest streaming data
using Flume.
Topics
Hands-On
Learning Objectives
Work on Spark Streaming, which is used to build scalable, fault-tolerant streaming applications.
Also, learn about DStreams and various Transformations performed on the streaming data.
You will get to know about commonly used streaming operators such as Sliding Window
Operators and Stateful Operators.
Topics
Learning Objectives
In this module, you will learn about the different streaming data sources such as Kafka and Flume.
At the end of the module, you will be able to create a Spark Streaming application.
Topics
Hands-On
Learning Objectives
Work on an end-to-end Financial domain project covering all the major concepts of Spark taught
during the course.
Learning Objectives
In this module, you will be learning the key concepts of Spark GraphX programming and
operations along with different GraphX algorithms and their implementations.
Learning Objectives
At the end of this module, you will be able to understand the concepts of Deep Learning and
learn how it differs from machine learning. This module will also brief you on implementing
a single-layer perceptron.
Topics
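A single-layer perceptron, as mentioned in the objectives above, can be sketched in plain Python. This is an illustration under stated assumptions: a step activation, the classic perceptron learning rule, zero-initialised weights, and the logical AND function as hypothetical training data. It does not use TensorFlow.

```python
def train_perceptron(samples, labels, learning_rate=1.0, epochs=20):
    """Train a single-layer perceptron with a step activation; return (weights, bias)."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            prediction = 1 if activation > 0 else 0
            error = target - prediction
            # Perceptron rule: nudge the weights toward the correct side when wrong.
            weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias


def predict(weights, bias, x):
    """Classify one input vector with the trained perceptron."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


# Logical AND: linearly separable, so the perceptron rule converges.
samples = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
labels = [0, 0, 0, 1]
weights, bias = train_perceptron(samples, labels)
print([predict(weights, bias, x) for x in samples])
```

A single layer can only separate classes with a straight line, which is why problems such as XOR motivate the multi-layer networks covered later.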
Learning Objectives
At the end of this module, you should be familiar with TensorFlow 2.x. You will install and
validate TensorFlow 2.x by building a simple neural network to predict handwritten digits, and
then use a Multi-Layer Perceptron to improve the accuracy of the model.
Topics
At the end of this module, you will be able to understand how and why CNN came into
existence after MLP and learn about Convolutional Neural Networks (CNN) by exploring the
theory behind how a CNN is used to predict ‘X’ or ‘O’. You will also use the VGG-16 CNN with
TensorFlow 2 to predict whether a given image is of a ‘cat’ or a ‘dog’, and save and load a
model’s weights.
Topics
• Data Flattening
• Fully Connected Layer
• Predicting a cat or a dog
• Saving and Loading a Model
• Face Detection using OpenCV
At the end of this module, you will be able to understand the concept and working of RCNN and
figure out the reason why it was developed in the first place. The module will cover various
important topics like Transfer Learning, RCNN, Fast RCNN, RoI Pooling, Faster RCNN, and Mask
RCNN.
Topics
• Region-based CNN (R-CNN)
• Selective Search Algorithm
• Bounding Box Regression
• SVM in RCNN
• Pre-trained Model
• Model Accuracy
• Model Inference Time
• Model Size Comparison
• Transfer Learning
• Object Detection – Evaluation
• mAP
• IoU
• RCNN – Speed Bottleneck
• Fast R-CNN
• RoI Pooling
• Fast R-CNN – Speed Bottleneck
• Faster R-CNN
• Feature Pyramid Network (FPN)
• Region Proposal Network (RPN)
• Mask R-CNN
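Among the evaluation topics above, IoU (Intersection over Union) is simple enough to compute by hand. The sketch below assumes axis-aligned boxes given as (x_min, y_min, x_max, y_max); that coordinate convention and the function name are assumptions of this example.

```python
def intersection_over_union(box_a, box_b):
    """Return the IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the overlapping rectangle, if there is one.
    x_left = max(box_a[0], box_b[0])
    y_top = max(box_a[1], box_b[1])
    x_right = min(box_a[2], box_b[2])
    y_bottom = min(box_a[3], box_b[3])
    # Clamp at zero so non-overlapping boxes yield an empty intersection.
    intersection = max(0.0, x_right - x_left) * max(0.0, y_bottom - y_top)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


# Hypothetical predicted and ground-truth boxes.
predicted_box = (0.0, 0.0, 2.0, 2.0)
ground_truth_box = (1.0, 1.0, 3.0, 3.0)
print(intersection_over_union(predicted_box, ground_truth_box))
```

Detection benchmarks typically count a prediction as correct when its IoU with a ground-truth box exceeds a chosen threshold, and mAP is then computed from those matches.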
Learning Objectives
At the end of this module, you should be able to understand what a Boltzmann Machine is and
how it is implemented. You will also learn what an Autoencoder is, what its various types are,
and how it works.
Topics
Learning Objectives
At the end of this module, you should be able to understand what a generative adversarial model
is and how it works by implementing a Generative Adversarial Network step by step.
Topics
After completing this module, you should be able to distinguish between a Feed Forward
Network and a Recurrent Neural Network (RNN) and understand how an RNN works. You will also
learn about GRU and finally implement Sentiment Analysis using RNN and GRU.
Topics
Module 9: LSTM
Learning Objectives
After completing this module, you should be able to understand the architecture of LSTM and
the importance of gates in LSTM. You will also be able to differentiate between the types of
sequence based models and finally increase the efficiency of the model using BPTT.
Topics
Course Curriculum
Topics:
• Data Visualization
• Business Intelligence tools
• Introduction to Tableau
• Tableau Architecture
• Tableau Server Architecture
• VizQL
• Introduction to Tableau Prep
• Tableau Prep Builder User Interface
• Data Preparation techniques using Tableau Prep Builder tool
Hands-On:
Topics:
Learning Objective: Understand the importance of Visual Analytics and explore the various
charts, features and techniques used for Visualization.
Topics:
• Visual Analytics
• Basic Charts: Bar Chart, Line Chart, and Pie Chart
• Hierarchies
• Data Granularity
• Highlighting
• Sorting
• Filtering
• Grouping
• Sets
Hands-On:
Topics:
• Types of Calculations
• Built-in Functions (Number, String, Date, Logical and Aggregate)
• Operators and Syntax Conventions
• Table Calculations
• Level Of Detail (LOD) Calculations
• Using R within Tableau for Calculations
Hands-On:
Topics:
• Parameters
• Tooltips
• Trend lines
• Reference lines
• Forecasting
• Clustering
Hands-On:
• Perform Data Visualization using Trend lines, Forecasting and Clustering feature in
Tableau
• In-class Project 1- Domain: Media & Entertainment Industry
Learning Objective: Deep dive into advanced analytical scenarios, using Level Of Detail
expressions.
Topics:
• Introduction to Dashboards
• The Dashboard Interface
• Dashboard Objects
• Building a Dashboard
• Dashboard Layouts and Formatting
• Interactive Dashboards with actions
• Designing Dashboards for devices
• Story Points
Hands-On:
Learning Objective: Learn effective ways of designing Dashboards with minimum time
investment.
Topics:
Learning Objective: Learn to publish data, interact, modify, and secure the published data on
Tableau Online.
Topics:
In-class Project
Learning Objective: Learn to create Tableau reports for various industrial scenarios and publish
them on Tableau Online. Learn to manage permissions and secure data using filters.
Project Statement:
You have been recruited as a freelancer for a Retail store that supplies Furniture, Office Supplies
and Technology products to customers across Europe. You have been asked to create interactive
dashboards which can be used to gain insights into the profits for orders over the years.
Learning Objectives:
The capstone project will provide you with a business case. You will need to solve it by applying
all the skills you’ve learned in the courses of the master’s program. This capstone project will
require you to apply the following skills:
Data Exploration
Data Wrangling
Data Exploration
• Data Visualization
Machine Learning
• PCA
• Logistic Regression
• Generating F1 Score Metric
• Linear SVC Classifier
• XGBoost Classifier
• AdaBoost Classifier
• MLP Classifier
• MLP Classifier with Cross Validation