Academy Program PDF

Uploaded by

Mike
learn data science by building

20 WORKSHOPS
620 PARTICIPANTS
20+ CORPORATE CLIENTS

WHY US?
Curriculum tailored to the needs of industries

Small cohort sizes allow more personal interaction with teaching assistants

Final projects are directly applicable to the industry

SPECIALIZATION DETAILS:
A fun, hands-on, and project-based specialization that helps students gain full proficiency in data visualization systems and tools. Create compelling narratives by combining charting elements with custom aesthetics under the guidance of our instructors.

The learn-by-building modules in all the workshops apply our project-based learning philosophy to this specialization. The course capstone requires that the student build a real-world application under stringent criteria modeled after real business scenarios.

Programming for Data Science

Programming for Data Science is a course that covers the important programming paradigms and tools used by data analysts and data scientists today. You will be guided through a series of coding exercises designed to maximize your familiarity with data science programming in RStudio, an integrated development environment for the statistical computing language R.

Upon completion of this workshop, you will be familiar with the programming language, popular tools, libraries (data science packages) and tool kits required to excel in your data analysis and statistical computing projects.

3-Day Workshop Modules

Workshop Module 1: Data Science Toolkit

Data Science in R
  R Programming Basics
  Data Structures in R
  RStudio Interface

Data Science in Python
  Introduction to Python
  Jupyter Notebook Interface
  Data Science Toolkit

Working with Data
  Understanding Statistics
  Reading & Extracting Data
  Exploratory Data Analysis

Workshop Module 2: Data Manipulation

Data Manipulation
  Getting Familiar with your Workspace
  R Scripts and Markdown
  Continuous and Categorical Variables

Data Manipulation II
  Vector Types
  List and Objects
  Matrix and Dataframes

Practical Data Cleansing
  The Data Transformation Process
  Reproducible Data Science Projects
  Reading and Writing from your IDE

Learn-by-Building Modules

Module 1: Retail Sales Pre-Diagnostic Cleanup
A programming script that reads data into our workspace, performs various data cleansing tasks, and saves the appropriate formats for data science work.

Module 2: Reproducible Data Science
Create an R Markdown file that combines data transformation code with explanatory text. Add formatting styles and hierarchical structure using Markdown.

Practical Statistics

Lay the statistical foundation for more advanced machine learning theories later on in the specialization by picking up the key ideas in statistical thinking. Learn to interpret correlations, construct confidence intervals, and apply other statistical principles that are the basis for regression analysis.

The 2-day course is optional for participants of the Data Science and Machine Learning Specialization and intended for learners without prior experience in statistics.

2-Day Workshop Modules

Workshop Module 1: Descriptive Statistics

5-Number Summary
  Mean, Median, Mode
  Understanding Quantiles
  Quantiles in R
  5-Number Summary in R

Central Tendency & Variability
  Probability Distribution Function
  Visualizing Central Tendency
  Understanding Variation
  Covariance and Variance

Standard Score & z-score
  Standard Normal Curve
  Central Limit Theorem
  z-score Calculation in R
  z-score and Student’s t-test

Workshop Module 2: Inferential Statistics

Probabilities
  Probability Mass Function
  Probability Density Function
  Expected Values

Intervals
  Confidence Intervals
  Prediction Intervals
  Hypothesis Testing
  p-values

Inferential Statistics in Practice
  Deriving Scientific Truths from Data
  Making Informed Decisions
  Case Studies

Learn-by-Building Modules

Module 1: Exploratory Data Analysis
Write a reproducible data analysis applying what you’ve learned in the workshop. The analysis should contain at least 3 statistical plots, and a summary paragraph that contains your early findings / points of interest from the given dataset.

Data Visualization in R

A fun, hands-on, and project-based workshop that helps students gain full proficiency in data visualization systems and tools. Create compelling narratives by combining charting elements with custom aesthetics under the guidance of our instructors.

Students are tasked to reproduce a series of plots applying what they’ve learned. While it covers the three main plotting systems in R, its particular focus is on ggplot2 and the additional libraries centered around it that bring interactivity and enhanced aesthetic options to the art of creating rich, powerful visualizations.

3-Day Workshop Modules

Workshop Module 1: Plotting Essentials

Built-in Plotting Functionalities
  Plots and Lines
  Built-in Plot Types
  Histograms and Curves
  Axis, Title, and Panel Styles

ggplot Plotting System
  Grammar of Graphics System
  Mapping Aesthetics
  Understanding Geometries
  Axis, Title, and Scales

Enhancing ggplot
  Adding Themes to ggplot
  Custom Aesthetics and Styles
  Multi-dimensional Faceting
  Text Layers and Custom Text

Workshop Module 2: Richer Visualization Techniques

Simple Interactivity
  Using Manipulate
  ggiraph
  HTML5 Widgets

Visualizing Geo-Spatial Data
  Dealing with Spatial Dataframes
  Using Leaflet
  Using tmap

Visualization Toolset
  Lattice Plotting System
  Using Plotly
  Prettier Pairs Matrix
  Prettier Heatmap

Learn-by-Building Modules

Module 1: Creating a Publication-Grade Plot
Applying what you’ve learned, create an economics- or social-related plot that is polished with the appropriate annotations, aesthetics and some simple commentary.

Module 2: Creating an Interactive Map
Applying what you’ve learned, create a web page with an interactive map embedded on it. Use a custom icon for the map markers to represent business locations, and show details about each location pin (“marker”) upon the user’s interaction with it.

Interactive Plotting & Web Dashboard

Building on the foundation from previous classes, we will create a series of interactive plots and gadgets that render multiple visualization elements based on the user’s input. This is the final workshop leading up to the data visualization capstone project.

The 3-day course follows our learn-by-building approach, in that students are tasked to reproduce a series of plots applying what they’ve learned. It covers an exhaustive list of techniques that add interactivity to an R document and set the stage for the data science capstone project.

3-Day Workshop Modules

Workshop Module 1: Deep Dive Shiny

Shiny Essentials
  Interactive Documents
  Working with Gadgets
  Working with miniUI
  Interactive Documents in Action

Standalone App
  Shiny App Formats
  UI Components
  Server Components
  App Deployment Solutions

Server Logic
  Data Binding
  Reactivity
  Performance Consideration
  Deployment

Workshop Module 2: Building Dynamic Dashboards

Flexdashboard
  Layouts and Templates
  Storyboard
  Adding Custom Styles

Shiny Dashboard
  Dashboard Structure
  Adding Custom Styles
  Working with Twitter Bootstrap

Building a Dynamic Dashboard
  Working with Live Data
  App Deployment Solutions
  d3, Leaflet, and Google Visualization

Learn-by-Building Modules

Module 1: Building an Interactive Dashboard
Applying what you’ve learned, create a paginated web dashboard with a rich set of UI elements coupled with the appropriate server logic. The web dashboard can be of any theme, using any dataset, but must feature an input panel that accepts end user inputs and renders the output accordingly.

SPECIALIZATION DETAILS:
An intensive specialization that strives for a fine balance between practical applications and mathematical rigor in teaching essential machine learning concepts. By taking a learn-by-building approach, you will learn to develop regression and classification algorithms and incorporate them into real-life solutions or data products / business applications.

The modules in all the workshops apply our project-based learning philosophy to this specialization. The course capstone requires that the student build a real-world application under stringent criteria modeled after real business scenarios.

Programming for Data Science

Programming for Data Science is a course that covers the important programming paradigms and tools used by data analysts and data scientists today. You will be guided through a series of coding exercises designed to maximize your familiarity with data science programming in RStudio, an integrated development environment for the statistical computing language R.

Upon completion of this workshop, you will be familiar with the programming language, popular tools, libraries (data science packages) and tool kits required to excel in your data analysis and statistical computing projects.

3-Day Workshop Modules

Workshop Module 1: Data Science Toolkit

Data Science in R
  R Programming Basics
  Data Structures in R
  RStudio Interface

Data Science in Python
  Introduction to Python
  Jupyter Notebook Interface
  Data Science Toolkit

Working with Data
  Understanding Statistics
  Reading & Extracting Data
  Exploratory Data Analysis

Workshop Module 2: Data Manipulation

Data Manipulation
  Getting Familiar with your Workspace
  R Scripts and Markdown
  Continuous and Categorical Variables

Data Manipulation II
  Vector Types
  List and Objects
  Matrix and Dataframes

Practical Data Cleansing
  The Data Transformation Process
  Reproducible Data Science Projects
  Reading and Writing from your IDE

Learn-by-Building Modules

Module 1: Retail Sales Pre-Diagnostic Cleanup
A programming script that reads data into our workspace, performs various data cleansing tasks, and saves the appropriate formats for data science work.

Module 2: Reproducible Data Science
Create an R Markdown file that combines data transformation code with explanatory text. Add formatting styles and hierarchical structure using Markdown.

Practical Statistics

Lay the statistical foundation for more advanced machine learning theories later on in the specialization by picking up the key ideas in statistical thinking. Learn to interpret correlations, construct confidence intervals, and apply other statistical principles that are the basis for regression analysis.

The 2-day course is optional for participants of the Data Science and Machine Learning Specialization and intended for learners without prior experience in statistics.

2-Day Workshop Modules

Workshop Module 1: Descriptive Statistics

5-Number Summary
  Mean, Median, Mode
  Understanding Quantiles
  Quantiles in R
  5-Number Summary in R

Central Tendency & Variability
  Probability Distribution Function
  Visualizing Central Tendency
  Understanding Variation
  Covariance and Variance

Standard Score & z-score
  Standard Normal Curve
  Central Limit Theorem
  z-score Calculation in R
  z-score and Student’s t-test

Workshop Module 2: Inferential Statistics

Probabilities
  Probability Mass Function
  Probability Density Function
  Expected Values

Intervals
  Confidence Intervals
  Prediction Intervals
  Hypothesis Testing
  p-values

Inferential Statistics in Practice
  Deriving Scientific Truths from Data
  Making Informed Decisions
  Case Studies

Learn-by-Building Modules

Module 1: Exploratory Data Analysis
Write a reproducible data analysis applying what you’ve learned in the workshop. The analysis should contain at least 3 statistical plots, and a summary paragraph that contains your early findings / points of interest from the given dataset.

Regression Models

This course strives for a fine balance between business applications and mathematical rigor in its treatment of regression models, one of the most essential statistical techniques in the field of machine learning. Its aim is to equip you with the knowledge to investigate relationships between variables in a dataset effectively and rigorously.

We strongly recommend that you complete Practical Statistics prior to taking this course. Upon completion of this workshop, you will acquire a rigorous statistical understanding of machine learning models, allowing you to extrapolate the same ideas into other, more advanced machine learning models.

3-Day Workshop Modules

Workshop Module 1: Linear Models

Simple Linear Regression
  Intercept and Slope
  Understanding Coefficients
  Estimating Coefficients

Assumptions of Linear Models
  Linearity Assumption
  Relations to Correlation
  Normality Assumption
  z-Score

Interpretation
  Interpreting models in R
  Business Application I
  Business Application II

Workshop Module 2: In-Depth Regression Models

Non-linear Regression Models
  Polynomial Terms
  Adding Interaction Terms
  Model Interpretation

Model Diagnostics
  Ordinary Least Squares
  Plotting Residuals
  Residuals Calculation (manual)

Model Diagnostics II
  R-Squared
  Heteroskedasticity
  Box-Cox Transformation

Learn-by-Building Modules

Module 1: Lowering Crime Rates
Write a regression analysis report applying what you’ve learned in the workshop. Using the dataset provided to you, write your findings on the socioeconomic variables most highly correlated with crime rates, and quantify the relationship between education level and the level of violent crime in a city. Explain your recommendations where appropriate.

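As a taste of the “Estimating Coefficients” and “Residuals Calculation (manual)” topics, here is a minimal sketch of ordinary least squares for a single predictor. It is written in Python rather than R, the function names are hypothetical, and the numbers are made up for illustration only; they are not taken from the workshop dataset.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Return (intercept, slope) that minimize the sum of squared residuals."""
    x_bar, y_bar = mean(x), mean(y)
    # Slope = co-variation of x and y divided by the variation of x.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    # The fitted line always passes through the point (x_bar, y_bar).
    intercept = y_bar - slope * x_bar
    return intercept, slope


def residuals(x, y, intercept, slope):
    """Observed value minus fitted value for every observation."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Illustrative, invented figures: average years of schooling vs. a crime index.
years_of_schooling = [8, 10, 12, 14, 16]
crime_index = [50, 44, 41, 35, 30]

intercept, slope = fit_simple_ols(years_of_schooling, crime_index)
res = residuals(years_of_schooling, crime_index, intercept, slope)

print(f"intercept = {intercept:.2f}, slope = {slope:.2f}")
print("residuals:", [round(r, 2) for r in res])
```

The slope reads as the change in the crime index associated with one extra year of schooling in this toy data. With an intercept in the model, the residuals sum to zero, which is a quick sanity check on a manual calculation.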
Classification in Machine Learning 1

Learn to solve binary and multi-class classification problems using machine learning algorithms that are easily understood and readily interpretable. You will learn to write a classification algorithm from scratch, and appreciate the mathematical foundations underpinning logistic regression and nearest neighbors algorithms.

We strongly recommend that you complete the regression models workshop prior to taking this course. Upon completion of this workshop, you will acquire the depth to develop, apply, and evaluate two highly versatile algorithms widely used today.

3-Day Workshop Modules

Workshop Module 1: Logistic Regression

Relating Probabilities to Odds
  Understanding Odds
  Log of Odds
  Sigmoid Curve

Logistic Regression from Scratch
  Prior Probabilities
  Exponents and Logarithms
  Interpreting Logistic Regression

Practical Application
  Using Logistic Regression: Finance
  Using Logistic Regression: General Business
  Hauck-Donner Effect

Workshop Module 2: Nearest Neighbours Prediction

k-NN as a Classifier
  Distance Function
  k-NN Intuition
  Choosing k

Model Improvement
  Bias-Variance Tradeoff
  Normalization and Scaling
  Cross Validation

Model Evaluation
  Area Under Curve
  Precision-Recall Tradeoff
  Parameterization

Learn-by-Building Modules

Module 1: Business Risk Analysis
Applying what you’ve learned, present a simple analysis and identify how the various variables can impact the risk of a business. Demonstrate how per-unit increment of a variable can lead to a change in odds and use basic plots to support your demonstration as necessary.

Module 2: Predicting Customer Segments
Develop a k-NN algorithm to sort B2B customers into one of 2 possible segments. Your algorithm must take the provided data as input, and achieve an accuracy of at least 90%. You may use feature engineering, rescaling, feature selection or other preprocessing techniques you’ve learned.

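The Business Risk Analysis brief asks how a per-unit increment of a variable changes the odds. The sketch below, in Python rather than R, illustrates the idea for a one-variable logistic model. The intercept and coefficient are hypothetical values chosen for illustration, not fitted to any workshop data.

```python
import math


def sigmoid(z):
    """Map a log-odds value to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def odds(p):
    """Convert a probability to odds (p against 1 - p)."""
    return p / (1.0 - p)


# Hypothetical coefficients of a logistic model: log-odds = intercept + beta * x
intercept, beta = -3.0, 0.8


def predicted_probability(x):
    return sigmoid(intercept + beta * x)


# Odds before and after raising x by exactly one unit.
odds_at_2 = odds(predicted_probability(2))
odds_at_3 = odds(predicted_probability(3))
odds_ratio = odds_at_3 / odds_at_2

print(f"odds at x=2: {odds_at_2:.4f}")
print(f"odds at x=3: {odds_at_3:.4f}")
print(f"odds ratio:  {odds_ratio:.4f}  (exp(beta) = {math.exp(beta):.4f})")
```

Whatever starting value of x is chosen, a one-unit increase multiplies the odds by exp(beta). This is why logistic regression coefficients are usually reported as odds ratios.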
Classification in Machine Learning 2

Learn to apply the law of probabilities, boosting, bootstrap aggregation, k-fold cross validation, ensembling methods, and a variety of other techniques as we build some of the most widely used machine learning algorithms today. Learn to add performance to your models using mathematically sound principles you’ll learn in this course.

We strongly recommend that you complete the Machine Learning: Classification 1 workshop prior to taking this course. Some concepts presented throughout the lecture may be less-than-ideal for practitioners who have not completed the pre-requisite courses.

3-Day Workshop Modules

Workshop Module 1: Classification Algorithms

Working with Probabilities
  Bayes Theorem
  Naive Bayes
  Laplace Smoothing
  Relation to Logistic Regression

Tree-based Models
  Decision Tree Intuition
  Pre-pruning and Post-pruning
  Splitting Criteria
  C5.0 Algorithm

Practical Application
  Text Classification with Naive Bayes
  Laplace Smoother in Practice
  Decision Tree for interpretable model
  Techniques to improve model’s accuracy

Workshop Module 2: High Performance Modules

Ensemble-based Methods
  Intuition: Why Ensemble works
  Model Blending Examples
  Relation to Random Forest

Random Forest
  Bootstrap Aggregation in Practice
  Automatic Feature Selection
  k-Fold Cross Validation

High-Performance Machine Learning
  Boosting (Weak Learners)
  Competitive Machine Learning
  Parallel Computing with R

Learn-by-Building Modules

Module 1: Spam Filter
Applying what you’ve learned, build a spam classification model using the appropriate text mining measures as necessary and the Naive Bayes algorithm. Point out the weaknesses and limitations of your spam filter as well as its strengths.

Module 2: Random Forest Classification
Using random forest and the appropriate pre-processing steps, build a classifier that takes as input high-dimensional training data (say, more than 150 variables) and observe the automatic variable selection feature. Explain the random forest, including why its out-of-bag (oob) error rate is a reliable, unbiased estimate of our model’s performance on unseen data.

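The out-of-bag idea in the Random Forest brief rests on a property of bootstrap sampling: each tree is trained on rows drawn with replacement, so some rows are never drawn and act as unseen data for that tree. The sketch below, in Python rather than R, only simulates that sampling step. It does not train a forest, and the row count, number of trees, and seed are arbitrary choices for illustration.

```python
import random


def bootstrap_oob_fraction(n_rows, rng):
    """Draw n_rows indices with replacement; return the share never drawn."""
    in_bag = {rng.randrange(n_rows) for _ in range(n_rows)}
    return 1 - len(in_bag) / n_rows


rng = random.Random(42)   # fixed seed so the run is repeatable
n_rows, n_trees = 1000, 200

# One bootstrap sample per (imaginary) tree.
fractions = [bootstrap_oob_fraction(n_rows, rng) for _ in range(n_trees)]
average_oob = sum(fractions) / len(fractions)

# Chance that one given row is missed by all n_rows draws.
expected = (1 - 1 / n_rows) ** n_rows

print(f"average out-of-bag share: {average_oob:.3f}")
print(f"theoretical share:        {expected:.3f}")
```

Roughly a third of the rows are left out of any one tree’s sample. Scoring each row using only the trees that never saw it is what gives the out-of-bag error its test-set-like character.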
Unsupervised Machine Learning

Learn PCA (Principal Component Analysis), Clustering, and other algorithms to work with unsupervised machine learning tasks where the target variable is not known or defined. Applying what you’ll learn from this workshop, you will be tasked to develop an anomaly detection or an e-commerce product recommendation model that can be related to real-life business scenarios.

We strongly recommend that you complete the pre-requisite courses prior to taking this course. Some concepts presented throughout the lecture may be less-than-ideal for practitioners who are new to the field of machine learning.

3-Day Workshop Modules

Workshop Module 1: Unsupervised Learning I

Dimensionality
  The Curse of Dimensionality
  Principal Component Analysis
  PCA Calculation by hand

Dimensionality II
  Thinking about Variance
  Linear Discriminant Analysis
  LDA vs PCA (Unsupervised)

Unsupervised Learning in Action
  Cluster Analysis
  Social Network Analysis
  Market Segmentation

Workshop Module 2: Unsupervised Learning II

Anomaly Detection
  Clustering Methods
  k-Means
  k-Means++

Association Rules
  Association Rules Discovery
  Market-Basket Analysis
  Project Brief: Product Recommendation

Learn-by-Building Modules

Module 1: You May Also Like...
Applying what you’ve learned, build a product recommendation algorithm that would be used in an e-commerce site. Your algorithm should take any combination of basket items and return an appropriate “You May Also Like” suggestion list to the user.

Module 2: Fraud Detection
Develop a PCA-based analysis using New York City’s Property Valuation and Assessment Data (containing more than 1 million properties) to highlight any properties suspected of fraudulent activity, such as deliberate undervaluation for tax purposes or exceedingly inflated valuation, that should be flagged for closer inspection.

Time Series and Forecasting

Decomposition of time series allows us to learn about the underlying seasonality, trend and random fluctuations in a systematic fashion. In this workshop, we learn the methods to account for seasonality and trend, work with autocorrelation models and create industry-scale forecasts using modern tools and frameworks.

We strongly recommend that you complete the pre-requisite workshops prior to taking this course. Some concepts presented throughout the lecture may be less-than-ideal for practitioners who have not completed the pre-requisite courses.

3-Day Workshop Modules

Workshop Module 1: Understanding Time Series

Working with Time Series
  Additive Time Series
  Multiplicative Time Series
  Characteristics
  Log-Transformation

Decomposition
  Adjusting for Trend & Seasonality
  SMA for non-seasonal data
  Two-sided SMA

More on Time Series
  Understanding lags
  Autocorrelation & Partial-autocorrelation
  Stationary Time Series
  Augmented Dickey-Fuller Test

Workshop Module 2: Forecasting Models

Forecasting I
  Exponential Smoothing
  Exponential Smoothing Calculation
  Plotting Forecasts

Forecasting II
  Multiple-Seasonality
  Holt-Winters Exponential Smoothing
  SSE & Forecasting Errors
  Tips and Techniques

Forecasting III
  Correlogram and Lags
  Confidence and Prediction Interval
  Tips and Techniques

Learn-by-Building Modules

Module 1: Crime Forecast
Combine your data visualization skills with what you’ve learned about forecasting to produce a report that analyzes the dataset of crimes in Chicago (2001-2017, by City of Chicago) and presents your forecast.

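To make the decomposition idea concrete, here is a minimal sketch of a classical additive decomposition. It uses a two-sided (centered) moving average for the trend and per-season averages for the seasonal component. It is written in Python rather than R, the function names are hypothetical, and the series is synthetic, built from a known trend and quarterly pattern so the result can be checked by eye.

```python
def centered_moving_average(series, period):
    """Two-sided moving average; None where the window does not fit."""
    half = period // 2
    trend = [None] * len(series)
    for t in range(half, len(series) - half):
        window = series[t - half:t + half + 1]
        if period % 2 == 0:
            # Even period: the two end points get half weight each.
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        else:
            total = sum(window)
        trend[t] = total / period
    return trend


def additive_decompose(series, period):
    """Split a series into trend + seasonal + remainder."""
    trend = centered_moving_average(series, period)
    detrended = [y - tr if tr is not None else None
                 for y, tr in zip(series, trend)]
    # Average the detrended values for each position in the cycle.
    seasonal_means = []
    for k in range(period):
        vals = [d for i, d in enumerate(detrended)
                if d is not None and i % period == k]
        seasonal_means.append(sum(vals) / len(vals))
    # Center the seasonal effects so they sum to zero over one cycle.
    offset = sum(seasonal_means) / period
    seasonal = [seasonal_means[i % period] - offset
                for i in range(len(series))]
    remainder = [y - tr - s if tr is not None else None
                 for y, tr, s in zip(series, trend, seasonal)]
    return trend, seasonal, remainder


# Synthetic quarterly data: linear trend plus a fixed seasonal pattern.
pattern = [5, -2, -4, 1]
series = [10 + 2 * t + pattern[t % 4] for t in range(16)]

trend, seasonal, remainder = additive_decompose(series, period=4)
print("trend:    ", trend)
print("seasonal: ", seasonal[:4])
print("remainder:", remainder)
```

Because the synthetic series has no noise, the recovered seasonal component matches the pattern that was put in and the remainder is zero wherever the trend is defined. On real data the remainder holds the random fluctuations.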
Neural Network & Deep Learning

Develop artificial neural networks that can recognize faces and handwriting patterns and are at the core of some of the most cutting-edge cognitive models in the AI landscape. We will learn to create a backpropagation neural network from scratch, and use our neural network for classification tasks. This class is the final course in the Machine Learning Specialization.

We strongly recommend that you complete the pre-requisite workshops prior to taking this course. Some concepts presented throughout the lecture may be less-than-ideal for practitioners who have not completed the pre-requisite courses.

3-Day Workshop Modules

Workshop Module 1: Neural Networks

Understanding Neural Network
  Neural Network Intuition
  Biological Inspiration
  Layers and Neurons
  Weights and Bias

Activation Function
  Sigmoid Function
  Softmax
  ReLU
  Incorporation to neuralnet

Workshop Module 2: Deep Learning

Building a Neural Network
  Architecture Design
  Parameterization
  Cost Function
  Feedforward Algorithm
  Backpropagation
  Refresher on Matrix Algebra

Deep Learning
  TensorFlow
  Deep Learning in R
  Practical Advice on Deep Learning

Learn-by-Building Modules

Module 1: Image Classification
Build a neural network capable of classifying images into one of many classes and explain the choice of your architecture. Test your neural network using unseen images. Can your algorithm correctly classify 80% of images?

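The three activation functions named in the syllabus (sigmoid, softmax and ReLU) are short enough to write out by hand. The sketch below uses plain Python rather than R or TensorFlow, and the input scores are arbitrary illustrative values.

```python
import math


def sigmoid(z):
    """Squash any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    """Pass positive values through unchanged; clip negatives to zero."""
    return max(0.0, z)


def softmax(scores):
    """Turn a list of raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Arbitrary raw scores for three classes, as an output layer might produce.
probs = softmax([2.0, 1.0, 0.1])

print("sigmoid(0)  =", sigmoid(0))
print("relu(-3)    =", relu(-3))
print("softmax     =", [round(p, 3) for p in probs])
```

Sigmoid suits a single yes/no output, softmax suits a multi-class output layer such as the one in the Image Classification brief, and ReLU is a common choice for hidden layers.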
