Machine Learning AND Deep Learning for OpenPOWER

OpenPOWER Webinar Series
Machine Learning and
Deep Learning 101
Clarisse Taaffe Hedglin
clarisse@us.ibm.com
Executive AI Architect
IBM Systems

Please note IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice
and at IBM’s sole discretion.
Information regarding potential future products is intended to outline our general product direction and it should
not be relied on in making a purchasing decision.
The information mentioned regarding potential future products is not a commitment, promise, or legal
obligation to deliver any material, code or functionality. Information about potential future products may not be
incorporated into any contract.
The development, release, and timing of any future features or functionality described for our products remains
at our sole discretion.
Performance is based on measurements and projections using standard IBM benchmarks in a controlled
environment. The actual throughput or performance that any user will experience will vary depending upon
many factors, including considerations such as the amount of multiprogramming in the user’s job stream,
the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be
given that an individual user will achieve results similar to those stated here.

Session Objectives
Introducing Machine Learning and Deep Learning (ML/DL) in the context of AI and analytics
Understanding the iterative nature of the workflow
Getting an overview of different ML/DL algorithms
Interpreting models and outcomes

How Customers Do Data Analytics Traditionally
Spreadsheets
• No governance
• No collaboration
• Limited complexity
Business Rules
• Broad rules and categories
• Not dynamic
Homegrown Applications
• Hard to maintain
• Pre-set rules and approaches
Other Applications
• Limited use of analytics
• Hard coded models that do not apply to unique needs
• Slow response

Machine Learning Context
REINFORCEMENT
LEARNING

Machine Learning Definition
“…an application of artificial
intelligence (AI) that provides systems
the ability to automatically learn and
improve from experience without
being explicitly programmed.”

From Data to Actions
DATA → ACTION, with HUMAN INPUTS
Descriptive: What has happened?
Predictive: What will happen?
Prescriptive: What should we do?
Cognitive: Learn dynamically

Machine Learning Flow
Input → prediction
Credit card transaction → Fraudulent vs. legitimate
Loan application → Approve vs. reject
MRI image → Tumor benign vs. malignant
House data → House appraisal value
Mathematical Function
Not a memorization caching system
Representing pattern by a mathematical function
Machine learning is just a bunch of math

Data – Estimate House Price
Every column except last is a feature
Last column is a label
This is a labeled data set
Sq Ft   Bedroom   Bathroom   Price
2000    3         2          $350,000
1500    2         2          $280,000
2200    3         3          $400,000
…       …         …          …
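As a minimal sketch of what "features" and "label" mean for this table, the three visible rows can be split into a feature matrix and a label list. The column names (`sq_ft`, `bedrooms`, `bathrooms`, `price`) are illustrative choices, not names from the slides.

```python
# Rows copied from the table above; the dictionary keys are illustrative names.
rows = [
    {"sq_ft": 2000, "bedrooms": 3, "bathrooms": 2, "price": 350_000},
    {"sq_ft": 1500, "bedrooms": 2, "bathrooms": 2, "price": 280_000},
    {"sq_ft": 2200, "bedrooms": 3, "bathrooms": 3, "price": 400_000},
]

label_name = "price"  # the last column is the label
feature_names = [name for name in rows[0] if name != label_name]

# X holds one feature vector per house; y holds the matching labels.
X = [[row[name] for name in feature_names] for row in rows]
y = [row[label_name] for row in rows]

print(feature_names)
print(X[0], y[0])
```

Because every row carries a known price, this is a labeled data set and can be used for supervised learning.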

Machine Learning Categories
With labeled data, it is called supervised machine learning.
What if we don’t have labeled data?
– It’s unsupervised learning
– Objective is forming clusters based on data
Customer Revenue   Customer Profit   # of Online Purchases   # of Store Purchases   …

Data is Prerequisite to AI
Transaction and application data
Machine, sensor data
Enterprise content
Image, geospatial, video
Social data
Third-party data
Unstructured, Landing, Exploration and Archive
Operational Data
Real-time Data Processing & Analytics
Information Integration & Governance
Risk, Fraud
Chat bots, personal assistants
Supply Chain Optimization
Dynamic Pricing, Recommenders
Behavior Modeling
Vision, Autonomous Systems

Iterative Process of Machine Learning
Define Problem → Prepare Data → Train Model → Evaluate Model → Fine-Tune Model → Deploy Model

Business Understanding
Prior to performing machine learning, identify foundational knowledge of the problem you are trying to solve.
Articulate use case, and identify problem
What type of supporting data is available?
How large is the data?
Identify response time/throughput characteristics

Common Patterns of Analytics Business Problems
Solving business problems with Data and AI will utilize a combination of these analytics patterns
Predict a Future Event
Segment Data / Detect Anomalies
Determine optimal quantity, price, resource allocation, or best action
Understand Past Activity
Discover Insights in Content (text, images, video)
Interact in Natural Language
Forecast and Budget based on past activity
Supervised; Unsupervised
Predictive: What will happen?
Prescriptive: What should we do?
Descriptive: What happened?
Planning: What is our plan?
NLP; Deep Learning; Supervised

Learning to Map Input to Output
Input                  Output                        Application
Customer behaviour     Responder (1/0)               Target marketing
Banking transactions   Fraud (1/0)                   Fraud Detection
Call Data Record       Churn (1/0)                   Customer retention
Image                  Object/Caption (1,…1000)      Object Detection
Audio                  Text Transcript               Speech recognition
Arabic                 English                       Machine translation

Prepare Data - Visualization
Visualization can give us an intuition about the relationships between the data
Can be used to find clusters and patterns in data
Can be used to understand distributions in data (Univariate statistics)
Correlations and Bivariate statistics
Many graphical packages available
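The univariate and bivariate statistics that such plots summarize can also be computed directly. The sketch below reuses the three visible rows of the house-price table; the `pearson` helper is a hypothetical name written here for illustration, not part of any package mentioned in the slides.

```python
from statistics import mean, pstdev

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / (pstdev(xs) * pstdev(ys))

sq_ft = [2000, 1500, 2200]
price = [350_000, 280_000, 400_000]

# Univariate: centre and spread of a single feature.
print("mean sq ft:", mean(sq_ft))
print("std dev sq ft:", round(pstdev(sq_ft), 1))

# Bivariate: how strongly two columns move together (between -1 and 1).
print("corr(sq ft, price):", round(pearson(sq_ft, price), 3))
```

A correlation close to 1 is what a scatter plot of size against price would show as points lying near a rising straight line.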

Prepare Data
What is the best representation of sample data for more accurate prediction?
Feature engineering
Feature selection
What to do with missing values
How to handle non-numeric types
It has been stated that 80% of a data scientist's time is spent in data preparation. Here’s why (even after data is identified/obtained):
Feature A | Feature B | Feature C | Derived Feature A | Derived Feature B | Target / Label
0 | Sentence 1 | f1(A,B) | f2(A,B) | category1
NaN | Sentence 2 | category2
More derived features (binary encoded)
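Two of the preparation steps listed here, filling missing values and encoding non-numeric types, can be sketched in a few lines. The rows and column names are hypothetical, and mean imputation and one-hot (binary) encoding are only one common choice for each step.

```python
from statistics import mean

# Hypothetical rows: one numeric feature with a missing value
# (None stands in for NaN) and one categorical feature.
rows = [
    {"feature_a": 0.0, "feature_c": "category1"},
    {"feature_a": None, "feature_c": "category2"},
    {"feature_a": 4.0, "feature_c": "category1"},
]

# 1. Missing values: replace with the mean of the observed values.
observed = [r["feature_a"] for r in rows if r["feature_a"] is not None]
fill_value = mean(observed)
for r in rows:
    if r["feature_a"] is None:
        r["feature_a"] = fill_value

# 2. Non-numeric types: one binary (one-hot) column per category.
categories = sorted({r["feature_c"] for r in rows})
encoded = [
    [r["feature_a"]] + [1 if r["feature_c"] == c else 0 for c in categories]
    for r in rows
]

for row in encoded:
    print(row)
```

Each encoded row is now purely numeric: the imputed feature followed by one 0/1 column per category.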

Image Data Representation
An input image of size 256 x 256 contains 65,536 pixels
Each pixel is a feature in the feature vector
Highly computationally intensive due to large number of features
x1:   x11   x12   …     x1,65536
x2:   x21   x22   …     x2,65536
x3:   x31   x32   …     x3,65536
…     …     …     …     …
xn:   xn1   xn2   xn3   xn,65536
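A short sketch of how a 256 x 256 image becomes one row of 65,536 features. The pixel values are synthetic stand-ins and row-by-row flattening is an assumed ordering; the slides do not specify one.

```python
HEIGHT, WIDTH = 256, 256

# A stand-in grayscale image: a list of rows, each a list of pixel intensities.
image = [[(r + c) % 256 for c in range(WIDTH)] for r in range(HEIGHT)]

# Flatten row by row into a single feature vector, one feature per pixel.
feature_vector = [pixel for row in image for pixel in row]

print(len(feature_vector))
```

A data set of n such images is then an n x 65,536 matrix, which is why the number of features dominates the computational cost.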

Train / Test Data Split
Break the input data (all data) by random sampling into training, cross-validation, and test sets:
1. Training data is used to build models
2. Cross-validation set is used to evaluate similar versions of models
3. Test set is used to run inference and evaluate the quality of the model
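The three-way random split can be sketched as follows. The 60/20/20 proportions, the fixed seed, and the function name `split_data` are assumptions made for illustration; the slides do not prescribe particular ratios.

```python
import random

def split_data(samples, train_frac=0.6, cv_frac=0.2, seed=0):
    """Shuffle and split samples into training, cross-validation and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # random sample, reproducible via seed
    n_train = round(len(shuffled) * train_frac)
    n_cv = round(len(shuffled) * cv_frac)
    train = shuffled[:n_train]
    cv = shuffled[n_train:n_train + n_cv]
    test = shuffled[n_train + n_cv:]       # whatever remains
    return train, cv, test

train, cv, test = split_data(range(100))
print(len(train), len(cv), len(test))
```

Every sample lands in exactly one of the three sets, so the test set stays unseen while models are built and compared.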

Model Objective
Use training data to derive f(x) so that Cost(Actual - f(x)) is minimized, i.e. minimize (Actual - Prediction)
Every computational iteration analyzes the entire training data set
The process of minimizing this cost function is called training
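One way to make "minimizing the cost" concrete is batch gradient descent on a straight-line model. This is a sketch under stated assumptions: the cost is taken to be mean squared error, f(x) = w*x + b, the learning rate and iteration count are arbitrary, and the data are the three house rows rescaled to thousands of square feet and units of $100,000.

```python
# House sizes (thousands of sq ft) and prices (units of $100,000), rescaled so
# that a single fixed learning rate behaves well.
xs = [2.0, 1.5, 2.2]
ys = [3.5, 2.8, 4.0]

def cost(w, b):
    """Mean squared error between actual values and predictions f(x) = w*x + b."""
    return sum((y - (w * x + b)) ** 2 for x, y in zip(xs, ys)) / len(xs)

w, b = 0.0, 0.0
learning_rate = 0.1
initial_cost = cost(w, b)

for _ in range(20_000):
    # Each iteration looks at the entire training set (batch gradient descent).
    errors = [(w * x + b) - y for x, y in zip(xs, ys)]
    grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / len(xs)
    grad_b = 2 * sum(errors) / len(xs)
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

final_cost = cost(w, b)
print(f"f(x) = {w:.3f} * x + {b:.3f}")
print(f"cost: {initial_cost:.3f} -> {final_cost:.5f}")
```

The loop is the "training": each pass nudges w and b in the direction that lowers the cost, until the line fits the training data as closely as a straight line can.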

Select Machine Learning Algorithms
Supervised Learning
Logistic Regression
Decision trees
Random forests
Neural Network (Deep Learning)
Bayesian Techniques
Support Vector Machines
Ensemble Methods
Markov Logic Network
Unsupervised Learning
K-Means
Hierarchical Clustering
Anomaly Detection
Density-based methods
Principal Component Analysis

Logistic Regression
Classification system
Medical image - tumor benign or
malignant
Credit card transaction - normal or
fraudulent
Customer churning - yes or no
Email - normal or spam
Language identification - English vs. French vs. Spanish
[Figure: axes x1 and x2, function f(x)]
Map input to one of the output categories
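A minimal two-class sketch of this mapping: a sigmoid turns a weighted input into a probability, and a threshold turns the probability into a category. The one-feature data set, learning rate, and 0.5 threshold are hypothetical, and multi-class cases such as language identification would need an extension (for example one model per class) that is not shown.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical one-feature data set: label 1 for the larger values.
xs = [0.5, 1.0, 1.5, 3.0, 3.5, 4.0]
labels = [0, 0, 0, 1, 1, 1]

w, b = 0.0, 0.0
learning_rate = 0.5
for _ in range(5_000):
    # Gradient of the average log loss over the whole training set.
    probs = [sigmoid(w * x + b) for x in xs]
    grad_w = sum((p - y) * x for p, y, x in zip(probs, labels, xs)) / len(xs)
    grad_b = sum(p - y for p, y in zip(probs, labels)) / len(xs)
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

def classify(x, threshold=0.5):
    """Map an input to one of two output categories."""
    return 1 if sigmoid(w * x + b) >= threshold else 0

print([classify(x) for x in xs])
```

Unlike a regression model, the final output is a category, although the intermediate value `sigmoid(w * x + b)` can still be read as a confidence.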

Linear Regression
Model output is in continuous real number space
Could be multi-dimensional
f(x)

K-Means
Randomly choose 3 data points as centroids
Assign each data point to one of the groups based on its distance from the centroids
Recompute the centroid of each group
Reassign each data point
Repeat until convergence
Identifies patterns and clusters
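The steps above can be sketched directly in code. The nine 2-D points, the seed, the iteration cap, and the rule of keeping an old centroid when a group becomes empty are assumptions of this sketch.

```python
import math
import random

def kmeans(points, k=3, seed=0, max_iters=100):
    """Plain k-means on a list of (x1, x2) tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)        # randomly chosen data points
    assignments = None
    for _ in range(max_iters):
        # Assign every point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_assignments == assignments:   # nothing moved: converged
            break
        assignments = new_assignments
        # Recompute each centroid as the mean of its group.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:                      # keep the old centroid if a group empties
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, assignments

# Hypothetical data: three loose groups of three points each.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
          (9.0, 1.0), (9.1, 1.2), (8.9, 0.9)]
centroids, assignments = kmeans(points)
print(centroids)
print(assignments)
```

No labels are used anywhere, which is what makes this unsupervised. The result depends on the random starting centroids, so different seeds can produce different groupings.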

Deep Learning
Convolutional Neural Network (CNN)

Recurrent Neural Network (RNN)
Deep Learning

TensorFlow – a quick review
Deep learning (DL) framework invented and open sourced by Google
Based on the notion of tensors, which are multi-dimensional arrays of numbers
Implements a number of functions that are common to all deep learning workflows (optimizers, backpropagation, etc.)
Programming model: the user defines a neural network as a graph, then “feeds” data to the network to either train or perform inference
Most widely adopted platform in the DL universe as of today
One of the best documented frameworks

IBM Internal Use Only
PyTorch – a quick review
Facebook’s framework for research
• Cousin of the Lua-based Torch framework, but rewritten to be tailored to a Python frontend
• Gaining popularity quickly for its ease of use in R&D
• Supports dynamic computation graphs
• Based on Python with NumPy compatibility
• Multi-GPU
• Easy to use, and supports standard debug tools

Transfer Learning
Applying learning acquired in one domain to a problem of a similar domain
Useful because you can use pretrained networks that might have taken weeks to train
Useful because early layers are trained to distinguish coarser features
Typically the final layer is removed for a new problem and the network is retrained using new data

Definitions: Scratch, Finetune, Feature Extract
• Scratch: training all parameters yourself, starting from random
• Finetune: training many or most parameters yourself but retaining some prior weights
• Feature Extract = Pre-Trained: only train the very last classification layers

Reinforcement Learning
[Diagram: agent w/ DNN and environment, connected by action At, state St / St+1, and reward Rt / Rt+1]
Maximizes long term reward
• Maximize a dimension over many steps
• Correlates immediate actions with delayed benefits
Requires a lot of data to learn; often used with a simulator
Use Cases: Robotics, Stock pricing, Game adversaries
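One common way to formalize "long term reward" is the discounted return, where each step's value is its immediate reward plus a discounted share of everything that follows. The discount factor `gamma` and the example episode are assumptions of this sketch; the slides do not name a specific formulation.

```python
def discounted_returns(rewards, gamma=0.9):
    """Return G_t = R_{t+1} + gamma * G_{t+1} for every step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Work backwards so each step can reuse the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

# Hypothetical episode: no reward until the final step.
rewards = [0.0, 0.0, 0.0, 1.0]
print(discounted_returns(rewards))
```

Even the first action receives a non-zero return although its immediate reward is zero, which is how an immediate action gets credit for a delayed benefit.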

Model Interpretation
Machine Learning: Input → Feature Extraction → Features → Classification (Machine Learning Algorithms) → Output
Deep Learning: Input → Feature Extraction & Classification (Deep Neural Network) → Output

Evaluation of Model Accuracy
2011: Machine Learning Based, 26% Errors
2016: Deep Learning Based, 3% Errors
Humans: 5% Error

Model Goal – Does it Help my Business ?
[Chart. x-axis: Population in % (ranked in descending order of scores); y-axis: Targets in %; curves: Optimal, Model, Random]
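A chart of this kind can be computed by ranking the population by model score and counting how many true targets fall in the top slice. The scores, outcomes, and the function name `cumulative_gains` are hypothetical; this is a sketch of the general idea, not the calculation behind the slide's chart.

```python
def cumulative_gains(scores, targets, fractions=(0.1, 0.5, 1.0)):
    """Share of all targets captured in the top fraction of the ranked population."""
    # Rank outcomes by descending model score.
    ranked = [t for _, t in sorted(zip(scores, targets),
                                   key=lambda st: st[0], reverse=True)]
    total = sum(ranked)
    gains = {}
    for frac in fractions:
        top_n = round(len(ranked) * frac)
        gains[frac] = sum(ranked[:top_n]) / total
    return gains

# Hypothetical model scores and true outcomes (1 = target) for ten customers.
scores  = [0.95, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10]
targets = [1,    1,    0,    1,    0,    0,    1,    0,    0,    0]
print(cumulative_gains(scores, targets))
```

Random selection would capture targets in proportion to the population contacted (10% of targets in the top 10%), so the business value of the model is the gap between its curve and that diagonal.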

Model Output Should be Easy to Consume
In a Call Center In a Mobile App
Deployment is the idea of making the insights available to application
developers, consumers, and business users

Packaged Goods Quality Inspection with Visual Insights
Manufacturer Inventory of
Manufactured Goods
Retailer
Shelf Inventory Check, Package Check, Inventory Management
Inspection Points
Quality Check
Images
Videos
Visual Inspector
Industrial Cameras
❶ Upload images/videos
❷ Train models
❸ Inference
IBM Maximo Visual Insights
Supply Chain
Solution
Consumer
Packaged Goods
*No Data Scientist required
QUALITY INSPECTION

Actionable
Decisions
Image
Sensor
Database
Text
Data
Fusion
Sources
Fact
Tables
AI &
Analytics
Language
Analytics
Classification
Detection
Time
Series
Descriptive
Statistics
OpenPOWER based NoviLens AI Appliance
Under the Hood – End to End Workflow
© 2020 NoviSystems Corporation
Questions?
** TechData IBM Reseller in
Research Triangle Park
*Patents pending on the
methods employed by
NoviLens
DATA FUSION

AI demands purpose-built
infrastructure throughout the journey
• Data preparation
• Model development environment
• Runtime environment
• Train, deploy and manage models
• Business KPI, production metrics
• Explainability and fairness

Enterprise Data Pipeline with IBM Spectrum Storage
Insights Out
Trained Model
Inference
Data In
Transient Storage
SDS/Cloud
Global Ingest
Throughput-oriented,
globally accessible
Cloud
ETL
High throughput, Random
I/O,
SSD/Hybrid
Archive
High scalability, large/sequential I/O
HDD Cloud
Tape
Hadoop / Spark
Data Lakes
Throughput-oriented
Hybrid/HDD
ML / DL
Prep ⇨ Training ⇨ Inference
High throughput, low
latency,
Random I/O
SSD/NVMe
Classification &
Metadata Tagging
High volume, index &
auto-tagging zone
Fast Ingest /
Real-time Analytics
High throughput
SSD
Throughput-oriented,
software defined
temporary landing zone
capacity tier
performance tier performance &
capacity Tier
performance &
capacity Tier
performance tier
capacity tier
Fits Traditional and New Use Cases
EDGE, INGEST, ORGANIZE, ANALYZE, ML / DL, INSIGHTS
IBM Spectrum Scale / Storage for AI / © 2020 IBM Corporation

In summary
Machine Learning is computation
Understand the problem to solve
Supervised vs. unsupervised learning
Feature engineering - know your data
Iterative methodology
Greater accuracy with Deep Learning
Value gained in deployments
Infrastructure matters

Co-Creation Lab
Work side-by-side with IBM
data scientists, cloud and
infrastructure experts
Build Custom AI Solution
Plan for a pre-built Solution
Design a scalable, industrial
strength, training &
inferencing enterprise
platform
IBM Systems WW Client Experience Centers: AI Center of Competency
Fast Start / Design Workshops
Discovery Workshop
Overview Technologies
Understand Business
Challenges
Prioritize Use Cases
Showcases / Demos
AI Immersion
Experience
AI Consulting
Collateral / Assets
Field Support
Workshops & Co-Creation Labs
Deploy, Adopt, Learn
IBM Systems ML/DL Hands-On
and Customizable
Combination of Lectures
And Labs
IBM or Customer Location
IBM Portfolio and
Open Source Tools
Contact us at aicoc@us.ibm.com
Learn it Design it Build it Together Use it
Fast Start / Design
Workshops
