Probabilistic Models with Latent Variables
Foundations of Algorithms and Machine Learning (CS60020), IIT KGP, 2017: Indrajit Bhattacharya
Density Estimation Problem
• Learning from unlabeled data
• Unsupervised learning, density estimation
• Empirical distribution typically has multiple modes
Density Estimation Problem
[Figures from http://yulearning.blogspot.co.uk and http://courses.ee.sun.ac.za/Pattern_Recognition_813]
Density Estimation Problem
• Convex combination of unimodal pdfs gives a multimodal pdf:
  $p(x) = \sum_{k=1}^{K} \pi_k \, p_k(x)$, where $\pi_k \ge 0$ and $\sum_{k=1}^{K} \pi_k = 1$
• Physical interpretation
  • Sub-populations of the data
Latent Variables
• Introduce a new latent variable $z_i$ for each observation $x_i$
• Latent / hidden: not observed in the data
• Probabilistic interpretation
  • Mixing weights: $\pi_k = p(z_i = k)$
  • Mixture densities: $p_k(x_i) = p(x_i \mid z_i = k)$
Generative Mixture Model
• Generative process:
  For $i = 1, \dots, N$:
    $z_i \sim \mathrm{Cat}(\pi)$
    $x_i \sim p(x \mid z_i)$
• Marginalizing out $z_i$ recovers the mixture distribution
  $p(x_i) = \sum_{k=1}^{K} \pi_k \, p(x_i \mid z_i = k)$
• Plate notation: $z_i \rightarrow x_i$, replicated $N$ times
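To make the generative process concrete, here is a minimal NumPy sketch that samples from a two-component univariate Gaussian mixture; the particular parameter values are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up mixture parameters for illustration
pi = np.array([0.3, 0.7])      # mixing weights, sum to 1
mu = np.array([-2.0, 3.0])     # component means
sigma = np.array([0.5, 1.0])   # component standard deviations

N = 1000
# Sample z_i ~ Cat(pi) for each data point
z = rng.choice(len(pi), size=N, p=pi)
# Sample x_i ~ N(mu[z_i], sigma[z_i]^2)
x = rng.normal(loc=mu[z], scale=sigma[z])

# The empirical distribution of x is multimodal even though
# each component density is unimodal.
```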
Tasks in a Mixture Model
• Inference: compute the posterior over the latent variables, $p(z_i = k \mid x_i, \theta)$
• Parameter estimation
  • Find parameters that, e.g., maximize the likelihood $\prod_i \sum_k \pi_k \, p(x_i \mid z_i = k, \theta)$
  • Does not decouple according to classes: the sum over components sits inside the log
  • Non-convex, many local optima
Example: Gaussian Mixture Model
• Model:
  For $i = 1, \dots, N$: \; $z_i \sim \mathrm{Cat}(\pi)$, \; $x_i \mid z_i = k \sim \mathcal{N}(\mu_k, \Sigma_k)$
• Inference (responsibilities):
  $r_{ik} = p(z_i = k \mid x_i, \theta) = \dfrac{\pi_k \, \mathcal{N}(x_i \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x_i \mid \mu_j, \Sigma_j)}$
  • Has the form of a soft-max function
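A minimal sketch of this inference step for a GMM using SciPy; the function name `responsibilities` and its interface are choices made for illustration, not taken from the slides.

```python
import numpy as np
from scipy.stats import multivariate_normal

def responsibilities(X, pi, mu, Sigma):
    """r[i, k] = p(z_i = k | x_i) for a Gaussian mixture.

    X: (N, D) data, pi: (K,) weights, mu: (K, D) means,
    Sigma: (K, D, D) covariance matrices.
    """
    N, K = X.shape[0], len(pi)
    r = np.zeros((N, K))
    for k in range(K):
        # unnormalized posterior: pi_k * N(x_i | mu_k, Sigma_k)
        r[:, k] = pi[k] * multivariate_normal.pdf(X, mean=mu[k], cov=Sigma[k])
    r /= r.sum(axis=1, keepdims=True)  # normalize over components (soft-max form)
    return r
```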
Example: Gaussian Mixture Model
• Log-likelihood:
  $\ell(\theta) = \sum_{i=1}^{N} \log \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_i \mid \mu_k, \Sigma_k)$
• Which training instance comes from which component?
• No closed-form solution for maximizing $\ell(\theta)$
  • Possibility 1: gradient descent, etc.
  • Possibility 2: Expectation Maximization
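A sketch of this log-likelihood computed in log space with the log-sum-exp trick for numerical stability; again, the function name and interface are illustrative assumptions.

```python
import numpy as np
from scipy.special import logsumexp
from scipy.stats import multivariate_normal

def gmm_log_likelihood(X, pi, mu, Sigma):
    """sum_i log sum_k pi_k N(x_i | mu_k, Sigma_k), computed stably in log space."""
    N, K = X.shape[0], len(pi)
    log_p = np.zeros((N, K))
    for k in range(K):
        # log of pi_k * N(x_i | mu_k, Sigma_k)
        log_p[:, k] = np.log(pi[k]) + multivariate_normal.logpdf(X, mean=mu[k], cov=Sigma[k])
    return logsumexp(log_p, axis=1).sum()
```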
Expectation Maximization Algorithm
• Observation: if the values of the latent variables $z_i$ were known, the likelihood would be easy to maximize
• Key idea: iterative updates
• Given parameter estimates, “infer” all variables
• Given inferred variables, maximize wrt parameters
• Questions
• Does this converge?
• What does this maximize?
Expectation Maximization Algorithm
• Complete log-likelihood:
  $\ell_c(\theta) = \sum_{i=1}^{N} \log p(x_i, z_i \mid \theta)$
• Problem: the $z_i$ are not known
• Possible solution: replace $\ell_c(\theta)$ with its conditional expectation
• Expected complete log-likelihood:
  $Q(\theta, \theta^{t-1}) = \mathbb{E}_{p(Z \mid X,\, \theta^{t-1})}\!\left[\ell_c(\theta)\right]$, where $\theta^{t-1}$ are the current parameters
Expectation Maximization Algorithm
• For mixture models this expands to
  $Q(\theta, \theta^{t-1}) = \sum_{i=1}^{N} \sum_{k=1}^{K} r_{ik} \log \pi_k + \sum_{i=1}^{N} \sum_{k=1}^{K} r_{ik} \log p(x_i \mid \theta_k)$,
  where $r_{ik} = p(z_i = k \mid x_i, \theta^{t-1})$
• Compare with the likelihood for a generative classifier (see below)
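To make the comparison concrete (this elaboration is mine, following the standard argument): with labeled data the responsibilities become hard indicators $r_{ik} = \mathbb{1}[y_i = k]$, and the objective reduces to
$\ell(\theta) = \sum_{i=1}^{N} \log \pi_{y_i} + \sum_{i=1}^{N} \log p(x_i \mid \theta_{y_i})$,
which decouples across classes and has closed-form ML estimates; in the mixture model the soft weights $r_{ik}$ couple the components.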
Expectation Maximization Algorithm
• Expectation (E) step
  • Update the responsibilities $r_{ik}$ using the current parameters
• Maximization (M) step
  • Maximize $Q(\theta, \theta^{t-1})$ w.r.t. the parameters
• Overall algorithm (a sketch follows below)
  • Initialize all latent variables
  • Iterate until convergence:
    • M step
    • E step
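A minimal sketch of this overall loop for a GMM, assuming the illustrative `responsibilities`, `gmm_log_likelihood`, and `m_step` helpers sketched around the neighbouring slides; the Dirichlet initialization of the responsibilities and the convergence test on the log-likelihood change are my own choices.

```python
import numpy as np

def em_gmm(X, K, n_iter=100, tol=1e-6, seed=0):
    """Run EM for a K-component Gaussian mixture (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    N = X.shape[0]
    # Initialize the latent variables: random soft assignments r_{ik}
    r = rng.dirichlet(np.ones(K), size=N)

    prev_ll = -np.inf
    for _ in range(n_iter):
        pi, mu, Sigma = m_step(X, r)             # M step: maximize Q w.r.t. parameters
        r = responsibilities(X, pi, mu, Sigma)   # E step: update responsibilities
        ll = gmm_log_likelihood(X, pi, mu, Sigma)
        if ll - prev_ll < tol:                   # log-likelihood is non-decreasing
            break
        prev_ll = ll
    return pi, mu, Sigma, r
```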
Example: EM for GMM
• The E step remains the same for all mixture models: compute the responsibilities $r_{ik}$
• M step (closed form for a GMM; a sketch follows below):
  $\pi_k = \frac{1}{N} \sum_{i} r_{ik}$, \quad $\mu_k = \frac{\sum_{i} r_{ik} \, x_i}{\sum_{i} r_{ik}}$, \quad $\Sigma_k = \frac{\sum_{i} r_{ik} (x_i - \mu_k)(x_i - \mu_k)^{\top}}{\sum_{i} r_{ik}}$
• Compare with a generative classifier: the same updates with hard labels in place of $r_{ik}$
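A minimal NumPy sketch of these M-step updates; the `m_step` name and the small ridge added to each covariance for numerical stability are illustrative choices.

```python
import numpy as np

def m_step(X, r, reg=1e-6):
    """Closed-form GMM M-step given responsibilities r of shape (N, K)."""
    N, D = X.shape
    Nk = r.sum(axis=0)                 # effective count per component
    pi = Nk / N                        # mixing weights
    mu = (r.T @ X) / Nk[:, None]       # responsibility-weighted means
    Sigma = np.zeros((len(Nk), D, D))
    for k in range(len(Nk)):
        Xc = X - mu[k]                 # center data at the component mean
        Sigma[k] = (r[:, k, None] * Xc).T @ Xc / Nk[k] + reg * np.eye(D)
    return pi, mu, Sigma
```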
Analysis of EM Algorithm
• Expected complete LL is a lower bound on LL
• EM iteratively maximizes this lower bound
• Converges to a local maximum of the loglikelihood
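Why the bound holds (a standard derivation via Jensen's inequality, included here for completeness; $q_i(z_i) = p(z_i \mid x_i, \theta^{t-1})$ is the posterior used in the E step):

$\ell(\theta) = \sum_{i} \log \sum_{z_i} q_i(z_i) \frac{p(x_i, z_i \mid \theta)}{q_i(z_i)} \;\ge\; \sum_{i} \sum_{z_i} q_i(z_i) \log \frac{p(x_i, z_i \mid \theta)}{q_i(z_i)} \;=\; Q(\theta, \theta^{t-1}) + \sum_{i} \mathbb{H}(q_i) \;\ge\; Q(\theta, \theta^{t-1})$

The entropies $\mathbb{H}(q_i)$ are non-negative and do not depend on $\theta$, and the bound is tight at $\theta = \theta^{t-1}$; hence each M step cannot decrease the log-likelihood.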
Bayesian / MAP Estimation
• ML estimation within EM can overfit (e.g., a Gaussian component can collapse onto a single data point)
• Possible to perform MAP instead of MLE in M-step
• EM is partially Bayesian
• Posterior distribution over latent variables
• Point estimate over parameters
• Fully Bayesian approach is called Variational Bayes
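As one concrete instance of a MAP M-step (a standard result stated here for illustration, not taken from the slides): with a $\mathrm{Dir}(\alpha)$ prior on the mixing weights, the update becomes

$\pi_k = \dfrac{\sum_{i} r_{ik} + \alpha_k - 1}{N + \sum_{j} \alpha_j - K}$,

and conjugate priors on $(\mu_k, \Sigma_k)$ similarly regularize the mean and covariance updates, which guards against components collapsing onto single data points.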
(Lloyd’s) K Means Algorithm
• Hard EM for a Gaussian Mixture Model
  • Point estimate of parameters (as usual)
  • Point estimate of the latent variables: $z_i^* = \arg\max_k p(z_i = k \mid x_i, \theta)$
  • Spherical Gaussian mixture components: $\Sigma_k = \sigma^2 I$,
    where the hard assignment reduces to $z_i^* = \arg\min_k \lVert x_i - \mu_k \rVert^2$
• Most popular “hard” clustering algorithm (a sketch follows below)
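A minimal sketch of Lloyd's algorithm under these assumptions (hard assignment to the nearest mean, then mean updates); the initialization by random data points is a common choice, not prescribed by the slides.

```python
import numpy as np

def kmeans(X, K, n_iter=100, seed=0):
    """Lloyd's algorithm: hard EM for a spherical, equal-weight GMM."""
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), size=K, replace=False)]   # initialize means at random points
    for _ in range(n_iter):
        # "E" step (hard): assign each point to the nearest mean
        d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)  # (N, K) squared distances
        z = d2.argmin(axis=1)
        # "M" step: each mean becomes the average of its assigned points
        new_mu = np.array([X[z == k].mean(axis=0) if np.any(z == k) else mu[k]
                           for k in range(K)])
        if np.allclose(new_mu, mu):
            break
        mu = new_mu
    return mu, z
```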
K Means Problem
• Given data $x_1, \dots, x_N$, find $K$ “means” $\mu_1, \dots, \mu_K$ and data assignments $z_1, \dots, z_N$ such that the distortion
  $J(\mu, Z) = \sum_{i=1}^{N} \lVert x_i - \mu_{z_i} \rVert^2$
  is minimized
• Note: each $z_i$ can equivalently be written as a $K$-dimensional binary (one-hot) vector
Model selection: Choosing K for GMM
• Cross-validation (a sketch follows below)
  • Plot the likelihood on the training set and on a validation set for increasing values of K
  • Likelihood on the training set keeps improving
  • Likelihood on the validation set drops after the “optimal” K
• Does not work for K-means! Why?
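A minimal sketch of this procedure, reusing the hypothetical `em_gmm` and `gmm_log_likelihood` helpers from the earlier sketches with a simple train/validation split:

```python
import numpy as np

def select_K(X, K_values, val_frac=0.2, seed=0):
    """Fit a GMM for each K on a training split and score held-out log-likelihood."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_val = int(val_frac * len(X))
    X_val, X_train = X[idx[:n_val]], X[idx[n_val:]]

    scores = {}
    for K in K_values:
        pi, mu, Sigma, _ = em_gmm(X_train, K)
        scores[K] = gmm_log_likelihood(X_val, pi, mu, Sigma) / n_val
    return max(scores, key=scores.get), scores  # K with the best held-out likelihood
```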
Principal Component Analysis: Motivation
• Dimensionality reduction
• Reduces #parameters to estimate
• Data often resides in a much lower dimension, e.g., on a line in a 3D space
• Provides “understanding”
• Mixture models very restricted
• Latent variables restricted to small discrete set
• Can we “relax” the latent variable?
Classical PCA: Motivation
• Revisit K-means as a matrix factorization: $X \approx Z W^{\top}$
  • W: $D \times K$ matrix whose columns are the cluster means
  • Z: $N \times K$ matrix whose rows are the (one-hot) cluster-membership vectors
• How can we relax Z and W?
Classical PCA: Problem
• Minimize the reconstruction error $J(W, Z) = \lVert X - Z W^{\top} \rVert_F^2$
  • X: $N \times D$ data matrix
  • Arbitrary Z of size $N \times L$, the low-dimensional scores
  • Orthonormal W of size $D \times L$ (i.e., $W^{\top} W = I_L$)
Classical PCA: Optimal Solution
• Empirical covariance matrix $\hat{\Sigma} = \frac{1}{N} \sum_{i=1}^{N} x_i x_i^{\top}$
  • Computed on scaled and centered data
• Optimal solution: $\hat{W} = V_L$, where $V_L$ contains the $L$ eigenvectors for the $L$ largest eigenvalues of $\hat{\Sigma}$, and $\hat{z}_i = \hat{W}^{\top} x_i$
• Alternative solution via the Singular Value Decomposition (SVD) of $X$ (sketched below)
• W contains the “principal components” that capture the largest variance in the data
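A minimal NumPy sketch of classical PCA via the SVD of the centered data matrix; the function name and the choice to also return the projected scores are mine.

```python
import numpy as np

def pca(X, L):
    """Return the top-L principal components W (D x L) and scores Z (N x L)."""
    Xc = X - X.mean(axis=0)                     # center the data
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:L].T                                # principal directions (eigenvectors of the covariance)
    Z = Xc @ W                                  # low-dimensional scores
    return W, Z
```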
Probabilistic PCA
• Generative model:
  $z_i \sim \mathcal{N}(0, I_L)$, \; $x_i \mid z_i \sim \mathcal{N}(W z_i + \mu, \Psi)$, with $\Psi$ forced to be diagonal
• Latent linear models
  • Factor analysis: general diagonal $\Psi$
  • Special case, (probabilistic) PCA: $\Psi = \sigma^2 I$
Visualization of Generative Process
[Figure from Bishop, PRML]
Relationship with Gaussian Density
• Marginalizing out $z_i$ gives $p(x_i) = \mathcal{N}(x_i \mid \mu, W W^{\top} + \Psi)$
• Why does $\Psi$ need to be restricted (e.g., to be diagonal)?
• Intermediate low-rank parameterization of the Gaussian covariance matrix, between full-rank and diagonal
• Compare #parameters (see below)
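For concreteness (standard counts, added here as an illustration): a full covariance has $D(D+1)/2$ free parameters and a diagonal covariance has $D$, while the low-rank parameterization $W W^{\top} + \Psi$ has roughly $DL + D$ for factor analysis and $DL + 1$ for PPCA (ignoring rotational redundancy in $W$), which grows only linearly in $D$ for fixed $L$.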
EM for PCA: Rod and Springs
[Figure from Bishop, PRML]
Advantages of EM
• Simpler than gradient methods w/ constraints
• Handles missing data
• Easy path for handling more complex models
• Not always the fastest method
Summary of Latent Variable Models
• Learning from unlabeled data
• Latent variables
• Discrete: Clustering / Mixture models ; GMM
• Continuous: Dimensionality reduction ; PCA
• Summary / “Understanding” of data
• Expectation Maximization Algorithm