
CSCE833 Machine Learning

Lecture 9
Linear Discriminant Analysis

Dr. Jianjun Hu
mleg.cse.sc.edu/edu/csce833

University of South Carolina


Department of Computer Science and Engineering
Likelihood- vs. Discriminant-based Classification

Likelihood-based: assume a model for $p(\mathbf{x} \mid C_i)$ and use Bayes' rule to calculate $P(C_i \mid \mathbf{x})$; the discriminant is $g_i(\mathbf{x}) = \log P(C_i \mid \mathbf{x})$.
Discriminant-based: assume a model for $g_i(\mathbf{x} \mid \Phi_i)$ directly; no density estimation.
Estimating the boundaries is enough; there is no need to accurately estimate the densities inside the boundaries.

Based on: Lecture Notes for E. Alpaydın (2004), Introduction to Machine Learning, © The MIT Press (V1.1).
Example: Boundary or Class Model

Linear Discriminant

Linear discriminant:
$$g_i(\mathbf{x} \mid \mathbf{w}_i, w_{i0}) = \mathbf{w}_i^T \mathbf{x} + w_{i0} = \sum_{j=1}^{d} w_{ij} x_j + w_{i0}$$
Advantages:
- Simple: O(d) space/computation
- Knowledge extraction: weighted sum of attributes; positive/negative weights, magnitudes (credit scoring)
- Optimal when $p(\mathbf{x} \mid C_i)$ are Gaussian with a shared covariance matrix; useful when classes are (almost) linearly separable
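Below is a minimal NumPy sketch (not from the original slides; the weights are illustrative placeholders) of evaluating the linear discriminants $g_i(\mathbf{x}) = \mathbf{w}_i^T \mathbf{x} + w_{i0}$ and choosing the class with the largest value.

```python
import numpy as np

# Illustrative weights for K=3 classes and d=2 features (placeholder values).
W = np.array([[ 1.0, -0.5],
              [-0.2,  0.8],
              [ 0.3,  0.3]])          # row i holds w_i
w0 = np.array([0.1, -0.4, 0.0])       # bias terms w_i0

def linear_discriminants(x, W, w0):
    """Return g_i(x) = w_i^T x + w_i0 for every class i."""
    return W @ x + w0

x = np.array([0.5, 1.5])
g = linear_discriminants(x, W, w0)
print("g(x) =", g, "-> choose class", int(np.argmax(g)))
```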
Generalized Linear Model

Quadratic discriminant:
$$g_i(\mathbf{x} \mid \mathbf{W}_i, \mathbf{w}_i, w_{i0}) = \mathbf{x}^T \mathbf{W}_i \mathbf{x} + \mathbf{w}_i^T \mathbf{x} + w_{i0}$$
Higher-order (product) terms:
$$z_1 = x_1, \quad z_2 = x_2, \quad z_3 = x_1^2, \quad z_4 = x_2^2, \quad z_5 = x_1 x_2$$
Map from x to z using nonlinear basis functions and use a linear discriminant in z-space:
$$g_i(\mathbf{x}) = \sum_{j=1}^{k} w_{ij} \phi_j(\mathbf{x})$$
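A minimal sketch (assumed, with placeholder weights) of mapping a 2-D input to z-space with the quadratic basis from the slide and then applying a linear discriminant in that space:

```python
import numpy as np

def quadratic_basis(x):
    """phi(x) for 2-D input: [x1, x2, x1^2, x2^2, x1*x2]."""
    x1, x2 = x
    return np.array([x1, x2, x1**2, x2**2, x1 * x2])

# Placeholder weights for the discriminant in z-space.
w = np.array([0.5, -1.0, 0.2, 0.2, -0.3])
w0 = 0.1

x = np.array([1.0, 2.0])
z = quadratic_basis(x)
g = w @ z + w0       # linear in z, quadratic in x
print("g(x) =", g)
```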
Two Classes

$$
\begin{aligned}
g(\mathbf{x}) &= g_1(\mathbf{x}) - g_2(\mathbf{x}) \\
&= \left(\mathbf{w}_1^T \mathbf{x} + w_{10}\right) - \left(\mathbf{w}_2^T \mathbf{x} + w_{20}\right) \\
&= \left(\mathbf{w}_1 - \mathbf{w}_2\right)^T \mathbf{x} + \left(w_{10} - w_{20}\right) \\
&= \mathbf{w}^T \mathbf{x} + w_0
\end{aligned}
$$
Choose $C_1$ if $g(\mathbf{x}) > 0$ and $C_2$ otherwise.

Geometry

Multiple Classes

$$g_i(\mathbf{x} \mid \mathbf{w}_i, w_{i0}) = \mathbf{w}_i^T \mathbf{x} + w_{i0}$$
Choose $C_i$ if
$$g_i(\mathbf{x}) = \max_{j=1}^{K} g_j(\mathbf{x})$$
Classes are linearly separable.

Pairwise Separation

$$g_{ij}(\mathbf{x} \mid \mathbf{w}_{ij}, w_{ij0}) = \mathbf{w}_{ij}^T \mathbf{x} + w_{ij0}$$
$$g_{ij}(\mathbf{x}) = \begin{cases} > 0 & \text{if } \mathbf{x} \in C_i \\ \le 0 & \text{if } \mathbf{x} \in C_j \\ \text{don't care} & \text{otherwise} \end{cases}$$
Choose $C_i$ if $\forall j \neq i, \; g_{ij}(\mathbf{x}) > 0$.

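A small sketch (assumed, with placeholder pairwise weights) of classifying by pairwise separation: class $C_i$ is chosen only if every pairwise discriminant against the other classes is positive, otherwise the input is rejected.

```python
import numpy as np

# Placeholder pairwise weights w_ij and biases w_ij0 for K=3 classes (illustrative).
W = {(0, 1): (np.array([ 1.0, 0.0]), 0.0),
     (0, 2): (np.array([ 0.0, 1.0]), 0.0),
     (1, 2): (np.array([-1.0, 1.0]), 0.0)}

def g(i, j, x):
    """g_ij(x) = w_ij^T x + w_ij0, with g_ji = -g_ij."""
    if (i, j) in W:
        w, w0 = W[(i, j)]
        return w @ x + w0
    return -g(j, i, x)

def classify_pairwise(x, K=3):
    """Choose C_i if g_ij(x) > 0 for all j != i; otherwise reject (None)."""
    for i in range(K):
        if all(g(i, j, x) > 0 for j in range(K) if j != i):
            return i
    return None

print(classify_pairwise(np.array([2.0, 3.0])))
```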
From Discriminants to Posteriors

When $p(\mathbf{x} \mid C_i) \sim \mathcal{N}(\boldsymbol{\mu}_i, \boldsymbol{\Sigma})$:
$$g_i(\mathbf{x} \mid \mathbf{w}_i, w_{i0}) = \mathbf{w}_i^T \mathbf{x} + w_{i0}$$
$$\mathbf{w}_i = \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu}_i, \qquad w_{i0} = -\frac{1}{2} \boldsymbol{\mu}_i^T \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu}_i + \log P(C_i)$$
With $y = P(C_1 \mid \mathbf{x})$ and $P(C_2 \mid \mathbf{x}) = 1 - y$, choose $C_1$ if
$$y > 0.5 \;\Leftrightarrow\; \frac{y}{1-y} > 1 \;\Leftrightarrow\; \log \frac{y}{1-y} > 0$$
and $C_2$ otherwise.

$$\operatorname{logit} P(C_1 \mid \mathbf{x}) = \log \frac{P(C_1 \mid \mathbf{x})}{1 - P(C_1 \mid \mathbf{x})} = \log \frac{P(C_1 \mid \mathbf{x})}{P(C_2 \mid \mathbf{x})} = \log \frac{p(\mathbf{x} \mid C_1)}{p(\mathbf{x} \mid C_2)} + \log \frac{P(C_1)}{P(C_2)}$$
$$= \log \frac{(2\pi)^{-d/2} |\boldsymbol{\Sigma}|^{-1/2} \exp\!\left[-\tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu}_1)^T \boldsymbol{\Sigma}^{-1}(\mathbf{x}-\boldsymbol{\mu}_1)\right]}{(2\pi)^{-d/2} |\boldsymbol{\Sigma}|^{-1/2} \exp\!\left[-\tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu}_2)^T \boldsymbol{\Sigma}^{-1}(\mathbf{x}-\boldsymbol{\mu}_2)\right]} + \log \frac{P(C_1)}{P(C_2)} = \mathbf{w}^T \mathbf{x} + w_0$$
where
$$\mathbf{w} = \boldsymbol{\Sigma}^{-1}(\boldsymbol{\mu}_1 - \boldsymbol{\mu}_2), \qquad w_0 = -\frac{1}{2}(\boldsymbol{\mu}_1 + \boldsymbol{\mu}_2)^T \boldsymbol{\Sigma}^{-1}(\boldsymbol{\mu}_1 - \boldsymbol{\mu}_2) + \log \frac{P(C_1)}{P(C_2)}$$
The inverse of the logit:
$$\log \frac{P(C_1 \mid \mathbf{x})}{1 - P(C_1 \mid \mathbf{x})} = \mathbf{w}^T \mathbf{x} + w_0 \;\Rightarrow\; P(C_1 \mid \mathbf{x}) = \operatorname{sigmoid}(\mathbf{w}^T \mathbf{x} + w_0) = \frac{1}{1 + \exp\!\left[-(\mathbf{w}^T \mathbf{x} + w_0)\right]}$$
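A minimal NumPy sketch (assumed, not from the slides; all numbers are illustrative) of computing $\mathbf{w}$ and $w_0$ from class means, a shared covariance, and priors, then obtaining $P(C_1 \mid \mathbf{x})$ with the sigmoid:

```python
import numpy as np

# Illustrative class-conditional parameters (shared covariance) and priors.
mu1, mu2 = np.array([1.0, 1.0]), np.array([-1.0, -1.0])
Sigma = np.array([[1.0, 0.2],
                  [0.2, 1.0]])
P1, P2 = 0.6, 0.4

Sigma_inv = np.linalg.inv(Sigma)
w = Sigma_inv @ (mu1 - mu2)
w0 = -0.5 * (mu1 + mu2) @ Sigma_inv @ (mu1 - mu2) + np.log(P1 / P2)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

x = np.array([0.5, 0.2])
p1 = sigmoid(w @ x + w0)           # P(C1 | x)
print("P(C1|x) =", p1, "-> choose", "C1" if p1 > 0.5 else "C2")
```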
Sigmoid (Logistic) Function

1. Calculate $g(\mathbf{x}) = \mathbf{w}^T \mathbf{x} + w_0$ and choose $C_1$ if $g(\mathbf{x}) > 0$, or
2. Calculate $y = \operatorname{sigmoid}(\mathbf{w}^T \mathbf{x} + w_0)$ and choose $C_1$ if $y > 0.5$.
Gradient-Descent

$E(\mathbf{w} \mid \mathcal{X})$ is the error with parameters $\mathbf{w}$ on sample $\mathcal{X}$:
$$\mathbf{w}^* = \arg\min_{\mathbf{w}} E(\mathbf{w} \mid \mathcal{X})$$
Gradient:
$$\nabla_{\mathbf{w}} E = \left[ \frac{\partial E}{\partial w_1}, \frac{\partial E}{\partial w_2}, \ldots, \frac{\partial E}{\partial w_d} \right]^T$$
Gradient descent: start from a random $\mathbf{w}$ and update $\mathbf{w}$ iteratively in the negative direction of the gradient.

Gradient-Descent

$$\Delta w_i = -\eta \frac{\partial E}{\partial w_i}, \; \forall i, \qquad w_i \leftarrow w_i + \Delta w_i$$
(Figure: error curve showing the update from $w^t$ to $w^{t+1}$, decreasing $E(w^t)$ to $E(w^{t+1})$ with step size $\eta$.)
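A minimal sketch of the generic update rule, applied here to an assumed quadratic error function purely for illustration (the error and its gradient are placeholders, not from the slides):

```python
import numpy as np

def gradient_descent(grad_E, w, eta=0.1, n_iters=100):
    """Generic update: w <- w - eta * dE/dw, repeated n_iters times."""
    for _ in range(n_iters):
        w = w - eta * grad_E(w)
    return w

# Example with an assumed quadratic error E(w) = ||w - [1, 2]||^2.
grad = lambda w: 2.0 * (w - np.array([1.0, 2.0]))
print(gradient_descent(grad, w=np.zeros(2)))   # converges toward [1, 2]
```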
Logistic Discrimination

Two classes: assume the log likelihood ratio is linear:
$$\log \frac{p(\mathbf{x} \mid C_1)}{p(\mathbf{x} \mid C_2)} = \mathbf{w}^T \mathbf{x} + w_0^o$$
$$\operatorname{logit} P(C_1 \mid \mathbf{x}) = \log \frac{P(C_1 \mid \mathbf{x})}{1 - P(C_1 \mid \mathbf{x})} = \log \frac{p(\mathbf{x} \mid C_1)}{p(\mathbf{x} \mid C_2)} + \log \frac{P(C_1)}{P(C_2)} = \mathbf{w}^T \mathbf{x} + w_0$$
where $w_0 = w_0^o + \log \dfrac{P(C_1)}{P(C_2)}$.
$$y = \hat{P}(C_1 \mid \mathbf{x}) = \frac{1}{1 + \exp\!\left[-(\mathbf{w}^T \mathbf{x} + w_0)\right]}$$
Training: Two Classes

$$\mathcal{X} = \{\mathbf{x}^t, r^t\}_t, \qquad r^t \mid \mathbf{x}^t \sim \operatorname{Bernoulli}(y^t)$$
$$y^t = P(C_1 \mid \mathbf{x}^t) = \frac{1}{1 + \exp\!\left[-(\mathbf{w}^T \mathbf{x}^t + w_0)\right]}$$
$$l(\mathbf{w}, w_0 \mid \mathcal{X}) = \prod_t (y^t)^{r^t} (1 - y^t)^{1 - r^t}$$
$E = -\log l$:
$$E(\mathbf{w}, w_0 \mid \mathcal{X}) = -\sum_t \left[ r^t \log y^t + (1 - r^t) \log(1 - y^t) \right]$$
Training: Gradient-Descent

$$E(\mathbf{w}, w_0 \mid \mathcal{X}) = -\sum_t \left[ r^t \log y^t + (1 - r^t) \log(1 - y^t) \right]$$
If $y = \operatorname{sigmoid}(a)$, then $\dfrac{dy}{da} = y(1 - y)$, so
$$\Delta w_j = -\eta \frac{\partial E}{\partial w_j} = \eta \sum_t \left[ \frac{r^t}{y^t} - \frac{1 - r^t}{1 - y^t} \right] y^t (1 - y^t)\, x_j^t = \eta \sum_t (r^t - y^t)\, x_j^t, \quad j = 1, \ldots, d$$
$$\Delta w_0 = -\eta \frac{\partial E}{\partial w_0} = \eta \sum_t (r^t - y^t)$$
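A minimal sketch (assumed; the data is a random illustrative sample, not from the slides) of training the two-class logistic discriminant with exactly these batch gradient-descent updates:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train_logistic(X, r, eta=0.01, n_epochs=1000):
    """Batch gradient descent with the updates
    Δw_j = η Σ_t (r^t - y^t) x_j^t and Δw_0 = η Σ_t (r^t - y^t)."""
    w = np.zeros(X.shape[1])
    w0 = 0.0
    for _ in range(n_epochs):
        y = sigmoid(X @ w + w0)
        w += eta * X.T @ (r - y)
        w0 += eta * np.sum(r - y)
    return w, w0

# Illustrative two-class data (placeholder sample).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(+1, 1, (50, 2)), rng.normal(-1, 1, (50, 2))])
r = np.concatenate([np.ones(50), np.zeros(50)])
w, w0 = train_logistic(X, r)
print("accuracy:", np.mean((sigmoid(X @ w + w0) > 0.5) == r))
```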
(Figure: results after 10, 100, and 1000 training iterations.)
K>2 Classes

$$\mathcal{X} = \{\mathbf{x}^t, \mathbf{r}^t\}_t, \qquad \mathbf{r}^t \mid \mathbf{x}^t \sim \operatorname{Mult}_K(1, \mathbf{y}^t)$$
$$\log \frac{p(\mathbf{x} \mid C_i)}{p(\mathbf{x} \mid C_K)} = \mathbf{w}_i^T \mathbf{x} + w_{i0}^o$$
Softmax:
$$y_i = \hat{P}(C_i \mid \mathbf{x}) = \frac{\exp(\mathbf{w}_i^T \mathbf{x} + w_{i0})}{\sum_{j=1}^{K} \exp(\mathbf{w}_j^T \mathbf{x} + w_{j0})}, \quad i = 1, \ldots, K$$
$$l(\{\mathbf{w}_i, w_{i0}\}_i \mid \mathcal{X}) = \prod_t \prod_i (y_i^t)^{r_i^t}$$
$$E(\{\mathbf{w}_i, w_{i0}\}_i \mid \mathcal{X}) = -\sum_t \sum_i r_i^t \log y_i^t$$
$$\Delta \mathbf{w}_j = \eta \sum_t (r_j^t - y_j^t)\, \mathbf{x}^t, \qquad \Delta w_{j0} = \eta \sum_t (r_j^t - y_j^t)$$
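A minimal sketch (assumed; random illustrative data with one-hot labels) of the K-class softmax discriminant trained with the batch updates above:

```python
import numpy as np

def softmax(A):
    A = A - A.max(axis=1, keepdims=True)      # numerical stability
    E = np.exp(A)
    return E / E.sum(axis=1, keepdims=True)

def train_softmax(X, R, eta=0.01, n_epochs=500):
    """Batch gradient descent for K>2 classes:
    Δw_j = η Σ_t (r_j^t - y_j^t) x^t and Δw_j0 = η Σ_t (r_j^t - y_j^t)."""
    K, d = R.shape[1], X.shape[1]
    W = np.zeros((K, d))
    w0 = np.zeros(K)
    for _ in range(n_epochs):
        Y = softmax(X @ W.T + w0)             # N x K posteriors
        W += eta * (R - Y).T @ X
        w0 += eta * (R - Y).sum(axis=0)
    return W, w0

# Illustrative 3-class data with one-hot labels (placeholder sample).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(m, 0.5, (30, 2)) for m in ([0, 0], [2, 0], [0, 2])])
R = np.kron(np.eye(3), np.ones((30, 1)))
W, w0 = train_softmax(X, R)
Y = softmax(X @ W.T + w0)
print("accuracy:", np.mean(np.argmax(Y, 1) == np.argmax(R, 1)))
```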
Example

Generalizing the Linear Model

Quadratic:
$$\log \frac{p(\mathbf{x} \mid C_i)}{p(\mathbf{x} \mid C_K)} = \mathbf{x}^T \mathbf{W}_i \mathbf{x} + \mathbf{w}_i^T \mathbf{x} + w_{i0}$$
Sum of basis functions:
$$\log \frac{p(\mathbf{x} \mid C_i)}{p(\mathbf{x} \mid C_K)} = \mathbf{w}_i^T \boldsymbol{\phi}(\mathbf{x}) + w_{i0}$$
where $\boldsymbol{\phi}(\mathbf{x})$ are basis functions:
- Kernels in SVM
- Hidden units in neural networks

Discrimination by Regression

Classes are NOT mutually exclusive and exhaustive:
$$r^t = y^t + \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, \sigma^2)$$
$$y^t = \operatorname{sigmoid}(\mathbf{w}^T \mathbf{x}^t + w_0) = \frac{1}{1 + \exp\!\left[-(\mathbf{w}^T \mathbf{x}^t + w_0)\right]}$$
$$l(\mathbf{w}, w_0 \mid \mathcal{X}) = \prod_t \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left[-\frac{(r^t - y^t)^2}{2\sigma^2}\right]$$
$$E(\mathbf{w}, w_0 \mid \mathcal{X}) = \frac{1}{2} \sum_t (r^t - y^t)^2$$
$$\Delta \mathbf{w} = \eta \sum_t (r^t - y^t)\, y^t (1 - y^t)\, \mathbf{x}^t$$
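A minimal sketch (assumed, not from the slides) of the regression-based update: gradient descent on the squared error with a sigmoid output, using the $\Delta\mathbf{w}$ rule above (and the analogous update for $w_0$):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train_regression_discriminant(X, r, eta=0.5, n_epochs=2000):
    """Gradient descent on E = 1/2 Σ_t (r^t - y^t)^2 with a sigmoid output:
    Δw = η Σ_t (r^t - y^t) y^t (1 - y^t) x^t, and likewise for w0."""
    w, w0 = np.zeros(X.shape[1]), 0.0
    for _ in range(n_epochs):
        y = sigmoid(X @ w + w0)
        g = (r - y) * y * (1 - y)        # per-sample gradient factor
        w += eta * X.T @ g
        w0 += eta * g.sum()
    return w, w0
```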
Optimal Separating Hyperplane

$$\mathcal{X} = \{\mathbf{x}^t, r^t\}_t \quad \text{where} \quad r^t = \begin{cases} +1 & \text{if } \mathbf{x}^t \in C_1 \\ -1 & \text{if } \mathbf{x}^t \in C_2 \end{cases}$$
Find $\mathbf{w}$ and $w_0$ such that
$$\mathbf{w}^T \mathbf{x}^t + w_0 \ge +1 \;\text{ for } r^t = +1, \qquad \mathbf{w}^T \mathbf{x}^t + w_0 \le -1 \;\text{ for } r^t = -1,$$
which can be rewritten as
$$r^t \left(\mathbf{w}^T \mathbf{x}^t + w_0\right) \ge +1$$
(Cortes and Vapnik, 1995; Vapnik, 1995)


Margin

Distance from the discriminant to the closest instances on either side.
Distance of $\mathbf{x}^t$ to the hyperplane:
$$\frac{|\mathbf{w}^T \mathbf{x}^t + w_0|}{\lVert \mathbf{w} \rVert}$$
We require
$$\frac{r^t \left(\mathbf{w}^T \mathbf{x}^t + w_0\right)}{\lVert \mathbf{w} \rVert} \ge \rho, \; \forall t$$
For a unique solution, fix $\rho \lVert \mathbf{w} \rVert = 1$; then, to maximize the margin:
$$\min \frac{1}{2} \lVert \mathbf{w} \rVert^2 \quad \text{subject to} \quad r^t \left(\mathbf{w}^T \mathbf{x}^t + w_0\right) \ge +1, \; \forall t$$
$$\min \frac{1}{2} \lVert \mathbf{w} \rVert^2 \quad \text{subject to} \quad r^t \left(\mathbf{w}^T \mathbf{x}^t + w_0\right) \ge +1, \; \forall t$$
The Lagrangian of the primal problem:
$$L_p = \frac{1}{2} \lVert \mathbf{w} \rVert^2 - \sum_{t=1}^{N} \alpha^t \left[ r^t \left(\mathbf{w}^T \mathbf{x}^t + w_0\right) - 1 \right]
= \frac{1}{2} \lVert \mathbf{w} \rVert^2 - \sum_{t=1}^{N} \alpha^t r^t \left(\mathbf{w}^T \mathbf{x}^t + w_0\right) + \sum_{t=1}^{N} \alpha^t$$
Setting the derivatives to zero:
$$\frac{\partial L_p}{\partial \mathbf{w}} = 0 \;\Rightarrow\; \mathbf{w} = \sum_{t=1}^{N} \alpha^t r^t \mathbf{x}^t, \qquad
\frac{\partial L_p}{\partial w_0} = 0 \;\Rightarrow\; \sum_{t=1}^{N} \alpha^t r^t = 0$$

Substituting these back gives the dual, to be maximized with respect to $\alpha^t$:
$$
\begin{aligned}
L_d &= \frac{1}{2} \mathbf{w}^T \mathbf{w} - \mathbf{w}^T \sum_t \alpha^t r^t \mathbf{x}^t - w_0 \sum_t \alpha^t r^t + \sum_t \alpha^t \\
&= -\frac{1}{2} \mathbf{w}^T \mathbf{w} + \sum_t \alpha^t \\
&= -\frac{1}{2} \sum_t \sum_s \alpha^t \alpha^s r^t r^s (\mathbf{x}^t)^T \mathbf{x}^s + \sum_t \alpha^t
\end{aligned}
$$
subject to $\sum_t \alpha^t r^t = 0$ and $\alpha^t \ge 0, \; \forall t$.

Most $\alpha^t$ are 0 and only a small number have $\alpha^t > 0$; they are the support vectors.

Soft Margin Hyperplane

Not linearly separable:
$$r^t \left(\mathbf{w}^T \mathbf{x}^t + w_0\right) \ge 1 - \xi^t$$
Soft error:
$$\sum_t \xi^t$$
The new primal is
$$L_p = \frac{1}{2} \lVert \mathbf{w} \rVert^2 + C \sum_t \xi^t - \sum_t \alpha^t \left[ r^t \left(\mathbf{w}^T \mathbf{x}^t + w_0\right) - 1 + \xi^t \right] - \sum_t \mu^t \xi^t$$
Kernel Machines

Preprocess input $\mathbf{x}$ by basis functions:
$$\mathbf{z} = \boldsymbol{\phi}(\mathbf{x}), \qquad g(\mathbf{z}) = \mathbf{w}^T \mathbf{z}, \qquad g(\mathbf{x}) = \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x})$$
The SVM solution:
$$\mathbf{w} = \sum_t \alpha^t r^t \mathbf{z}^t = \sum_t \alpha^t r^t \boldsymbol{\phi}(\mathbf{x}^t)$$
$$g(\mathbf{x}) = \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}) = \sum_t \alpha^t r^t \boldsymbol{\phi}(\mathbf{x}^t)^T \boldsymbol{\phi}(\mathbf{x}) = \sum_t \alpha^t r^t K(\mathbf{x}^t, \mathbf{x})$$
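A minimal sketch (assumed; the support vectors and multipliers are illustrative placeholders) of evaluating the kernelized discriminant $g(\mathbf{x}) = \sum_t \alpha^t r^t K(\mathbf{x}^t, \mathbf{x})$ over the support vectors:

```python
import numpy as np

def svm_discriminant(x, support_X, support_r, support_alpha, kernel):
    """g(x) = sum_t alpha^t r^t K(x^t, x), summed over the support vectors."""
    return sum(a * r * kernel(xt, x)
               for xt, r, a in zip(support_X, support_r, support_alpha))

# Illustrative support vectors and multipliers (placeholder values).
support_X = [np.array([1.0, 1.0]), np.array([-1.0, -1.0])]
support_r = [+1, -1]
support_alpha = [0.7, 0.7]
linear_kernel = lambda u, v: u @ v

print(svm_discriminant(np.array([0.5, 0.5]), support_X, support_r,
                       support_alpha, linear_kernel))
```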
Kernel Functions

Polynomials of degree $q$:
$$K(\mathbf{x}^t, \mathbf{x}) = \left(\mathbf{x}^T \mathbf{x}^t + 1\right)^q$$
For example, with $q = 2$ and two inputs:
$$K(\mathbf{x}, \mathbf{y}) = \left(\mathbf{x}^T \mathbf{y} + 1\right)^2 = \left(x_1 y_1 + x_2 y_2 + 1\right)^2 = 1 + 2 x_1 y_1 + 2 x_2 y_2 + 2 x_1 x_2 y_1 y_2 + x_1^2 y_1^2 + x_2^2 y_2^2$$
which corresponds to the basis mapping
$$\boldsymbol{\phi}(\mathbf{x}) = \left[ 1, \sqrt{2}\, x_1, \sqrt{2}\, x_2, \sqrt{2}\, x_1 x_2, x_1^2, x_2^2 \right]^T$$
Radial-basis functions:
$$K(\mathbf{x}^t, \mathbf{x}) = \exp\!\left[ -\frac{\lVert \mathbf{x}^t - \mathbf{x} \rVert^2}{2 s^2} \right]$$
Sigmoidal functions:
$$K(\mathbf{x}^t, \mathbf{x}) = \tanh\!\left(2\, \mathbf{x}^T \mathbf{x}^t + 1\right)$$
(Cherkassky and Mulier, 1998)
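A small sketch (assumed) of the three kernels from the slide; the RBF width `s` and the test vectors are illustrative parameters:

```python
import numpy as np

def poly_kernel(xt, x, q=2):
    """K(x^t, x) = (x^T x^t + 1)^q"""
    return (x @ xt + 1.0) ** q

def rbf_kernel(xt, x, s=1.0):
    """K(x^t, x) = exp(-||x^t - x||^2 / (2 s^2))"""
    return np.exp(-np.sum((xt - x) ** 2) / (2.0 * s ** 2))

def sigmoid_kernel(xt, x):
    """K(x^t, x) = tanh(2 x^T x^t + 1)"""
    return np.tanh(2.0 * (x @ xt) + 1.0)

u, v = np.array([1.0, 2.0]), np.array([0.5, -1.0])
print(poly_kernel(u, v), rbf_kernel(u, v), sigmoid_kernel(u, v))
```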
SVM for Regression

Use a linear model (possibly kernelized):
$$f(\mathbf{x}) = \mathbf{w}^T \mathbf{x} + w_0$$
Use the $\epsilon$-sensitive error function:
$$e_\epsilon\!\left(r^t, f(\mathbf{x}^t)\right) = \begin{cases} 0 & \text{if } |r^t - f(\mathbf{x}^t)| < \epsilon \\ |r^t - f(\mathbf{x}^t)| - \epsilon & \text{otherwise} \end{cases}$$
$$\min \frac{1}{2} \lVert \mathbf{w} \rVert^2 + C \sum_t \left(\xi_+^t + \xi_-^t\right)$$
subject to
$$r^t - \left(\mathbf{w}^T \mathbf{x}^t + w_0\right) \le \epsilon + \xi_+^t, \qquad
\left(\mathbf{w}^T \mathbf{x}^t + w_0\right) - r^t \le \epsilon + \xi_-^t, \qquad
\xi_+^t, \xi_-^t \ge 0$$
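A minimal sketch (assumed; the targets and predictions are placeholder values) of the $\epsilon$-sensitive error function used above, which is zero inside the $\epsilon$-tube and linear outside it:

```python
import numpy as np

def epsilon_sensitive_error(r, f_x, eps=0.1):
    """e_eps(r, f(x)) = 0 inside the eps-tube, |r - f(x)| - eps outside it."""
    diff = np.abs(r - f_x)
    return np.where(diff < eps, 0.0, diff - eps)

print(epsilon_sensitive_error(np.array([1.0, 1.0, 1.0]),
                              np.array([1.05, 1.3, 0.7]), eps=0.1))
```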
