Approximation of Functions
Ravi Kothari, Ph.D.
ravi.kothari@ashoka.edu.in
"I think that it is a relatively good approximation to truth (which is much too complicated to allow anything but approximations) that mathematical ideas originate in empirics."
- John von Neumann
Approximation of Functions

Let y = f(x) be given on the interval [x_0, x_2]. Let x_1 be a point such that x_0 < x_1 < x_2.

y_0 = f(x_0),  y_1 = f(x_1),  y_2 = f(x_2)

Let us say we want to approximate f(x). A polynomial of second degree seems appropriate to approximate f(x), i.e.,

    P(x) = a_0 + a_1 x + a_2 x^2                                              (1)

We want to find the coefficients of P(x) such that P(x_0) = y_0, P(x_1) = y_1, and P(x_2) = y_2.

We can of course solve this exactly, since there are 3 equations and 3 unknowns.
Let us approach it differently and construct a polynomial Q_0(x) of second degree such that Q_0(x_0) = 1, Q_0(x_1) = 0, Q_0(x_2) = 0. Likewise Q_1(x_0) = 0, Q_1(x_1) = 1, Q_1(x_2) = 0, and Q_2(x_0) = 0, Q_2(x_1) = 0, and Q_2(x_2) = 1.

The desired polynomials are,

    Q_0(x) = \frac{(x - x_1)(x - x_2)}{(x_0 - x_1)(x_0 - x_2)}
    Q_1(x) = \frac{(x - x_0)(x - x_2)}{(x_1 - x_0)(x_1 - x_2)}
    Q_2(x) = \frac{(x - x_0)(x - x_1)}{(x_2 - x_0)(x_2 - x_1)}                (2)

The interpolating polynomial is then,

    P(x) = y_0 Q_0(x) + y_1 Q_1(x) + y_2 Q_2(x)                               (3)
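As a quick sketch (not part of the original slides), Equations (2) and (3) translate directly into code; the function f and the nodes x_0, x_1, x_2 below are arbitrary illustrative choices.

```python
# A minimal sketch of the quadratic interpolant of Equations (2)-(3).
# The choice of f and of the nodes x0, x1, x2 is purely illustrative.
import numpy as np

def f(x):
    return np.sin(x)                      # example function to interpolate

x0, x1, x2 = 0.0, 1.0, 2.0                # nodes with x0 < x1 < x2
y0, y1, y2 = f(x0), f(x1), f(x2)

def P(x):
    Q0 = (x - x1) * (x - x2) / ((x0 - x1) * (x0 - x2))   # Q0(x0)=1, 0 at x1, x2
    Q1 = (x - x0) * (x - x2) / ((x1 - x0) * (x1 - x2))   # Q1(x1)=1, 0 at x0, x2
    Q2 = (x - x0) * (x - x1) / ((x2 - x0) * (x2 - x1))   # Q2(x2)=1, 0 at x0, x1
    return y0 * Q0 + y1 * Q1 + y2 * Q2

# The interpolant matches f exactly at the three nodes ...
assert np.allclose([P(x0), P(x1), P(x2)], [y0, y1, y2])
# ... and only approximates it elsewhere
print(P(1.5), f(1.5))
```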
This interpolating polynomial of degree 2 is unique.

Proof:

▶ Let P_1(x) be another polynomial of degree at most 2 with P_1(x_0) = y_0, P_1(x_1) = y_1, and P_1(x_2) = y_2. Then P(x) = P_1(x) for x = x_0, x = x_1, x = x_2.
▶ So, P(x) - P_1(x) is a polynomial of degree at most 2 that vanishes at three distinct values of x.
▶ Since a non-zero polynomial of degree at most 2 has at most 2 roots, P(x) - P_1(x) must be identically 0, i.e., P(x) = P_1(x).

In general, of course, the polynomial P(x) differs from f(x) at values of x other than x_0, x_1, and x_2.
In the general case of n + 1 points x_0 < x_1 < ... < x_n,

    P_n(x) = \sum_{k=0}^{n} f(x_k) \frac{(x - x_0)(x - x_1) \cdots (x - x_{k-1})(x - x_{k+1}) \cdots (x - x_n)}{(x_k - x_0)(x_k - x_1) \cdots (x_k - x_{k-1})(x_k - x_{k+1}) \cdots (x_k - x_n)}        (4)

Weierstrass proved that polynomials can approximate arbitrarily well any continuous real function on an interval.

Equation (4) corresponds to an approximation of f(x) using a superposition of simpler functions.
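A short sketch of Equation (4) for an arbitrary number of nodes; the nodes and the target function below are illustrative only.

```python
# A sketch of the general interpolant of Equation (4) for nodes x_0, ..., x_n.
import numpy as np

def lagrange_interpolant(xs, ys):
    """Return P_n as a callable, given nodes xs and values ys = f(xs)."""
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)

    def P(x):
        total = 0.0
        for k, (xk, yk) in enumerate(zip(xs, ys)):
            others = np.delete(xs, k)                     # all nodes except x_k
            term = np.prod((x - others) / (xk - others))  # k-th basis polynomial
            total += yk * term
        return total

    return P

xs = np.linspace(0.0, np.pi, 6)           # 6 uniformly spaced nodes (example)
P = lagrange_interpolant(xs, np.sin(xs))
print(P(1.0), np.sin(1.0))                # close, but not exact, between nodes
```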
Generalizing

Let f(x) be a real function of a real-valued vector x = [x_1 x_2 ... x_n]^T that is square integrable over the real numbers.

The goal of function approximation is to describe the behavior of f(x) in a compact region S of the input space using a superposition of simpler functions φ_i(x, w), i.e.,

    \hat{f}(\phi(x, w), W) = \sum_{i=1}^{\tilde{n}} W_i \, \phi_i(x, w)        (5)

where the W_i's are real-valued constants such that,

    |f(x) - \hat{f}(\phi(x, w), W)| < \epsilon                                 (6)

and ε can be arbitrarily small.

So we obtain the value of f(x) at x ∈ S based on a combination of simpler or elementary functions {φ_i(x, w)}.
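As an illustration of Equation (5), the sketch below superposes simple basis functions to approximate a target f on a compact region. Gaussian bumps are an arbitrary example choice for the φ_i, and least squares is just one way to pick the W_i; neither is prescribed by the slides.

```python
# A sketch of Equation (5): approximate f by a weighted superposition of
# simple basis functions.  Gaussian bumps are used purely as an example of
# the phi_i; any other basis could be substituted.
import numpy as np

def f(x):
    return np.sin(2 * x) + 0.5 * x         # illustrative target function

centers = np.linspace(0.0, 3.0, 8)         # parameters "w" of the phi_i
width = 0.5

def phi(x, c):
    return np.exp(-((x - c) ** 2) / (2 * width ** 2))

# Sample f on the compact region S = [0, 3]
xs = np.linspace(0.0, 3.0, 50)
Phi = np.stack([phi(xs, c) for c in centers], axis=1)   # 50 x 8 design matrix

# Choose the weights W_i by least squares so that f_hat is close to f on S
W, *_ = np.linalg.lstsq(Phi, f(xs), rcond=None)

def f_hat(x):
    return sum(Wi * phi(x, c) for Wi, c in zip(W, centers))

print(np.max(np.abs(f(xs) - f_hat(xs))))   # approximation error on the samples
```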
Generalizing

There are many possible choices for {φ_i(x, w)}. The polynomial we saw before is one possibility.

We would prefer one set of {φ_i(x, w)} over another if it provides a smaller error for a given number of inputs h, or if it is computationally more efficient.

As a side note, observe that if we make the number of inputs equal to the number of elementary functions {φ_i(x, w)}, i = 1, ..., ñ, then,

    \begin{bmatrix}
      \phi_1(x^{(1)}, w) & \phi_2(x^{(1)}, w) & \cdots & \phi_{\tilde{n}}(x^{(1)}, w) \\
      \phi_1(x^{(2)}, w) & \phi_2(x^{(2)}, w) & \cdots & \phi_{\tilde{n}}(x^{(2)}, w) \\
      \vdots             & \vdots             &        & \vdots                        \\
      \phi_1(x^{(\tilde{n})}, w) & \phi_2(x^{(\tilde{n})}, w) & \cdots & \phi_{\tilde{n}}(x^{(\tilde{n})}, w)
    \end{bmatrix}
    \begin{bmatrix} W_1 \\ W_2 \\ \vdots \\ W_{\tilde{n}} \end{bmatrix}
    =
    \begin{bmatrix} f(x^{(1)}) \\ f(x^{(2)}) \\ \vdots \\ f(x^{(\tilde{n})}) \end{bmatrix}        (7)

▶ ...and W = φ^{-1} f (assuming the inverse exists!)
▶ An important condition must therefore be placed on the elementary functions, i.e., that the inverse of φ exists.
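A sketch of Equation (7): with as many sample points as elementary functions, the weights follow from a square linear system (W = φ^{-1} f when φ is invertible). Monomial basis functions are used here purely as an example, and the system is solved rather than explicitly inverted.

```python
# A sketch of Equation (7) with n_tilde samples and n_tilde basis functions.
# Monomial basis functions phi_i(x) = x**(i-1) are an illustrative choice.
import numpy as np

def f(x):
    return np.cos(x)                        # illustrative target function

n_tilde = 5                                 # number of elementary functions
xs = np.linspace(0.0, 2.0, n_tilde)         # same number of input samples

# Phi[l, i] = phi_i(x^(l));  here phi_i(x) = x**i for i = 0, ..., n_tilde - 1
Phi = np.vander(xs, N=n_tilde, increasing=True)

# Solve Phi W = f rather than forming the inverse explicitly
W = np.linalg.solve(Phi, f(xs))

# The superposition reproduces f exactly at the sample points
f_hat = Phi @ W
print(np.max(np.abs(f_hat - f(xs))))        # ~ 0 (up to round-off)
```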
Geometric Interpretation

Equation (5) describes a projection of f(x) into a set of basis functions {φ_i(x, w)}. The basis functions define a manifold, and \hat{f}(φ(x, w), W) is the image or projection of f(x) in this manifold.
Choices for Elementary Functions

If the elementary functions are not chosen properly, then there will always be an error no matter how large ñ is.

One requirement that we have seen is that φ^{-1}(·) must exist. This condition is met if the elementary functions constitute a basis, i.e., are linearly independent.

Fourier series and wavelets are two widely used bases.

In neural networks, (i) the bases are dependent on the data (as opposed to being fixed), and (ii) the coefficients (weights) are adapted as opposed to analytically computed.
When f(x) is non-linear, there is no natural choice of basis. Volterra expansions, splines, etc. are some bases that have been tried. As we saw before, Weierstrass showed that polynomials are universal approximators.

The difficulty is that either too many terms are required or the approximation is not well behaved.
Multi-Layered Perceptrons (MLP)

Multi-layered perceptrons (MLPs) often use a basis of sigmoidal functions.

The hidden layer changes the basis, and consequently the manifold, and the output layer finds the best projection within the manifold.
Recall,

    h_j^{(l)} = \sigma\left(S_j^{(l)}\right) = \sigma\left(\sum_{k=0}^{n} w_{jk} \, x_k^{(l)}\right)        (8)

    \hat{y}_i^{(l)} = \sigma\left(S_i^{(l)}\right) = \sigma\left(\sum_{j=0}^{\tilde{n}} W_{ij} \, h_j^{(l)}\right)        (9)

where σ(a) = 1/(1 + e^{-a}) is the sigmoid function.

Training is done by optimizing,

    J = \sum_{l=1}^{N} J^{(l)} = \sum_{l=1}^{N} \sum_{i=1}^{m} \left( y_i^{(l)} - \hat{y}_i^{(l)} \right)^2        (10)
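A minimal sketch of the forward pass of Equations (8)-(9) and the cost of Equation (10), assuming a single hidden layer with sigmoid units and bias terms carried by x_0 = 1 and h_0 = 1; the layer sizes and data are illustrative, and the training step (gradient descent on J) is not shown.

```python
# Forward pass of Equations (8)-(9) and squared-error cost of Equation (10).
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

n, n_hidden, m = 3, 5, 2                    # inputs, hidden units, outputs
rng = np.random.default_rng(0)
w = rng.normal(size=(n_hidden, n + 1))      # hidden weights (incl. bias, k = 0)
W = rng.normal(size=(m, n_hidden + 1))      # output weights (incl. bias, j = 0)

def forward(x):
    x = np.concatenate(([1.0], x))          # x_0 = 1 plays the role of the bias
    h = sigmoid(w @ x)                      # Equation (8)
    h = np.concatenate(([1.0], h))          # h_0 = 1 for the output-layer bias
    return sigmoid(W @ h)                   # Equation (9)

# Squared-error cost of Equation (10) over N patterns
X = rng.normal(size=(4, n))                 # N = 4 input patterns
Y = rng.uniform(size=(4, m))                # corresponding targets
J = sum(np.sum((y - forward(x)) ** 2) for x, y in zip(X, Y))
print(J)
```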
Approximation Capabilities of MLPs

Let f(x) be the function to be approximated. Further, let us assume that l uniformly sampled points (x^{(1)}, x^{(2)}, ..., x^{(l)}) in (a, b) are known. Let y^{(i)} = f(x^{(i)}).

x^{(i+1)} - x^{(i)} = Δx = (b - a)/l, with x^{(1)} - Δx/2 = a and x^{(l)} + Δx/2 = b.
Define a function,

    \zeta(x) = \frac{1}{2}\,\mathrm{sgn}(x) + \frac{1}{2} =
      \begin{cases} 0 & x < 0 \\ \text{undefined} & x = 0 \\ 1 & x > 0 \end{cases}        (11)

Then,

    f(x) \approx \sum_{i=1}^{l} y^{(i)} \left[ \zeta\left(x - x^{(i)} + \frac{\Delta x}{2}\right) - \zeta\left(x - x^{(i)} - \frac{\Delta x}{2}\right) \right]        (12)

Now,

    \zeta\left(x - x^{(i)} + \frac{\Delta x}{2}\right) - \zeta\left(x - x^{(i)} - \frac{\Delta x}{2}\right)
      = \frac{1}{2}\,\mathrm{sgn}\left(x - x^{(i)} + \frac{\Delta x}{2}\right) - \frac{1}{2}\,\mathrm{sgn}\left(x - x^{(i)} - \frac{\Delta x}{2}\right)        (13)
Each term of the summation above can be produced by a pair of units, each computing a shifted sgn(·).

We can replace sgn(·) with a steep sigmoid.

Each vertical bar (bump) can then be produced by a pair of neurons, and we can approximate f(x) to any desired degree of accuracy (see the sketch after this list).

▶ Many authors (e.g., Cybenko, Hornik, and others) have formally shown that the set of functions realized by a neural network with a single hidden layer containing a finite number of neurons is dense in the space of continuous functions on a compact set.
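The sketch below illustrates the construction of Equations (12)-(13) with sgn(·) replaced by a steep sigmoid: each sample point contributes one approximately rectangular bump built from a pair of sigmoidal units. The target f, the interval (a, b), the number of samples l, and the steepness are illustrative choices, not values from the slides.

```python
# Approximating f by a sum of "bumps", each made from a pair of steep sigmoids.
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def f(x):
    return np.sin(3 * x) * np.exp(-x)       # example function on (a, b)

a, b, l = 0.0, 2.0, 40
dx = (b - a) / l
xi = a + dx / 2 + dx * np.arange(l)          # l uniformly spaced sample points
yi = f(xi)

steepness = 200.0                            # large slope approximates sgn(.)

def f_hat(x):
    # pair of steep sigmoids per sample point = one approximately rectangular
    # bump of height y^(i) on (x^(i) - dx/2, x^(i) + dx/2)
    left = sigmoid(steepness * (x - (xi - dx / 2)))
    right = sigmoid(steepness * (x - (xi + dx / 2)))
    return np.sum(yi * (left - right))

xs = np.linspace(a, b, 400)
err = max(abs(f(x) - f_hat(x)) for x in xs)
print(err)                                   # shrinks as l grows
```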
Sigmoidal Neuron With 1 Input

[Figure: output of a single sigmoidal neuron with one input, σ(w_0 + w_1 x), for three parameter settings: w_0 = 0.5, w_1 = 0.1; w_0 = -5, w_1 = 0.8; w_0 = -1, w_1 = -0.1]
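A sketch that reproduces the curves in the figure above, assuming the neuron computes σ(w_0 + w_1 x); the plotting range is an arbitrary choice.

```python
# Single-input sigmoidal neuron, sigma(w0 + w1 * x), for the three legend settings.
import numpy as np
import matplotlib.pyplot as plt

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

params = [(0.5, 0.1), (-5.0, 0.8), (-1.0, -0.1)]   # (w0, w1) from the legend
xs = np.linspace(-20, 20, 400)

for w0, w1 in params:
    plt.plot(xs, sigmoid(w0 + w1 * xs), label=f"w0 = {w0}, w1 = {w1}")

plt.legend()
plt.show()
```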
Superposition of 4 Sigmoidal Neurons Each With 1 Input

[Figure: the outputs of four individual sigmoidal neurons (Neuron 1, Neuron 2, Neuron 3, ...) and the curve obtained by superposing them]
Classification

Though it is possible to set up a different cost function (e.g., cross entropy) or approach classification in a different way, thresholding the approximated output also results in classification.

For example,

[Figure: the superposition (output) of the network together with a threshold line]
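A minimal sketch of classification by thresholding: a real-valued approximator output is compared against a threshold to produce class labels. The stand-in output function and the threshold value are illustrative, not taken from the slides.

```python
# Classification by thresholding an approximated (real-valued) output.
import numpy as np

def f_hat(x):
    # stand-in for the trained network's output (any real-valued approximator)
    return 0.4 * np.sin(2 * np.pi * x)

threshold = 0.2

xs = np.linspace(0.0, 1.0, 11)
labels = (f_hat(xs) > threshold).astype(int)   # class 1 where output > threshold
print(list(zip(np.round(xs, 2), labels)))
```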