Week14_Applications
Week 14
Application: Approximation & optimization
Application: Solving a simple regression problem
This Week’s Content
• Application: Approximation & optimization
• Approximating Functions with Taylor Series
• Finding the Roots of a Function
• Application: Solving a simple regression problem
Approximating Functions with
Taylor Series
• In some engineering or scientific problems, we have only limited
access to a function.
• We might be given only the values of the function or of its
derivatives at certain input values, and be required to estimate
(approximate) the values of the function at other input values.
• The Taylor series is one method that provides a very elegant
way to approximate such a function.
Approximating Functions with
Taylor Series
• The Taylor series of a function 𝑓(⋅) is essentially an infinite sum of
terms that converges to the value of the function at 𝑥, i.e. 𝑓(𝑥). In
more formal terms, for an expansion point 𝑎:
𝑓(𝑥) = ∑𝑛=0…∞ 𝑓⁽ⁿ⁾(𝑎) (𝑥 − 𝑎)ⁿ / 𝑛!
where 𝑓⁽ⁿ⁾(𝑎) is the 𝑛-th derivative of 𝑓 evaluated at 𝑎 (and 𝑓⁽⁰⁾ = 𝑓).
• Note that the denominator (𝑛!) grows very quickly as 𝑛 gets large.
• Therefore, the higher-order terms in the Taylor series are likely to be very
small.
• For this reason, taking only the first 𝑚 terms of the series provides a sufficiently
good approximation to 𝑓(𝑥) in practice.
Taylor Series Example in Python
• Let us take a “simple” function, namely 𝑓(𝑥) = 𝑥³, and approximate it with a few
terms of its Taylor series expansion:
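A minimal sketch of what such an approximation could look like in Python (the expansion point 𝑎 = 1, the evaluation point 𝑥 = 2 and the helper names are illustrative choices, not the code shown in the lecture):

```python
import math

# Hard-coded derivatives of f(x) = x^3:
# f(x) = x^3, f'(x) = 3x^2, f''(x) = 6x, f'''(x) = 6 (all higher orders are 0).
derivatives = [
    lambda x: x ** 3,
    lambda x: 3 * x ** 2,
    lambda x: 6 * x,
    lambda x: 6.0,
]

def taylor_approx(x, a, m):
    """Approximate f(x) using the first m+1 terms of the Taylor series around a."""
    terms = min(m, len(derivatives) - 1)
    return sum(derivatives[n](a) * (x - a) ** n / math.factorial(n)
               for n in range(terms + 1))

a, x = 1.0, 2.0   # expansion point and evaluation point (illustrative values)
for m in range(4):
    print(f"m = {m}: approximation = {taylor_approx(x, a, m):6.2f}, true value = {x ** 3:.2f}")
```

With 𝑚 = 3 terms the approximation already matches 𝑓(2) = 8 exactly, since all higher-order derivatives of 𝑥³ vanish.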
Finding the Roots of a Function
• Let us assume that we have a non-linear, continuous function 𝑓(𝑥) that
intersects the horizontal axis as in Fig. 11.2.1.
• We are interested in finding 𝑟0,...,𝑟𝑛 such that 𝑓(𝑟𝑖)=0. We call these
intersections (𝑟0,𝑟1,𝑟2 in Fig. 11.2.1) the roots of function 𝑓().
• For many problems in Engineering and Mathematics, finding the values
of these intersections is very useful.
• For example, many problems requiring solutions to equations
like 𝑔(𝑥)=ℎ(𝑥) can be reformulated as a root finding problem
for 𝑔(𝑥)−ℎ(𝑥)=0.
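As a small illustration of this reformulation (the equation cos(𝑥) = 𝑥 and the bracketing interval are made-up examples, not taken from the lecture), SciPy's brentq root finder can be applied to 𝑔(𝑥) − ℎ(𝑥):

```python
import math
from scipy.optimize import brentq

# Suppose we want to solve g(x) = h(x) with g(x) = cos(x) and h(x) = x.
# We reformulate it as finding the root of d(x) = g(x) - h(x) = cos(x) - x.
def d(x):
    return math.cos(x) - x

# cos(x) - x changes sign on [0, 1], so a root lies in that interval.
root = brentq(d, 0.0, 1.0)
print(f"x where cos(x) = x: {root:.6f}")   # approximately 0.739085
```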
Newton’s Method for Finding the Roots
• Newton’s method is one of the many methods for finding the roots of a function.
• It is a very common method that starts at an initial value 𝑥0 (often chosen
randomly) and iteratively takes one step at a time towards a root.
• The method can be described with the following iterative steps:
• Step 1:
• Set an iteration variable, 𝑖, to 0. Initialize 𝑥𝑖 randomly; however, if you have a good
guess, it is always better to initialize 𝑥𝑖 with it.
• For our example function 𝑓(𝑥), see the selected 𝑥𝑖 in the figure below.
Newton’s Method for Finding the Roots
• Step 2:
• Find the intersection of the tangent line at 𝑥𝑖 with the horizontal axis.
• The tangent line to the function at 𝑥𝑖 is illustrated as the green dashed line in the
Figure below.
• This line can be easily derived using 𝑥𝑖 and 𝑓′(𝑥𝑖).
Newton’s Method for Finding the Roots
• Let us use 𝑥𝑖+1 to denote the intersection of the tangent line with the horizontal
axis.
• Using the definition of the derivative and simple geometric rules, we can easily
show that 𝑥𝑖+1 satisfies the following:
𝑥𝑖+1 = 𝑥𝑖 − 𝑓(𝑥𝑖) / 𝑓′(𝑥𝑖)
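A minimal Python sketch of this update rule, assuming we also have access to the derivative 𝑓′ (the example function, tolerance and iteration limit below are illustrative choices, not the lecture's code):

```python
def newtons_method(f, f_prime, x0, tol=1e-8, max_iter=100):
    """Iterate x_{i+1} = x_i - f(x_i) / f'(x_i) until f(x_i) is close enough to 0."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:          # close enough to a root
            return x
        x = x - fx / f_prime(x)    # Newton update (assumes f'(x) != 0)
    return x                       # best estimate after max_iter steps

# Example: f(x) = x^2 - 2, whose positive root is sqrt(2).
root = newtons_method(lambda x: x ** 2 - 2, lambda x: 2 * x, x0=1.0)
print(root)   # approximately 1.41421356
```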
An Application: Solving a Simple
Regression Problem
• In regression, we are given a set of input–output pairs (𝑥𝑖, 𝑦𝑖) and we wish to
find a function 𝑓() such that 𝑓(𝑥𝑖) is as close as possible to 𝑦𝑖.
• At first, this might look difficult since we are trying to estimate a function from
its input–output values, and there can be infinitely many functions that go
through a finite set of (𝑥𝑖, 𝑦𝑖) values.
• However, by restructuring the problem a little and making some assumptions
about the general form of 𝑓(), we can solve such regression problems very well.
An Application: Solving a Simple
Regression Problem
• Let us briefly describe how we can do so: we first re-write 𝑓() as a parametric function
and formulate the regression problem as the problem of finding the parameters
of 𝑓():
• where 𝜃 = (𝜃1,...,𝜃𝑛) is the set of parameters that we need to find in order to solve the
regression problem;
• and 𝑔1(),...,𝑔𝑚() are “simpler” functions (compared to 𝑓()) whose parameters can
be identified more easily.
• With this parameterized definition of 𝑓(), we can formally define regression as an
optimization (minimization) problem:
arg min𝜃 ∑𝑖 (𝑦𝑖 − 𝑓(𝑥𝑖; 𝜃))²
• where the summation runs over the (𝑥0, 𝑦0), (𝑥1, 𝑦1),..., (𝑥𝑛, 𝑦𝑛) values that we are trying
to regress;
• and (𝑦𝑖 − 𝑓(𝑥𝑖; 𝜃))² is the error between the estimated (regressed) value 𝑓(𝑥𝑖; 𝜃) and
the correct value 𝑦𝑖, with 𝜃 as the parameters.
An Application: Solving a Simple
Regression Problem
• We can describe this optimization formulation in words as:
• Find parameters 𝜃 that minimize the error between 𝑦𝑖 and what the function is
predicting given 𝑥𝑖 as input.
• This is a minimization problem and SciPy can be used for such problems.
• We will spend more time on this with sample regression problems and
solutions. Namely,
• Linear regression, where 𝑓() is assumed to be a line, 𝑦=𝑎𝑥+𝑏, and 𝜃 is (𝑎,𝑏).
• Non-linear regression, where 𝑓() has a non-linear nature, e.g. a quadratic,
exponential or logarithmic form.
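A sketch of how the linear case could be set up with SciPy's minimize (the synthetic data, the initial guess and the variable names below are illustrative assumptions, not the lecture's code):

```python
import numpy as np
from scipy.optimize import minimize

# Synthetic data roughly following y = 2x + 1 plus noise.
rng = np.random.default_rng(0)
x_data = np.linspace(0, 10, 50)
y_data = 2.0 * x_data + 1.0 + rng.normal(scale=0.5, size=x_data.size)

def objective(theta):
    """Sum of squared errors for the line y = a*x + b with theta = (a, b)."""
    a, b = theta
    return np.sum((y_data - (a * x_data + b)) ** 2)

result = minimize(objective, x0=[0.0, 0.0])   # start from an arbitrary guess
a_hat, b_hat = result.x
print(f"estimated a = {a_hat:.3f}, b = {b_hat:.3f}")   # should be close to 2 and 1
```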
Why is regression important?
• In many disciplines, we observe an environment, an event, or an entity through
sensors or some data collected in a different manner, and obtain some
information about what we are observing in the form of a variable 𝑥.
• Each 𝑥 generally has an associated outcome variable 𝑦.
• Or 𝑥 can represent an action that we are exerting in an environment and we are
interested in finding how 𝑥 affects the environment by observing a variable 𝑦.
• Here are some examples for 𝑥 and 𝑦 for which finding a function 𝑓() can be really
useful:
• 𝑥: The current that we provide to a motor. 𝑦: The speed of the motor.
• 𝑥: The day of the year. 𝑦: The average rainfall on that day.
• 𝑥: The changes in the value of a bond during the last day. 𝑦: The value of the bond
one hour later.
• 𝑥: The number of COVID cases in a day. 𝑦: The number of deaths due to
COVID.
The form of the function
• Solving the regression problem
without making any assumption
about the underlying function is very
challenging.
• We can simplify the problem by
making an assumption about the
function.
• However, an incorrect assumption
can yield an inaccurate regression
model.
• As illustrated, if the form of the
function is too simple (e.g. linear), the
fitted function will not be able to
capture the true nature of the data.
• However, if the assumed function form
is too complex (e.g. a higher-order
polynomial), the fitted function will try
to go through every data point, which
may not be ideal since the data can be
noisy.
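A small sketch of this trade-off using NumPy's polyfit (the data and the polynomial degrees are made-up choices for illustration):

```python
import numpy as np

# Noisy samples of an underlying quadratic function.
rng = np.random.default_rng(1)
x = np.linspace(-3, 3, 15)
y = x ** 2 + rng.normal(scale=1.0, size=x.size)

for degree in (1, 2, 9):                      # too simple, about right, too complex
    coeffs = np.polyfit(x, y, degree)         # least-squares polynomial fit
    residual = np.sum((np.polyval(coeffs, x) - y) ** 2)
    print(f"degree {degree}: residual on the training data = {residual:.3f}")

# The degree-9 fit has the smallest residual on these points, but it mostly
# follows the noise and typically generalizes worse than the degree-2 fit.
```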
Least-Squares Regression
• A commonly used method for regression is called Least-Squares Regression
(LSE).
• In LSE, the minimization problem in Section 12.1 is solved by setting the
gradient of the objective with respect to 𝜃 to zero:
∇𝜃 ∑𝑖 (𝑦𝑖 − 𝑓(𝑥𝑖; 𝜃))² = 0
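As an illustration (our own sketch, not reproduced from the slides), for the linear case 𝑓(𝑥; 𝜃) = 𝑎𝑥 + 𝑏 mentioned earlier this gradient condition can be written out explicitly:

```latex
% Objective for the linear case f(x; a, b) = a*x + b
E(a, b) = \sum_i \bigl(y_i - a x_i - b\bigr)^2

% Setting both partial derivatives to zero:
\frac{\partial E}{\partial a} = -2 \sum_i x_i \bigl(y_i - a x_i - b\bigr) = 0, \qquad
\frac{\partial E}{\partial b} = -2 \sum_i \bigl(y_i - a x_i - b\bigr) = 0
```

Rearranging yields the two “normal equations” 𝑎 ∑𝑥𝑖² + 𝑏 ∑𝑥𝑖 = ∑𝑥𝑖𝑦𝑖 and 𝑎 ∑𝑥𝑖 + 𝑏𝑁 = ∑𝑦𝑖 (with 𝑁 data points), which can be solved directly for 𝑎 and 𝑏.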