
Assignment 2

Abhinav Pradeep
Tutorial group 12
September 3, 2023

1 Question 1.
Note, all code was written in one file:

z = @(x) (x(1)^2+x(2)-11).^2+(x(1)+x(2)^2-7).^2;
Min1 = fminsearch(z,[1,1]);
disp("Minimum at: (x,y)");
disp(Min1);
Min2 = fminsearch(z,[-1,1]);
disp("Minimum at: (x,y)");
disp(Min2);
Min3 = fminsearch(z,[-1,-1]);
disp("Minimum at: (x,y)");
disp(Min3);
Min4 = fminsearch(z,[2,-1]);
disp("Minimum at: (x,y)");
disp(Min4);

[X,Y] = meshgrid(-5:0.1:5,-5:0.1:5);
Z = arrayfun(@(x,y) z([x,y]), X, Y);
hSurf = surf(X,Y,Z);
xlabel('X')
ylabel('Y')
zlabel('Z')

hold on
plot3(-3.779310,-3.283186,0,'ro','MarkerSize',10,'MarkerFaceColor','r')
plot3(-2.805118,3.131312,0,'ro','MarkerSize',10,'MarkerFaceColor','r')
plot3(3.584428,-1.848126,0,'ro','MarkerSize',10,'MarkerFaceColor','r')
plot3(3.000000, 2.000000, 0,'ro','MarkerSize',10,'MarkerFaceColor','r')

The answers provided below are snippets of this file.

1.1 a
z = @(x) (x(1)^2+x(2)-11).^2+(x(1)+x(2)^2-7).^2;
Min1 = fminsearch(z,[1,1]);
disp("Minimum at: (x,y)");
disp(Min1);
Min2 = fminsearch(z,[-1,1]);
disp("Minimum at: (x,y)");
disp(Min2);
Min3 = fminsearch(z,[-1,-1]);
disp("Minimum at: (x,y)");
disp(Min3);
Min4 = fminsearch(z,[2,-1]);
disp("Minimum at: (x,y)");
disp(Min4);

1.2 b
[X,Y] = meshgrid(-5:0.1:5,-5:0.1:5);
Z = arrayfun(@(x,y) z([x,y]), X, Y);
hSurf = surf(X,Y,Z);
xlabel('X')
ylabel('Y')
zlabel('Z')

1.3 c
[X,Y] = meshgrid(-5:0.1:5,-5:0.1:5);
Z = arrayfun(@(x,y) z([x,y]), X, Y);
hSurf = surf(X,Y,Z);
xlabel('X')
ylabel('Y')
zlabel('Z')

hold on
plot3(-3.779310,-3.283186,0,'ro','MarkerSize',10,'MarkerFaceColor','r')
plot3(-2.805118,3.131312,0,'ro','MarkerSize',10,'MarkerFaceColor','r')
plot3(3.584428,-1.848126,0,'ro','MarkerSize',10,'MarkerFaceColor','r')
plot3(3.000000, 2.000000, 0,'ro','MarkerSize',10,'MarkerFaceColor','r')
Console outputs:

Minimum at: (x,y)
    3.0000    2.0000

Minimum at: (x,y)
   -2.8051    3.1313

Minimum at: (x,y)
   -3.7793   -3.2832

Minimum at: (x,y)
    3.5844   -1.8481
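
As a quick sanity check (not part of the original submission), the objective can be evaluated at each returned point; assuming z from the script above is still in scope, all four values should come out approximately zero, consistent with each point being a minimum:

% Hypothetical check: z should be ~0 at each minimiser returned by fminsearch.
pts = [3.0000 2.0000; -2.8051 3.1313; -3.7793 -3.2832; 3.5844 -1.8481];
for i = 1:size(pts, 1)
    fprintf('z(%.4f, %.4f) = %.3e\n', pts(i,1), pts(i,2), z(pts(i,:)));
end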

2 Question 2.
\[ z(x, y) = (x^2 + y - 11)^2 + (x + y^2 - 7)^2, \qquad x + y = 0 \]
Let $g(x, y) = x + y$.

Solutions of the system of equations below are the extrema along the restriction $x + y = 0$:
\[ \nabla z = \lambda \nabla g, \qquad x + y = 0 \]
Find $\nabla z$:
\[ \nabla z = \left( \frac{\partial}{\partial x}\Big[ (x^2+y-11)^2 + (x+y^2-7)^2 \Big],\ \frac{\partial}{\partial y}\Big[ (x^2+y-11)^2 + (x+y^2-7)^2 \Big] \right) \]
\[ \nabla z = \big( 2(x^2+y-11)\cdot 2x + 2(x+y^2-7)\cdot 1,\ \ 2(x^2+y-11)\cdot 1 + 2(x+y^2-7)\cdot 2y \big) \]
\[ \nabla z = \big( 4x(x^2+y-11) + 2(x+y^2-7),\ \ 2(x^2+y-11) + 4y(x+y^2-7) \big) \]
Find $\nabla g$:
\[ \nabla g = \left( \frac{\partial}{\partial x}(x+y),\ \frac{\partial}{\partial y}(x+y) \right) = (1, 1) \]
Therefore, the system of equations becomes
\[ \begin{pmatrix} 4x(x^2+y-11) + 2(x+y^2-7) \\ 2(x^2+y-11) + 4y(x+y^2-7) \end{pmatrix} = \lambda \cdot \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} \lambda \\ \lambda \end{pmatrix}, \qquad x + y = 0 \]
Hence, eliminating $\lambda$, the system reduces to:
\[ 4x(x^2+y-11) + 2(x+y^2-7) = 2(x^2+y-11) + 4y(x+y^2-7), \qquad x + y = 0 \]
These relations were plotted on Desmos to locate their intersections.
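
The same system can also be checked numerically. A minimal sketch using fsolve (it assumes the Optimization Toolbox, which the fmincon call below already requires; the handle name F and the starting guess are illustrative choices, not part of the original file):

% Solve the reduced Lagrange system numerically with fsolve.
F = @(v) [4*v(1)*(v(1)^2 + v(2) - 11) + 2*(v(1) + v(2)^2 - 7) ...
          - 2*(v(1)^2 + v(2) - 11) - 4*v(2)*(v(1) + v(2)^2 - 7);   % gradient condition
          v(1) + v(2)];                                            % constraint x + y = 0
sol = fsolve(F, [1; -1]);
disp(sol);   % from this starting guess it should land near (2.8548, -2.8548)

Depending on the starting guess, fsolve may return a different intersection of the two relations.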
2.1 a, b, c done in one file:
z = @(x) (x(1)^2+x(2)-11).^2+(x(1)+x(2)^2-7).^2;

[X,Y] = meshgrid(-5:0.1:5,-5:0.1:5);
Z = arrayfun(@(x,y) z([x,y]), X, Y);
contour(X,Y,Z, 'LevelStep', 5)

hold on
fimplicit(@(x,y) x+y, [-5 5 -5 5])

A = [1 1];
b = 0;
x0 = [1,1];
Min = fmincon(z,x0,A,b);
disp(Min);

plot(Min(1), Min(2), 'ro', 'MarkerSize', 10, 'MarkerFaceColor', 'r')

Console output:

Local minimum found that satisfies the constraints.

Optimization completed because the objective function is non-decreasing in
feasible directions, to within the value of the optimality tolerance,
and constraints are satisfied to within the value of the constraint tolerance.

<stopping criteria details>

    2.8548   -2.8548
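
As a side note (not part of the original file), the call above encodes the constraint as the linear inequality A*x <= b, i.e. $x + y \le 0$. A sketch of the equivalent call using fmincon's linear equality arguments, assuming z and x0 from the script above are still in scope, would be:

% Hedged alternative: encode x + y = 0 as a linear equality constraint Aeq*x = beq.
Aeq = [1 1];  beq = 0;
MinEq = fmincon(z, x0, [], [], Aeq, beq);
disp(MinEq);   % should again be close to (2.8548, -2.8548)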

This does agree with my understanding of Lagrange multipliers, as the minimum calculated by MATLAB is a solution of the Lagrange multiplier system of equations derived earlier; the system does yield a solution at $x = 2.8548$ and $y = -2.8548$. From an intuitive standpoint, the minimum lies at a point where the constraint curve is tangent to a contour line. Contour lines are level curves, and the constraint curve is itself a level curve. For any point $P$ on a level curve $f = 0$, where $f : \mathbb{R}^n \to \mathbb{R}$, it can be shown that $\nabla f(P) \perp (f = 0)\,@P$ (note: $@P$ is shorthand for "at point $P$"). Below is a proof of this:

Consider $r(t) : \mathbb{R} \to \mathbb{R}^n$ which satisfies
\[ f(r(t)) = 0 \]
That is, the level curve defined by $f = 0$ is parameterised by $r(t)$. Then
\[ \frac{d}{dt} f(r(t)) = 0 \]
\[ \nabla f(r(t)) \cdot \frac{d}{dt} r(t) = 0 \]
The above equation says that for any point $P$ on $r(t)$, as $\frac{d}{dt}r(t)\,@P$ yields a vector tangent to $r(t)$ at $P$, the dot product of $\nabla f(P)$ and $\frac{d}{dt}r(t)\,@P$ is $0$. Hence, $\nabla f(P) \perp \frac{d}{dt}r(t)\,@P$.
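
As a quick concrete check (not in the original submission), the constraint curve used here, $g(x,y) = x + y = 0$, can be parameterised by $r(t) = (t, -t)$, and then
\[ \nabla g(r(t)) \cdot \frac{d}{dt} r(t) = (1, 1) \cdot (1, -1) = 0 \]
so the gradient $(1,1)$ is indeed perpendicular to the constraint line at every point.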

By the previous fact, when two level curves in $\mathbb{R}^2$ are tangent to one another, their gradients are parallel to one another. That is, for a level curve $z(x, y) = A$ and the constraint level curve $g(x, y) = 0$ (where $g(x, y) = x + y$), if they are tangent at some point $P$, then at $P$:
\[ \nabla z(P) = \lambda \nabla g(P) \]
This is precisely the Lagrange multiplier condition. Hence this does agree with my conceptual understanding of Lagrange multipliers.
3 Question 3.
3.1 a
$f : \mathbb{R}^n \to \mathbb{R}$ is differentiable. That is:
\[ \lim_{k \to 0} \frac{f(a+k) - f(a) - \nabla f(a)\cdot k}{\|k\|} = 0 \]
where $k \in \mathbb{R}^n$.

Prove: $\exists\, E(k) = (E_1(k), E_2(k), \dots, E_n(k))$, where $E_i : \mathbb{R}^n \to \mathbb{R}$, such that
\[ f(a+k) - f(a) = \big(\nabla f(a) + E(k)\big)\cdot k \qquad \text{and} \qquad \lim_{k\to 0} E_i(k) = 0 \ \ \forall i \]

$f$ is differentiable $\Leftrightarrow$ there exists a multivariate function $s : \mathbb{R}^n \to \mathbb{R}$ such that
\[ f(a+k) - f(a) = \nabla f(a)\cdot k + s(k)\|k\| \qquad \text{and} \qquad \lim_{k\to 0} s(k) = 0 \]
Consider $s : \mathbb{R}^n \to \mathbb{R}$ defined such that:
\[ f(a+k) - f(a) = \nabla f(a)\cdot k + s(k)\|k\| \]
Such an $s$ can be defined as $f$ is differentiable and therefore $\nabla f(a)$ exists. Moreover, differentiability implies continuity and therefore, $\forall a$ and $\forall k$, $f(a)$ and $f(a+k)$ exist.

As all outputs are in $\mathbb{R}$, the algebra below is valid:
\[ \frac{f(a+k)-f(a)}{\|k\|} = \frac{\nabla f(a)\cdot k}{\|k\|} + s(k) \]
\[ \frac{f(a+k)-f(a)}{\|k\|} - \frac{\nabla f(a)\cdot k}{\|k\|} = s(k) \]
\[ \frac{f(a+k)-f(a)-\nabla f(a)\cdot k}{\|k\|} = s(k) \]
Taking $\lim_{k\to 0}$:
\[ \lim_{k\to 0}\frac{f(a+k)-f(a)-\nabla f(a)\cdot k}{\|k\|} = \lim_{k\to 0}s(k) \]
The definition of differentiability requires that:
\[ \lim_{k\to 0}\frac{f(a+k)-f(a)-\nabla f(a)\cdot k}{\|k\|} = 0 \]
Hence $\lim_{k\to 0}s(k) = 0$.

Therefore $f$ is differentiable $\Leftrightarrow$ $\exists\, s : \mathbb{R}^n \to \mathbb{R}$ such that:
\[ f(a+k)-f(a) = \nabla f(a)\cdot k + s(k)\|k\| \]
and
\[ \lim_{k\to 0}s(k) = 0 \]
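
As a concrete illustration (not part of the original proof), take $f(x) = \|x\|^2$, for which $\nabla f(a) = 2a$. Then
\[ f(a+k) - f(a) = \|a+k\|^2 - \|a\|^2 = 2a\cdot k + \|k\|^2 = \nabla f(a)\cdot k + \|k\|\,\|k\| \]
so in this case $s(k) = \|k\|$, which indeed satisfies $\lim_{k\to 0} s(k) = 0$.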

Consider the initial expression:
\[ f(a+k)-f(a) = \nabla f(a)\cdot k + s(k)\|k\| \]
Define $E : \mathbb{R}^n \to \mathbb{R}^n$,
\[ E(x) = \big(E_1(x), E_2(x), E_3(x), \dots, E_n(x)\big), \qquad E_i : \mathbb{R}^n \to \mathbb{R} \]
such that:
\[ E(k)\cdot k = s(k)\|k\| \]
The existence of $s(k)$ for all $k$ guarantees that such an $E(k)$ exists; an explicit construction is given below. Hence,
\[ f(a+k)-f(a) = \nabla f(a)\cdot k + E(k)\cdot k = \big(\nabla f(a) + E(k)\big)\cdot k \]
Consider,
\[ E(k)\cdot k = s(k)\|k\| \]
To satisfy this condition, for $k \neq 0$ set:
\[ E_i(k) = \frac{s(k)\,k_i}{\|k\|} \]
\[ E(k) = \left( \frac{s(k)\,k_1}{\|k\|},\ \frac{s(k)\,k_2}{\|k\|},\ \frac{s(k)\,k_3}{\|k\|},\ \dots,\ \frac{s(k)\,k_n}{\|k\|} \right) \]
($E(0)$ may be defined arbitrarily, e.g. $E(0) = 0$, since the limit below only concerns $k \neq 0$.) This ensures that
\[ E(k)\cdot k = \sum_{i=1}^n \frac{s(k)\,k_i}{\|k\|}\cdot k_i = \frac{s(k)}{\|k\|}\sum_{i=1}^n k_i^2 = \frac{s(k)\,\|k\|^2}{\|k\|} = s(k)\|k\| \]
Now consider that $\forall i \in [1, n]$, as $k_i$ is just one component of $k$,
\[ |k_i| \le \|k\| \]
This implies that
\[ \frac{|k_i|}{\|k\|} \le 1 \]
\[ |E_i(k)| = |s(k)|\,\frac{|k_i|}{\|k\|} \le |s(k)| \]
\[ -|s(k)| \le E_i(k) \le |s(k)| \]
Taking $\lim_{k\to 0}$,
\[ \lim_{k\to 0} \big(-|s(k)|\big) \le \lim_{k\to 0} E_i(k) \le \lim_{k\to 0} |s(k)| \]
As $\lim_{k\to 0} s(k) = 0$,
\[ 0 \le \lim_{k\to 0} E_i(k) \le 0 \]
Hence by the squeeze theorem,
\[ \lim_{k\to 0} E_i(k) = 0 \]
This applies $\forall i \in [1, n]$.
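
Continuing the earlier illustration with $f(x) = \|x\|^2$ (again, not part of the original submission): there $s(k) = \|k\|$, so the construction gives $E_i(k) = \|k\|\,k_i/\|k\| = k_i$, and indeed
\[ f(a+k) - f(a) = 2a\cdot k + \|k\|^2 = (2a + k)\cdot k = \big(\nabla f(a) + E(k)\big)\cdot k, \qquad \lim_{k\to 0} E_i(k) = \lim_{k\to 0} k_i = 0 \]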

3.2 b
Prove that:
\[ \lim_{h\to 0} E_i\big(x(t_0+h) - x(t_0)\big) = 0 \]
As differentiability implies continuity,
\[ \lim_{h\to 0}\big(x(t_0+h) - x(t_0)\big) = 0 \]
where $\big(x(t_0+h)-x(t_0)\big) \in \mathbb{R}^n$ (as $x(t) \in \mathbb{R}^n$) and $0 \in \mathbb{R}^n$ is the $n$-dimensional zero vector.

Consider:
\[ \lim_{k\to 0} E_i(k) = 0 \]
This is equivalent to saying that $\forall \epsilon_1 > 0 \ \exists \delta_1 > 0$ such that if
\[ 0 < \|k\| < \delta_1 \]
then
\[ |E_i(k)| < \epsilon_1 \]
Consider:
\[ \lim_{h\to 0}\big(x(t_0+h)-x(t_0)\big) = 0 \]
This is equivalent to saying that $\forall \epsilon_2 > 0 \ \exists \delta_2 > 0$ such that if
\[ 0 < |h| < \delta_2 \]
then
\[ \|x(t_0+h)-x(t_0)\| < \epsilon_2 \]
Fix $\delta_1$. Set $\delta_2$ such that if
\[ 0 < |h| < \delta_2 \]
then
\[ \|x(t_0+h)-x(t_0)\| < \delta_1 \]
As $\big(x(t_0+h)-x(t_0)\big) \in \mathbb{R}^n$ it is a valid input to $E_i$, and $\|x(t_0+h)-x(t_0)\| < \delta_1$ implies that:
\[ \big|E_i\big(x(t_0+h)-x(t_0)\big)\big| < \epsilon_1 \]
As this applies for any $\epsilon_1 > 0$ with a corresponding $\delta_1 > 0$ (and hence a corresponding $\delta_2 > 0$), the above can be rewritten as: $\forall \epsilon > 0 \ \exists \delta > 0$ such that if
\[ 0 < |h| < \delta \]
then
\[ \big|E_i\big(x(t_0+h)-x(t_0)\big)\big| < \epsilon \]
Hence,
\[ \lim_{h\to 0} E_i\big(x(t_0+h)-x(t_0)\big) = 0 \]
4 Question 4.
4.1 a
Show that $\nabla_x \|x-y\|^2 = 2(x-y)$, where
\[ \nabla_x = \left( \frac{\partial}{\partial x_1}, \frac{\partial}{\partial x_2}, \frac{\partial}{\partial x_3}, \dots, \frac{\partial}{\partial x_n} \right), \qquad \|x-y\|^2 = \sum_{i=1}^n (x_i - y_i)^2 \]
Therefore, $\|x-y\|^2$ is $\mathbb{R}^n \to \mathbb{R}$:
\[ \nabla_x\|x-y\|^2 = \left( \frac{\partial}{\partial x_1}\sum_{i=1}^n (x_i-y_i)^2,\ \frac{\partial}{\partial x_2}\sum_{i=1}^n (x_i-y_i)^2,\ \dots,\ \frac{\partial}{\partial x_n}\sum_{i=1}^n (x_i-y_i)^2 \right) \]
For $j \in [1, n]$,
\[ \frac{\partial}{\partial x_j}\sum_{i=1}^n (x_i-y_i)^2 = 2(x_j - y_j) \]
as
\[ \frac{\partial}{\partial x_j}(x_i-y_i)^2 = \begin{cases} 2(x_i - y_i) & \text{when } i = j \\ 0 & \text{when } i \neq j \end{cases} \]
Hence,
\[ \nabla_x\|x-y\|^2 = \big(2(x_1-y_1),\ 2(x_2-y_2),\ 2(x_3-y_3),\ \dots,\ 2(x_n-y_n)\big) \]
which is equivalently
\[ \nabla_x\|x-y\|^2 = 2(x-y) \]
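
A small numerical illustration of this identity (not part of the original submission; the vectors x, y and the step size h are arbitrary choices):

% Check grad_x ||x - y||^2 = 2(x - y) using central finite differences.
x = [1; -2; 0.5];   y = [0.3; 1; -1];
f = @(v) sum((v - y).^2);              % ||v - y||^2
h = 1e-6;  gfd = zeros(size(x));
for i = 1:numel(x)
    e = zeros(size(x));  e(i) = h;
    gfd(i) = (f(x + e) - f(x - e)) / (2*h);   % central difference in component i
end
disp([gfd, 2*(x - y)])                 % the two columns should agree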

4.2 b
$A : k \times n$, $x : n \times 1$, $u = Ax$. Hence, the dimensions of $u$ are $u : k \times 1$:
\[ u = \begin{pmatrix} \sum_{i=1}^n A_{1,i}x_i \\ \sum_{i=1}^n A_{2,i}x_i \\ \sum_{i=1}^n A_{3,i}x_i \\ \vdots \\ \sum_{i=1}^n A_{k,i}x_i \end{pmatrix} \]
$g : \mathbb{R}^k \to \mathbb{R}$. Show that
\[ \nabla_x g(u) = A^T \nabla_u g(u) \]

The dimensions involved are:
\[ \nabla_x : n \times 1, \qquad A^T = \begin{pmatrix} A_{1,1} & \dots & A_{k,1} \\ \vdots & \ddots & \vdots \\ A_{1,n} & \dots & A_{k,n} \end{pmatrix} : n \times k, \qquad \nabla_u : k \times 1, \qquad g(u) \in \mathbb{R} \]
with $g(u) = g(u_1, u_2, u_3, \dots, u_k)$. From the expression for $u$ above,
\[ u_j = \sum_{i=1}^n A_{j,i}x_i \]
Hence $u_j$ is a function of $x$, that is $u_j : \mathbb{R}^n \to \mathbb{R}$, and
\[ g(u) = g\big(u_1(x), u_2(x), u_3(x), \dots, u_k(x)\big) \]


Consider $\nabla_x g(u)$:
\[ \nabla_x g(u) = \begin{pmatrix} \dfrac{\partial}{\partial x_1} g\big(u_1(x), u_2(x), \dots, u_k(x)\big) \\ \dfrac{\partial}{\partial x_2} g\big(u_1(x), u_2(x), \dots, u_k(x)\big) \\ \vdots \\ \dfrac{\partial}{\partial x_n} g\big(u_1(x), u_2(x), \dots, u_k(x)\big) \end{pmatrix} \]
Consider $A^T \nabla_u g(u)$:
\[ \nabla_u g(u) = \begin{pmatrix} \dfrac{\partial}{\partial u_1} g(u_1, u_2, \dots, u_k) \\ \dfrac{\partial}{\partial u_2} g(u_1, u_2, \dots, u_k) \\ \vdots \\ \dfrac{\partial}{\partial u_k} g(u_1, u_2, \dots, u_k) \end{pmatrix} \]
\[ A^T \nabla_u g(u) = \begin{pmatrix} A_{1,1} & \dots & A_{k,1} \\ A_{1,2} & \dots & A_{k,2} \\ \vdots & \ddots & \vdots \\ A_{1,n} & \dots & A_{k,n} \end{pmatrix} \cdot \begin{pmatrix} \dfrac{\partial}{\partial u_1} g(u_1, u_2, \dots, u_k) \\ \vdots \\ \dfrac{\partial}{\partial u_k} g(u_1, u_2, \dots, u_k) \end{pmatrix} = \begin{pmatrix} \sum_{i=1}^k A_{i,1}\,\dfrac{\partial}{\partial u_i} g(u_1, u_2, \dots, u_k) \\ \sum_{i=1}^k A_{i,2}\,\dfrac{\partial}{\partial u_i} g(u_1, u_2, \dots, u_k) \\ \vdots \\ \sum_{i=1}^k A_{i,n}\,\dfrac{\partial}{\partial u_i} g(u_1, u_2, \dots, u_k) \end{pmatrix} \]

Hence, to show
\[ \nabla_x g(u) = A^T \nabla_u g(u) \]
is to show that the two column vectors above are equal componentwise; that is, to show that for $j \in [1, n]$,
\[ \frac{\partial}{\partial x_j} g\big(u_1(x), u_2(x), \dots, u_k(x)\big) = \sum_{i=1}^k A_{i,j}\,\frac{\partial}{\partial u_i} g(u_1, u_2, \dots, u_k) \]
Consider
\[ \frac{\partial}{\partial x_j} g\big(u_1(x), u_2(x), \dots, u_k(x)\big) \]
By the chain rule,
\[ \frac{\partial}{\partial x_j} g\big(u_1(x), u_2(x), \dots, u_k(x)\big) = \sum_{i=1}^k \frac{\partial u_i(x)}{\partial x_j}\cdot\frac{\partial}{\partial u_i} g(u_1, u_2, \dots, u_k) \]
Consider $\dfrac{\partial u_i(x)}{\partial x_j}$, where, from the expression for $u$ above,
\[ u_i = \sum_{l=1}^n A_{i,l}x_l \]
\[ \frac{\partial}{\partial x_j}\sum_{l=1}^n A_{i,l}x_l = A_{i,j} \]
as
\[ \frac{\partial}{\partial x_j} A_{i,l}x_l = \begin{cases} A_{i,l} & \text{when } l = j \\ 0 & \text{when } l \neq j \end{cases} \]
Therefore,
\[ \frac{\partial u_i(x)}{\partial x_j} = A_{i,j} \]
Hence
\[ \sum_{i=1}^k \frac{\partial u_i(x)}{\partial x_j}\cdot\frac{\partial}{\partial u_i} g(u_1, \dots, u_k) = \sum_{i=1}^k A_{i,j}\,\frac{\partial}{\partial u_i} g(u_1, \dots, u_k) \]
\[ \frac{\partial}{\partial x_j} g\big(u_1(x), u_2(x), \dots, u_k(x)\big) = \sum_{i=1}^k A_{i,j}\,\frac{\partial}{\partial u_i} g(u_1, u_2, \dots, u_k) \]
This holds for all $j \in [1, n]$.

Hence the componentwise equality holds for every $j \in [1, n]$, the two column vectors written above are equal, and therefore
\[ \nabla_x g(u) = A^T \nabla_u g(u) \]
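
A small numerical illustration of this identity (not part of the original submission; the choice g(u) = sum(u.^2), the random A, and the test point are arbitrary):

% Check grad_x g(Ax) = A' * grad_u g(u) numerically for g(u) = sum(u.^2).
k = 3;  n = 4;
A = randn(k, n);   x = randn(n, 1);
g  = @(u) sum(u.^2);                   % g : R^k -> R
gu = @(u) 2*u;                         % analytic grad_u g(u) (by 4.1a with y = 0)
h = 1e-6;  gx = zeros(n, 1);
for j = 1:n
    e = zeros(n, 1);  e(j) = h;
    gx(j) = (g(A*(x + e)) - g(A*(x - e))) / (2*h);   % finite-difference d/dx_j of g(Ax)
end
disp([gx, A' * gu(A*x)])               % the two columns should agree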

4.3 c
$(A^TA)$ is invertible $\Leftrightarrow$ $A$ and $A^T$ are invertible (moreover, $A$ is invertible $\Leftrightarrow$ $A^T$ is invertible). This restriction forces $k = n$, as $A$ and $A^T$ must be invertible.

\[ f(x) = \|Ax - y\|^2, \qquad Ax \in \mathbb{R}^n, \qquad y \in \mathbb{R}^n \]
Set $u = Ax$:
\[ f(x) = \|u - y\|^2, \qquad \nabla_x f(x) = \nabla_x\|u-y\|^2 \]
Set $g(u) = \|u-y\|^2$, so that $\nabla_x f(x) = \nabla_x g(u)$, where $g : \mathbb{R}^n \to \mathbb{R}$.
By 4.b,
\[ \nabla_x g(u) = A^T \nabla_u g(u) \]
By 4.a,
\[ \nabla_u g(u) = \nabla_u\|u-y\|^2 = 2(u-y) \]
Therefore,
\[ \nabla_x g(u) = A^T\,2(u-y) \]
Substituting back $u = Ax$ and $\nabla_x f(x) = \nabla_x g(u)$,
\[ \nabla_x f(x) = A^T\,2(Ax - y) \]
Critical points occur when $\nabla_x f(x) = 0$. Hence, critical points occur when:
\[ A^T\,2(Ax - y) = 0 \]
As $2 \in \mathbb{R}$ is nonzero,
\[ A^T(Ax - y) = 0 \]
\[ A^TAx - A^Ty = 0 \]
\[ A^TAx = A^Ty \]
As $A^T$ and $A$ are invertible,
\[ A^{-T}A^TAx = A^{-T}A^Ty \]
\[ Ax = y \]
\[ x = A^{-1}y \]
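
A brief numerical illustration (not part of the original submission; it assumes a square, invertible A as in this part, and the particular A and y are arbitrary):

% The critical point x_hat = A \ y makes ||A*x - y||^2 (numerically) zero,
% and perturbing it only increases the objective, consistent with part d.
A = [2 1; 1 3];   y = [1; -2];
xhat = A \ y;                          % solves A*x = y, i.e. x = inv(A)*y
fobj = @(x) norm(A*x - y)^2;
disp(fobj(xhat))                       % ~0 up to round-off
disp(fobj(xhat + [0.1; -0.2]))         % strictly larger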

4.4 d
To show:
\[ \|Aw - y\|^2 \ge \|A\hat{x} - y\|^2 \]
\[ Aw - y = Aw - A\hat{x} + A\hat{x} - y \]
\[ \|Aw-y\|^2 = \|Aw - A\hat{x} + A\hat{x} - y\|^2 \]
\[ \sum_{i=1}^n \big((Aw)_i - y_i\big)^2 = \sum_{i=1}^n \big((Aw)_i - (A\hat{x})_i + (A\hat{x})_i - y_i\big)^2 \]
\[ \sum_{i=1}^n \big((Aw)_i - y_i\big)^2 = \sum_{i=1}^n \Big( \big((Aw)_i - (A\hat{x})_i\big) + \big((A\hat{x})_i - y_i\big) \Big)^2 \]
\[ \sum_{i=1}^n \big((Aw)_i - y_i\big)^2 = \sum_{i=1}^n \big((Aw)_i - (A\hat{x})_i\big)^2 + \sum_{i=1}^n \big((A\hat{x})_i - y_i\big)^2 + \sum_{i=1}^n 2\big((Aw)_i - (A\hat{x})_i\big)\big((A\hat{x})_i - y_i\big) \]
As $\hat{x} = A^{-1}y \Rightarrow A\hat{x} = AA^{-1}y \Rightarrow A\hat{x} = y \Rightarrow (A\hat{x})_i = y_i$,
\[ \sum_{i=1}^n \big((Aw)_i - y_i\big)^2 = \sum_{i=1}^n \big((Aw)_i - (A\hat{x})_i\big)^2 + \sum_{i=1}^n \big((A\hat{x})_i - y_i\big)^2 + \sum_{i=1}^n 2\big((Aw)_i - (A\hat{x})_i\big)(y_i - y_i) \]
\[ \sum_{i=1}^n \big((Aw)_i - y_i\big)^2 = \sum_{i=1}^n \big((Aw)_i - (A\hat{x})_i\big)^2 + \sum_{i=1}^n \big((A\hat{x})_i - y_i\big)^2 \]
As, $\forall w \in \mathbb{R}^n$,
\[ \sum_{i=1}^n \big((Aw)_i - (A\hat{x})_i\big)^2 \ge 0 \]
\[ \sum_{i=1}^n \big((Aw)_i - y_i\big)^2 \ge \sum_{i=1}^n \big((A\hat{x})_i - y_i\big)^2 \]
\[ \|Aw - y\|^2 \ge \|A\hat{x} - y\|^2 \]

5 Question 5.
5.1 a
Given that there exists $g(x, y)$ such that:
\[ \frac{\partial}{\partial y}g(x,y) = f(x,y) \]
and the mixed second partial derivatives are continuous.

By this continuity, the conditions for Clairaut's theorem are satisfied. Hence,
\[ \frac{\partial^2}{\partial y\,\partial x}g(x,y) = \frac{\partial^2}{\partial x\,\partial y}g(x,y) \]
As $\frac{\partial}{\partial y}g(x,y) = f(x,y)$,
\[ \frac{\partial^2}{\partial y\,\partial x}g(x,y) = \frac{\partial}{\partial x}f(x,y) \]
Integrate both sides over $[a, b]$ with respect to $y$:
\[ \int_a^b \frac{\partial^2}{\partial y\,\partial x}g(x,y)\,dy = \int_a^b \frac{\partial}{\partial x}f(x,y)\,dy \]
\[ \int_a^b \frac{\partial}{\partial y}\left( \frac{\partial}{\partial x}g(x,y) \right) dy = \int_a^b \frac{\partial}{\partial x}f(x,y)\,dy \]
By the FTC,
\[ \int_a^b \frac{\partial}{\partial y}\left( \frac{\partial}{\partial x}g(x,y) \right) dy = \frac{\partial}{\partial x}g(x,b) - \frac{\partial}{\partial x}g(x,a) \]
Hence,
\[ \int_a^b \frac{\partial}{\partial x}f(x,y)\,dy = \frac{\partial}{\partial x}g(x,b) - \frac{\partial}{\partial x}g(x,a) \]
Now consider the integral
\[ \int_a^b f(x,y)\,dy \]
As $f(x,y) = \frac{\partial}{\partial y}g(x,y)$,
\[ \int_a^b f(x,y)\,dy = \int_a^b \frac{\partial}{\partial y}g(x,y)\,dy \]
By the FTC,
\[ \int_a^b \frac{\partial}{\partial y}g(x,y)\,dy = g(x,b) - g(x,a) \]
Hence,
\[ \int_a^b f(x,y)\,dy = g(x,b) - g(x,a) \]
Take $\frac{\partial}{\partial x}$ of both sides:
\[ \frac{\partial}{\partial x}\int_a^b f(x,y)\,dy = \frac{\partial}{\partial x}\big( g(x,b) - g(x,a) \big) \]
By linearity of the derivative,
\[ \frac{\partial}{\partial x}\int_a^b f(x,y)\,dy = \frac{\partial}{\partial x}g(x,b) - \frac{\partial}{\partial x}g(x,a) \]
As $\int_a^b f(x,y)\,dy$ is a function of $x$ alone, the partial derivative can be replaced with an ordinary one:
\[ \frac{d}{dx}\int_a^b f(x,y)\,dy = \frac{\partial}{\partial x}g(x,b) - \frac{\partial}{\partial x}g(x,a) \]
It was derived earlier that
\[ \int_a^b \frac{\partial}{\partial x}f(x,y)\,dy = \frac{\partial}{\partial x}g(x,b) - \frac{\partial}{\partial x}g(x,a) \]
Therefore,
\[ \frac{d}{dx}\int_a^b f(x,y)\,dy = \int_a^b \frac{\partial}{\partial x}f(x,y)\,dy \]

5.2 b
Show that
\[ \int_0^{\pi/2} \ln\!\left(\cos^2 x + \beta^2\sin^2 x\right) dx = \pi\ln\!\left(\frac{1+\beta}{2}\right) \]
Consider
\[ f(\beta) = \int_0^{\pi/2} \ln\!\left(\cos^2 x + \beta^2\sin^2 x\right) dx \]
By the result of 5.1 a,
\[ \frac{d}{d\beta}f(\beta) = \int_0^{\pi/2} \frac{\partial}{\partial\beta}\ln\!\left(\cos^2 x + \beta^2\sin^2 x\right) dx = \int_0^{\pi/2}\frac{2\beta\sin^2 x}{\beta^2\sin^2 x + \cos^2 x}\,dx = 2\beta\int_0^{\pi/2}\frac{\sin^2 x}{\beta^2\sin^2 x + \cos^2 x}\,dx \]
As
\[ \sin x = \frac{\tan x}{\sec x}, \qquad \cos x = \frac{1}{\sec x} \]
this can be rewritten as
\[ \frac{d}{d\beta}f(\beta) = 2\beta\int_0^{\pi/2}\frac{\frac{\tan^2 x}{\sec^2 x}}{\frac{\tan^2 x}{\sec^2 x}\beta^2 + \frac{1}{\sec^2 x}}\,dx = 2\beta\int_0^{\pi/2}\frac{\frac{\tan^2 x}{\sec^2 x}}{\frac{1}{\sec^2 x}\left(\beta^2\tan^2 x + 1\right)}\,dx = 2\beta\int_0^{\pi/2}\frac{\sec^2 x\,\tan^2 x}{\sec^2 x\left(\beta^2\tan^2 x + 1\right)}\,dx \]
As $\sec^2 x = \tan^2 x + 1$, this can be rewritten as
\[ \frac{d}{d\beta}f(\beta) = 2\beta\int_0^{\pi/2}\frac{\sec^2 x\,\tan^2 x}{\left(\tan^2 x + 1\right)\left(\beta^2\tan^2 x + 1\right)}\,dx \]
Now consider the indefinite integral
\[ \int\frac{\sec^2 x\,\tan^2 x}{\left(\tan^2 x + 1\right)\left(\beta^2\tan^2 x + 1\right)}\,dx \]
Set $u = \tan x$. Hence $du = \sec^2 x\,dx$, i.e. $dx = \dfrac{du}{\sec^2 x}$. Therefore the integral becomes
\[ \int\frac{u^2}{\left(u^2+1\right)\left(\beta^2 u^2 + 1\right)}\,du = \int\left(\frac{1}{\left(\beta^2-1\right)\left(u^2+1\right)} - \frac{1}{\left(\beta^2-1\right)\left(\beta^2 u^2+1\right)}\right)du \]
\[ = \frac{1}{\beta^2-1}\int\frac{1}{u^2+1}\,du - \frac{1}{\beta^2-1}\int\frac{1}{\beta^2 u^2+1}\,du = \frac{1}{\beta^2-1}\int\frac{1}{u^2+1}\,du - \frac{1}{\left(\beta^2-1\right)\beta^2}\int\frac{1}{u^2+\frac{1}{\beta^2}}\,du \]
As (up to constants of integration)
\[ \int\frac{1}{u^2+1}\,du = \arctan(u), \qquad \int\frac{1}{u^2+\frac{1}{\beta^2}}\,du = \beta\arctan(\beta u) \]
the integral is
\[ \frac{1}{\beta^2-1}\arctan(u) - \frac{1}{\left(\beta^2-1\right)\beta^2}\,\beta\arctan(\beta u) = \frac{\arctan(u)}{\beta^2-1} - \frac{\arctan(\beta u)}{\left(\beta^2-1\right)\beta} \]
As $u = \tan x$ and $\arctan(\tan x) = x$ on $[0, \pi/2)$,
\[ = \frac{x}{\beta^2-1} - \frac{\arctan(\beta\tan x)}{\left(\beta^2-1\right)\beta} \]
Hence,
\[ 2\beta\int\frac{\sec^2 x\,\tan^2 x}{\left(\tan^2 x+1\right)\left(\beta^2\tan^2 x+1\right)}\,dx = \frac{2\beta x}{\beta^2-1} - \frac{2\beta\arctan(\beta\tan x)}{\left(\beta^2-1\right)\beta} = \frac{2\beta x - 2\arctan(\beta\tan x)}{\beta^2-1} \]
Evaluating over $[0, \pi/2]$ (with the upper endpoint taken as a limit),
\[ 2\beta\int_0^{\pi/2}\frac{\sec^2 x\,\tan^2 x}{\left(\tan^2 x+1\right)\left(\beta^2\tan^2 x+1\right)}\,dx = \frac{\beta\pi - 2\lim_{x\to\frac{\pi}{2}^-}\arctan(\beta\tan x)}{\beta^2-1} + \frac{2\arctan(\beta\tan 0)}{\beta^2-1} \]
As
\[ \lim_{x\to\frac{\pi}{2}^-}\tan x = \infty \]
and, for $\beta > 0$,
\[ \lim_{x\to\frac{\pi}{2}^-}\beta\tan x = \infty, \qquad \lim_{x\to\infty}\arctan x = \frac{\pi}{2} \]
it follows that
\[ \lim_{x\to\frac{\pi}{2}^-}\arctan(\beta\tan x) = \frac{\pi}{2} \]
Hence,
\[ 2\beta\int_0^{\pi/2}\frac{\sec^2 x\,\tan^2 x}{\left(\tan^2 x+1\right)\left(\beta^2\tan^2 x+1\right)}\,dx = \frac{\beta\pi - \pi}{\beta^2-1} + \frac{2\arctan(0)}{\beta^2-1} = \frac{\beta\pi - \pi}{\beta^2-1} + \frac{2\cdot 0}{\beta^2-1} = \frac{\pi(\beta-1)}{\beta^2-1} \]
Using the difference of squares $\beta^2 - 1 = (\beta-1)(\beta+1)$,
\[ 2\beta\int_0^{\pi/2}\frac{\sec^2 x\,\tan^2 x}{\left(\tan^2 x+1\right)\left(\beta^2\tan^2 x+1\right)}\,dx = \frac{\pi(\beta-1)}{(\beta-1)(\beta+1)} = \frac{\pi}{\beta+1} \]
Hence,
\[ \frac{d}{d\beta}f(\beta) = \frac{\pi}{\beta+1} \]
\[ f(\beta) = \int\frac{\pi}{\beta+1}\,d\beta = \pi\ln\!\left(|\beta+1|\right) + C \]
As $\beta > 0$,
\[ f(\beta) = \pi\ln(\beta+1) + C \]
Consider $f(1)$:
\[ f(1) = \int_0^{\pi/2}\ln\!\left(\cos^2 x + \sin^2 x\right)dx \]
By the Pythagorean identity,
\[ f(1) = \int_0^{\pi/2}\ln(1)\,dx = \int_0^{\pi/2}0\,dx = 0 \]
Hence,
\[ f(1) = \pi\ln(1+1) + C = 0 \quad\Longrightarrow\quad C = -\pi\ln(2) \]
Therefore,
\[ f(\beta) = \pi\ln(\beta+1) - \pi\ln(2) = \pi\big(\ln(\beta+1) - \ln 2\big) = \pi\ln\!\left(\frac{\beta+1}{2}\right) \]
Hence it has been shown that
\[ \int_0^{\pi/2}\ln\!\left(\cos^2 x + \beta^2\sin^2 x\right)dx = \pi\ln\!\left(\frac{1+\beta}{2}\right) \]
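
A numerical spot-check of this identity (not part of the original submission; beta = 3 is an arbitrary choice):

% Compare the integral against the closed form for one value of beta.
beta = 3;                              % arbitrary beta > 0
lhs = integral(@(x) log(cos(x).^2 + beta^2*sin(x).^2), 0, pi/2);
rhs = pi*log((1 + beta)/2);
disp([lhs, rhs])                       % the two values should agree closely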
