Differentials
Differentials are undoubtedly one of the most esoteric and least understood aspects of calculus. A differential is an infinitesimal change in a variable, which means the change is smaller in magnitude than any nonzero real number. From the perspective of the real numbers, the only infinitesimal is 0, but a change of 0 in a variable is not particularly useful or exciting. Are there any more useful infinitesimals out there? The answer is yes, but we need to look beyond the real numbers to find them.
An alternate (but equally valid) way of developing the theory of calculus uses infinitesimals instead of limits. In fact, this is the way calculus was initially developed by Newton and Leibniz (although it was not until many years later that this notion was given a rigorous mathematical basis, which is one reason we develop calculus using limits; the other reason is that limits have many more applications than just differentiation and integration). To develop calculus in this way, we need to work with the hyperreal number system, in which there are nonzero infinitesimals. Instead of looking at the rate of change of a function over an interval as the length of the interval approaches 0, we consider the rate of change of the function over an interval of infinitesimal length. Thus, when we write
$$f'(x) = \frac{df}{dx}$$
we are saying that the derivative of f is the ratio of an infinitesimal change in f over an infinitesimal change in x. We might also write
$$f'(x) = \frac{df}{dx} = \frac{f(x + dx) - f(x)}{dx}$$
where the final expression makes explicit what an infinitesimal change in f is. The notation for integration also stems from the notion of infinitesimals, where we sum up rectangles of infinitesimal width. We will not go further into this subject, because it is outside the scope of our studies. It is discussed merely because it is really where the intuition for differentials stems from, and because it is interesting.
In terms of real numbers, when we think of a differential it is essentially 0 (it is arbitrarily close to 0), yet in some situations involving differentials we arrive at finite results. We can arrive at a finite result in two cases (both illustrated below):

1. When we take the ratio of two differentials, the ratio of infinitesimal changes may no longer be infinitesimal.

2. When we sum an infinite number of infinitesimal quantities, the sum may no longer be infinitesimal.
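To illustrate the first case, consider f(x) = x² (an illustrative choice of our own). Both the numerator and the denominator below are infinitesimal, yet their ratio is not:

$$\frac{(x + dx)^2 - x^2}{dx} = \frac{2x\,dx + (dx)^2}{dx} = 2x + dx \approx 2x$$

where in the last step we discard the leftover infinitesimal dx, which is negligible next to the real number 2x. The second case is the familiar definite integral, an infinite sum of the infinitesimal areas f(x) dx.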
The above statements tell us that although infinitesimals are arbitrarily close to 0, they are certainly not 0: if they were, the first case would involve division by zero, and in the second case, no matter how many zeroes we add together, the result would still be zero. Above we represented the differential df as f(x + dx) − f(x), but we can represent it in another way as well (and in doing so avoid the difficulties and subtleties of calculating f(x + dx) − f(x)). Assuming that f'(x) exists, moving an infinitesimal distance along the curve f(x) is identical to moving an infinitesimal distance along the tangent line to f at the point x. This follows from the fact that a differentiable function is essentially linear in a small enough neighborhood around a given point, so by restricting that neighborhood to be small enough, we can make the tangent line approximation as accurate as we like. Thus, for a differentiable function,
$$df = f'(x)\,dx = \frac{df}{dx} \cdot dx$$
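To see what this tangent-line representation buys us, consider f(x) = x³ (again an illustrative choice of our own). Computing the change directly gives

$$f(x + dx) - f(x) = 3x^2\,dx + 3x\,(dx)^2 + (dx)^3$$

which drags along higher-order infinitesimal terms, whereas the tangent line discards them at once: $df = f'(x)\,dx = 3x^2\,dx$.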
Nearly any elementary calculus text will caution us not to simply cancel the differentials dx and write off the above equation as trivial. Why shouldn't we? A slightly more complicated expression, the chain rule, gives us a reason. The chain rule states that, for differentiable functions f(u) and u(x), we have
$$\frac{df}{dx} = \frac{df}{du} \cdot \frac{du}{dx}$$
One would think that this identity is trivial if we simply cancel the differentials du. Why can't we do this? Well, let's suppose that in general we could cancel the differentials. Now let's look at the functions
$$f(u) = \begin{cases} 1 & u \in \mathbb{Q} \\ 0 & u \notin \mathbb{Q} \end{cases}$$
and
$$u(x) = \begin{cases} 0 & x \in \mathbb{Q} \\ 1 & x \notin \mathbb{Q} \end{cases}$$
These in and of themselves seem to be very esoteric functions, so let's take a minute to think about what they really are. For the function f, whenever the input is a rational number, we get an output of 1. Otherwise, for an irrational input, the output is 0. This function is so badly broken up that we cannot draw it, but we can note that it is discontinuous everywhere (because between any two rationals there is an irrational, and between any two irrationals a rational). The second function u behaves similarly to f, except that it is 0 for rationals and 1 for irrationals. It is equally badly behaved. However, when we look at the composition f(u(x)), something remarkable happens. No matter the input, u outputs a rational number (either 0 or 1), so f(u(x)) = 1 for all x. This constant function is continuous everywhere and differentiable everywhere, with (f ∘ u)'(x) = 0 for all x. However, it does not satisfy the hypotheses of the chain rule, because neither of the component functions is differentiable anywhere. Thus, writing that
$$\frac{df}{du} \cdot \frac{du}{dx} = \frac{df}{dx}$$
by canceling the differentials du would be a grave mistake: neither derivative on the left-hand side exists, and the product of two things which do not exist clearly cannot equal the derivative of the composition, namely 0!
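As a quick sanity check of this counterexample, here is a minimal sketch in Python using SymPy's exact number types (the function names mirror the text; the particular test values are our own choice):

```python
import sympy as sp

def f(u):
    # Dirichlet-type function: 1 on rationals, 0 on irrationals
    return 1 if u.is_rational else 0

def u(x):
    # The companion function: 0 on rationals, 1 on irrationals
    return sp.Integer(0) if x.is_rational else sp.Integer(1)

# u always outputs a rational number (0 or 1), so the composition
# is constantly 1, even though f and u are discontinuous everywhere.
for value in [sp.Rational(3, 7), sp.sqrt(2), sp.pi, sp.Integer(0)]:
    assert f(u(value)) == 1
```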
The above example may feel very artificial, and in some ways it is. But it does show why there is so much caution about simply canceling the differentials: it is possible to conceive of circumstances where doing so would lead us astray. Now that we know we cannot always cancel differentials, the natural question to ask is: when can we cancel differentials? Well, we know that if f'(x) exists, then
$$df = f'(x)\,dx = \frac{df}{dx}\,dx$$
which tells us that as long as f'(x) exists in the above situation, we can effectively cancel the dx's. Similarly, with the chain rule, if we know that f'(u) and u'(x) exist (which implies that (f ∘ u)'(x) exists), then
$$\frac{df}{dx} = \frac{df}{du} \cdot \frac{du}{dx}$$

$$\frac{df}{dx} \cdot dx = \frac{df}{du} \cdot \frac{du}{dx} \cdot dx$$

$$\frac{df}{dx} \cdot dx = \frac{df}{du} \cdot du = df$$
which says there are multiple ways of representing the differential df. The above follows because
$$\frac{du}{dx} \cdot dx = du \qquad \text{and} \qquad \frac{df}{du} \cdot du = df$$
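For instance (an illustrative choice of our own), take f(u) = sin(u) and u(x) = x². Both derivatives exist everywhere, so the two representations of df agree:

$$df = \frac{df}{du}\,du = \cos(u)\,du = \cos(x^2) \cdot 2x\,dx = \frac{df}{dx}\,dx$$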
by definition. In most cases where the derivatives all exist, the differentials do indeed cancel. Nevertheless, we caution the reader to first be certain that the differentials really cancel, by deriving the relationship from

$$df = f'(x)\,dx$$
We will finish with a few examples.
Example 1 Find dy if y(x) = x · cos(x).
Solution Since y is a differentiable function of x, we can use the above formula, which tells us that

$$dy = (\cos(x) - x\sin(x))\,dx$$
Example 2 Find du if $u(y) = \sqrt{1 + y^2}$.
Solution Once again we have a differentiable function u(y), so we find that
$$du = \frac{1}{2\sqrt{1 + y^2}} \cdot 2y\,dy = \frac{y\,dy}{\sqrt{1 + y^2}}$$
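As a quick check of both examples, the differentials can be computed symbolically; here is a minimal SymPy sketch (the variable names are our own):

```python
import sympy as sp

x, y = sp.symbols('x y')

# Example 1: dy = y'(x) dx for y(x) = x*cos(x)
print(sp.diff(x * sp.cos(x), x))                   # -x*sin(x) + cos(x)

# Example 2: du = u'(y) dy for u(y) = sqrt(1 + y**2)
print(sp.simplify(sp.diff(sp.sqrt(1 + y**2), y)))  # y/sqrt(y**2 + 1)
```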