Non-Linear Transformations in Linear Models

Empirical Methods for Finance

Prof. Robert Hill

Nova SBE

First Semester T1
2022-2023

Robert Hill Empirical Methods for Finance 1 / 18


Incorporating Non-Linearities into the Linear Regression Model
Linearity means that the model is linear in the parameters
Non-linear transformations of the original variables are permitted
The model
log (y ) = β0 + β1 log (x) + u
is still linear in the parameters!
Defining ỹ = log (y ), x̃ = log (x): ỹ = β0 + β1 x̃ + u
Similarly, also
y = β0 + β1 (1/x) + u
y = β0 + β1 x + β2 x^2 + u
are linear in the parameters
A nonlinear model is e.g.
y = 1/(β0 + β1 x) + u
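To illustrate that a non-linear transformation of a regressor leaves the model linear in the parameters, here is a minimal Python sketch (Python is used only for illustration; the slides themselves use STATA). The data-generating values 2 and 3, the sample size, and the helper `ols_simple` are assumptions made for this example, not part of the slides.

```python
import random


def ols_simple(x, y):
    """Return (intercept, slope) of the OLS fit of y on a single regressor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


random.seed(0)

# Hypothetical data-generating process: y = 2 + 3 * (1/x) + u
x = [random.uniform(0.5, 5.0) for _ in range(2000)]
y = [2.0 + 3.0 / xi + random.gauss(0.0, 0.1) for xi in x]

# The regressor is transformed first; OLS is then applied as usual,
# because the model is still linear in beta0 and beta1.
x_tilde = [1.0 / xi for xi in x]
b0_hat, b1_hat = ols_simple(x_tilde, y)
print(round(b0_hat, 2), round(b1_hat, 2))
```

The estimates should land close to the assumed values, since the only non-linearity sits in the regressor and not in the parameters.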



Log Transformations
When a nonlinear relationship exists between dependent and
independent variables, we may recover the desired linearity in the
parameters using log transformations of the variables
The nonlinear relationship of the un-logged variables is of course
preserved
Consider the model:
y = β0 x^β1 exp(u)

If we take the log on both sides we end up with a model that is linear in the
parameters log (β0 ) and β1
log (y ) = log (β0 ) + β1 log (x) + u
which we can estimate on our data using OLS
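As a rough illustration of this log transformation, the Python sketch below simulates data from the multiplicative model and estimates the logged model by OLS. The multiplicative constant 2, the exponent 0.5, the error standard deviation, and the helper `ols_simple` are all assumptions chosen for the example. Note that the estimated intercept corresponds to the log of the multiplicative constant.

```python
import math
import random


def ols_simple(x, y):
    """Return (intercept, slope) of the OLS fit of y on a single regressor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


random.seed(1)

# Hypothetical parameters of y = const * x**exponent * exp(u)
const, exponent = 2.0, 0.5
x = [random.uniform(1.0, 100.0) for _ in range(2000)]
y = [const * xi ** exponent * math.exp(random.gauss(0.0, 0.2)) for xi in x]

# Taking logs on both sides gives a model that OLS can estimate.
log_x = [math.log(v) for v in x]
log_y = [math.log(v) for v in y]
intercept_hat, slope_hat = ols_simple(log_x, log_y)

print("slope:", round(slope_hat, 3))
print("intercept:", round(intercept_hat, 3), "vs log(const):", round(math.log(const), 3))
```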



Example I/VII
Consider the regression of % urban population on per capita GNP (see footnote 1)

We estimate the model by OLS (STATA certainly won’t stop you)

An increase of per capita GNP by $1000 is associated with a 4 percentage point higher proportion of urban population
But ... are we estimating the right model? We are assuming a linear
relationship between urban and gnp

Footnote 1: You can reproduce the results in STATA by loading the data with
use https://stats.idre.ucla.edu/stat/stata/examples/sws5/nations
Example II/VII



Example III/VII

The distribution of per capita GNP is badly skewed, creating a
non-linear relationship between y and x
To mitigate the skew, we transform per capita GNP by taking the log
The distribution of log GNP looks much more symmetric
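The effect of the log on skew can be sketched numerically. The Python example below does not use the nations data set; it draws a hypothetical right-skewed variable from a log-normal distribution (parameters chosen arbitrarily) and compares the sample skewness before and after taking logs. The function `skewness` is a helper written for this illustration.

```python
import math
import random


def skewness(values):
    """Sample skewness: third central moment divided by variance**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


random.seed(2)

# Hypothetical stand-in for per capita GNP: strictly positive and right-skewed.
gnp = [random.lognormvariate(7.0, 1.0) for _ in range(5000)]
log_gnp = [math.log(v) for v in gnp]

skew_levels = skewness(gnp)
skew_logs = skewness(log_gnp)
print("skewness in levels:", round(skew_levels, 2))
print("skewness in logs:  ", round(skew_logs, 2))
```

In this simulated case the skewness in levels is large and positive, while the skewness in logs is close to zero. Real data will rarely be this clean.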



Example IV/VII

We estimate the model


urban = β0 + β1 log (gnp) + u

The assumption of linearity now appears to hold


Example V/VII

How do we interpret the coefficient?



Example VI/VII
How do we interpret the coefficient?
y = β0 + β1 log (x) + u

If ∆u = 0
∆y = β1 ∆log (x)

Using that (see footnote 2)
∆log (x) = log (x1 ) − log (x0 ) ≈ (x1 − x0 )/x0 = ∆x/x0 , for small ∆x

100 × ∆log (x) ≈ %∆x

We get that
∆y = (β1 /100) × 100 × ∆log (x) ≈ (β1 /100) %∆x

Footnote 2: Define z ≡ x1 /x0 − 1. Using that log (1 + z) ≈ z for small z yields
x1 /x0 − 1 = z ≈ log (1 + z) = log (x1 /x0 ) = log (x1 ) − log (x0 ) = ∆log (x)
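The quality of the approximation 100 × ∆log (x) ≈ %∆x can be checked numerically. The Python sketch below uses an arbitrary starting value x0 = 100 (an assumption for illustration) and compares the exact percentage change with 100 times the log difference for increasingly large changes.

```python
import math


def pct_change(x0, x1):
    """Exact percentage change from x0 to x1."""
    return 100.0 * (x1 - x0) / x0


def log_change_x100(x0, x1):
    """100 times the change in log(x), the approximation used in the slides."""
    return 100.0 * (math.log(x1) - math.log(x0))


x0 = 100.0  # arbitrary starting value
for x1 in (101.0, 105.0, 120.0, 150.0):
    exact = pct_change(x0, x1)
    approx = log_change_x100(x0, x1)
    print(f"x1={x1:6.1f}  exact %change={exact:6.2f}  100*dlog={approx:6.2f}")
```

The two numbers nearly coincide for a 1% change and drift apart as the change grows, which is why the percentage interpretation is stated for small ∆x.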
Example VII/VII

A 1% higher GNP is associated with a 0.147 percentage point higher proportion of urban population
A $1000 increase means a 40% growth in per capita GNP for Algeria but
only 7% for Canada
Assuming that the same proportionate (i.e., percentage) increase in GNP is
associated with a constant increase in urban population is more plausible than
assuming a constant relationship in levels
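A small Python sketch can make the Algeria versus Canada comparison concrete. It assumes β1 = 14.7, which is the value implied by the reported 0.147 percentage points per 1% of GNP, and uses only the growth rates of 40% and 7% quoted above. The function names are hypothetical helpers for this illustration.

```python
import math

BETA1 = 14.7  # implied by 0.147 percentage points per 1% higher GNP


def predicted_change_urban(beta1, growth):
    """Model prediction beta1 * dlog(gnp) for a proportional GNP growth rate."""
    return beta1 * math.log(1.0 + growth)


def approx_change_urban(beta1, growth):
    """Small-change shortcut (beta1 / 100) * %change in GNP."""
    return beta1 / 100.0 * (100.0 * growth)


for country, growth in (("Algeria", 0.40), ("Canada", 0.07)):
    exact = predicted_change_urban(BETA1, growth)
    approx = approx_change_urban(BETA1, growth)
    print(f"{country}: model {exact:.2f} pp, shortcut {approx:.2f} pp")
```

Under these assumptions the same $1000 increase is associated with a much larger change in the urban share for Algeria than for Canada. The shortcut also overstates the effect noticeably for the 40% change, because that change is no longer small.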



Let’s continue with our percentage vs levels intuition...
Consider now a traditional wage-education regression
wage = β0 + β1 educ + u

This formulation assumes that the change in wages from an additional year of
education is constant across all educational levels
E.g., increasing education from 5 to 6 years leads to the same $
increase in wages as increasing education from 11 to 12, or 15 to 16,
etc.
A much better assumption is that each year of education leads to a
constant percentage increase in wages
This intuition can be approximately captured by
log (wage) = β0 + β1 educ + u

How is that?
Log-level model

log (y ) = β0 + β1 x + u

If ∆u = 0
∆log (y ) = β1 ∆x

Knowing that
100 × ∆log (y ) ≈ %∆y

We get that
%∆y ≈ β1 × 100 × ∆x

For ∆x = 1, y increases by approximately 100β1 %

In the wage example, every additional year of education increases wage
by approximately 100β1 %
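Because the log-level interpretation rests on 100 × ∆log (y ) ≈ %∆y, it is an approximation. The exact percentage change implied by the model is 100 × (exp(β1 ∆x) − 1). The Python sketch below compares the two for a few hypothetical coefficient values (they are not estimates from the slides).

```python
import math


def approx_pct_effect(beta1, dx=1.0):
    """Approximate % change in y: 100 * beta1 * dx."""
    return 100.0 * beta1 * dx


def exact_pct_effect(beta1, dx=1.0):
    """Exact % change in y implied by log(y) = beta0 + beta1 * x."""
    return 100.0 * (math.exp(beta1 * dx) - 1.0)


for beta1 in (0.02, 0.08, 0.30):  # hypothetical coefficients
    print(
        f"beta1={beta1:.2f}  approx={approx_pct_effect(beta1):6.2f}%  "
        f"exact={exact_pct_effect(beta1):6.2f}%"
    )
```

For small coefficients the two agree closely. For larger ones the exact formula is the safer choice.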
Log-Log (constant elasticity) model

log (y ) = β0 + β1 log (x) + u

In this case, β1 already has a percentage interpretation

%∆y ≈ β1 %∆x

For each 1% increase in x, y increases by β1 %


β1 is the elasticity of y with respect to x
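The elasticity reading can also be checked numerically. With u held fixed, the log-log model implies that y is proportional to x^β1, so the exact percentage change in y for a given percentage change in x follows directly. The elasticities below are hypothetical values used for illustration only.

```python
def exact_pct_change_y(beta1, pct_change_x):
    """Exact % change in y when x changes by pct_change_x % and y is proportional to x**beta1."""
    return 100.0 * ((1.0 + pct_change_x / 100.0) ** beta1 - 1.0)


for beta1 in (0.5, 1.0, 2.0):  # hypothetical elasticities
    small = exact_pct_change_y(beta1, 1.0)
    large = exact_pct_change_y(beta1, 50.0)
    print(f"beta1={beta1}: 1% in x -> {small:.3f}% in y, 50% in x -> {large:.1f}% in y")
```

For a 1% change in x the response is very close to β1 %. For a 50% change the simple rule of β1 × 50% no longer holds unless β1 = 1.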



Overview



Usefulness of logs
Logs lead to coefficients with appealing interpretations
If y > 0, taking logs can mitigate (or even eliminate) skew and heteroskedasticity (we
will get back to this)
Logs of y or x can mitigate the influence of outliers by narrowing range
“Rules of thumb” of when to take logs:
▶ positive currency amounts (income, gdp, firm sales, market cap, etc.)
▶ variables with large integral values (population, school enrollment, total
number of employees, etc.)
and when not to take logs:
▶ variables measured in years, months, etc. (education, experience,
tenure, age, etc.)
▶ proportions



Limitations of logs
A limitation of logs is that they cannot be used if a variable takes on zero
or negative values
▶ If a variable y is non-negative but may be equal to zero, log (1 + y ) is
sometimes used
▶ Interpreting the estimates as if log (y ) were used is acceptable if the
data contains relatively few zeros
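A minimal Python sketch of the log (1 + y ) workaround follows. The values in `sales` are made up for illustration. `math.log1p` is the standard-library function that computes log (1 + y ).

```python
import math

# Hypothetical non-negative variable that contains a zero.
sales = [0.0, 3.0, 250.0, 12000.0]

# The plain log is undefined at zero: math.log raises ValueError there.
try:
    math.log(sales[0])
    log_of_zero_defined = True
except ValueError:
    log_of_zero_defined = False

# log(1 + y) is defined for every y >= 0 and maps zero to zero.
log1p_sales = [math.log1p(v) for v in sales]

print("log(0) defined:", log_of_zero_defined)
for v, t in zip(sales, log1p_sales):
    print(f"y={v:8.1f}  log(1+y)={t:.4f}")
```

For large values log (1 + y ) is almost identical to log (y ), while for small values the two differ substantially. This is one way to see why the usual interpretation is only reasonable when zeros and small values are rare.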



Remarks
For our purposes, only the natural logarithm is important
Both log (x) and ln(x) may be used for natural logs in books, papers
and these slides
In STATA, gen newvar = log(var) and gen newvar = ln(var)
are equivalent and both give the natural log
The exponential function is the inverse of the natural log function,
hence
log (y ) = β0 + β1 x
is equivalent to
y = exp(β0 + β1 x)



Quadratic Terms
We may not believe even the relationship between log wage and
education to be linear
The return on one more year of education may be different at high or
low levels of education
(Probably also discontinuities at years when you get a diploma)
Multiple regression allows us to include higher order terms (still linear!)
logwage = β0 + β1 educ + β2 educ^2 + u

β1 is no longer the ceteris paribus effect of educ on log wage
It makes no sense to say that β1 is the effect of x1 = educ keeping x2 = educ^2
fixed, since the two cannot vary independently
Rather, the marginal effect of education depends on β1 as well as on
β2 and the level of education:

∆logwage/∆educ = β1 + 2β2 educ

Positive β1 and β2 imply that wage is increasing in education and that
the % increments are larger when education is higher
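The marginal effect formula β1 + 2β2 educ can be evaluated at different education levels. The Python sketch below uses hypothetical coefficients β1 = 0.05 and β2 = 0.002 (not estimates from the slides) to show how the approximate percentage return to one more year changes with the level of education.

```python
def marginal_effect(beta1, beta2, educ):
    """Marginal effect of educ on log wage in the quadratic model: beta1 + 2*beta2*educ."""
    return beta1 + 2.0 * beta2 * educ


beta1, beta2 = 0.05, 0.002  # hypothetical coefficients, both positive

for educ in (8, 12, 16):
    effect = marginal_effect(beta1, beta2, educ)
    # Multiply by 100 for the approximate % change in wage per extra year.
    print(f"educ={educ:2d}  approx. return to one more year: {100.0 * effect:.1f}%")
```

With both coefficients positive, the return rises with education. A negative β2 would instead give diminishing returns.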