ECONOMETRICS
The Nature of Econometrics and Economic Data
Module Code: INS1064
Number of credits: 4
Pre-requisite(s): Theory of probability and mathematical statistics (MAT1004)
Teaching Language: English
Lecturer Information:
No. | Name | Title | Institution | Email | Phone
1. | Trần Quang Tuyến | Ph.D | VNU-IS | tuyenisvnu@[Link] | 0912474896
2. | Lê Văn Đạo | Master | VNU-IS | daoleisvnu@[Link] | 0394952064
Assessment items, methods, values and notes
1. Regular assessment: 20%
- Attendance and learning: 10%
- In-class and take-home exercises / assignments: 10%
  Notes: documents should be well presented and well written.
2. Midterm exam: 20%
- Method: completing and writing a take-home group assignment.
- Notes: choose one of the following three topics for the take-home group assignment:
  - Extension of the linear regression model;
  - Multiple linear regression model: ordinary least squares method, inference;
  - Multiple linear regression model: further issues on normality, functional form, and residual and multicollinearity analysis.
  Good writing (not exceeding 20 pages) and a good PowerPoint presentation (not exceeding 15 minutes).
3. Final exam: 60%
- Method: 2-hour written open-book exam.
Total: 100%
CONTENTS
Chapter 1. The nature and methodology of econometrics
Chapter 2. Simple linear regression (SLR)
Chapter 3. Multiple linear regression (MLR)
Chapter 4. Multiple linear regression model: Inference & asymptotics
Chapter 5. Further issues with multiple linear regression
Chapter 6: Regression models with dummy (binary) variables
Chapter 7: More on specification and data issues
Presenting take-home group assignment on the 8th week
Chapter 8: Basic regression analysis with time series data
Chapter 9: Further issues on using ordinary least squares method with time series
data
Chapter 10: Serial correlation and heteroskedasticity in time series regressions
Chapter 11: Carrying out an empirical project
Chapter 1: Nature and methodology of econometrics
1.1 The definition and purposes of econometrics
1.2. Methodology of econometrics
1.4. Types of economic data
1.5. Causality and the notion of ceteris paribus in
econometric analysis
Case study: Statistical relationship vs deterministic relationship
1.1. Definition and purposes of econometrics
Econometrics can be defined as the social science that applies economic
theory, mathematics, and statistics to quantify economic phenomena.
Common goals of econometric analysis
Testing economic theories and hypotheses
Investigating relationships between economic variables
Forecasting economic phenomena
Assessing government and business policies
Testing an economic theory: human capital theory; supply and demand
Estimating the relationships between economic variables
Estimating the relationships between socio-economic variables
Forecasting economic phenomena
Evaluating policies implemented by the government or firms
1.1. Definition and purposes of econometrics
What is nonexperimental data?
Econometrics has developed into a discipline separate from
mathematical statistics because it typically analyzes nonexperimental
data.
Nonexperimental data are often called observational or retrospective
data, which indicates that researchers are passive collectors of the
data.
Experimental data is collected in laboratory experiments (mostly in
natural sciences).
In social sciences, it is often impossible to conduct experiments.
• Conducting experiments in the social sciences is either extremely expensive or unethical.
1.2. Methodology of econometrics
Generally speaking, traditional econometric methodology proceeds along the following
steps:
Step 1. Formulating research questions/hypotheses
Step 2. Specification of a suitable economic model
Step 3. Turning the economic model into an econometric model
Step 4. Obtaining the data
Step 5. Estimating the parameters of the econometric model
Step 6. Testing the hypotheses
Step 7. Forecasting/policy implications
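As a rough illustration of Steps 3 to 6, the sketch below estimates a simple linear model by ordinary least squares and computes a t statistic for the slope. Everything in it is an assumption made for illustration: the data are simulated rather than collected, the variable names `educ` and `wage` are hypothetical, and the "true" intercept (2.0) and slope (0.5) are arbitrary choices.

```python
import math
import random


def ols_simple(x, y):
    """OLS for y = b0 + b1*x + u; returns (b0, b1, standard error of b1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    # Residual variance uses n - 2 degrees of freedom (two estimated parameters).
    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(ssr / (n - 2) / sxx)
    return b0, b1, se_b1


# Step 4 stand-in: simulated data (ASSUMPTION: true intercept 2.0, true slope 0.5).
rng = random.Random(0)
educ = [rng.uniform(8, 18) for _ in range(500)]
wage = [2.0 + 0.5 * e + rng.gauss(0, 1) for e in educ]

# Step 5: estimate the parameters of the econometric model.
b0_hat, b1_hat, se_b1 = ols_simple(educ, wage)

# Step 6: t statistic for the null hypothesis that the slope is zero.
t_stat = b1_hat / se_b1

print(f"b0_hat={b0_hat:.3f}, b1_hat={b1_hat:.3f}, t={t_stat:.1f}")
```

With real data, Step 4 would replace the simulated lists, and the estimates would no longer be checkable against known true values.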
Step 1: Formulating research questions/hypotheses
Examples:
Does job mismatch affect wage and job turnover?
Does greater household wealth make young children perform better?
Do mobile banner ads increase sales?
Step 2: Specifying a suitable economic model
This step is often skipped in empirical research.
The model may be a microeconomic or a macroeconomic model.
Such models are often based on optimizing behaviour or equilibrium.
Some models establish relationships between economic variables, e.g., FDI and technology transfer, or CSR and firm
performance.
Step 2: Specifying an economic model
Example 1:
The functional form was not specified (e.g., linear or non-linear)
The equation was proposed without a formal economic model
Step 2: Specifying an economic model
Example 2:
Step 3: Turning the economic model into an econometric model
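The equations for this step are not available in this text, so the pair below is only a hypothetical illustration of what the step involves, not the lecturer's own example. It assumes an economic model in which the wage depends on education, experience and training through an unspecified function f. The econometric model then fixes a functional form (linear here, by assumption) and adds an error term u for unobserved factors.

```latex
% Economic model (hypothetical): functional form f left unspecified
wage = f(educ, exper, training)

% Econometric model (assumed linear form, with error term u)
wage = \beta_0 + \beta_1\, educ + \beta_2\, exper + \beta_3\, training + u
```

The parameters \beta_0, ..., \beta_3 are what Step 5 estimates, and hypotheses in Step 6 are statements about them (for example, \beta_3 = 0).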
1.4. Types of economic data
Four types of economic data sets
Cross-sectional data
Time series data
Pooled cross sections
Panel/Longitudinal data
Note: The choice of econometric method depends on the type and nature of the data used.
Applying an inappropriate method may produce misleading results.
Table 1.1: Cross-sectional data set on households in Hoai Duc District
[Table not reproduced. Callouts: observation number (the 5th household); age of household head = 54; consumption per capita = 1106.67 thousand VND/month; indicator variable (1 = poor; 0 = non-poor)]
Table 1.2: Cross-sectional data on countries' GDP and education
Cross-sectional data sets
A random sample of individuals, families, enterprises,
cities, regions, nations, or other units of interest at a
given point in time or in a given period.
Cross-sectional observations are more or less
independent.
Pure random sampling is sometimes violated, e.g., when
respondents refuse to respond or when cluster
sampling is used.
Cross-sectional data are used mostly in applied
microeconomics.
Table 1.3 Time series data set on trade and tourism in Vietnam
(billion VND)
Year | Total | Retail | Accommodation and food services | Services and tourism
1990 19031.2 16747.4 2283.8 .
1991 33403.6 29183.3 4220.3 .
1992 51214.5 44778.3 6436.2 .
1993 67273.3 58424.4 8848.9 .
1994 93490 74091 11656 7743
1995 121160 94863 16957 9340
1996 145874 117547 18950 9377
1997 161899.7 131770.4 20523.5 9605.8
1998 185598.1 153780.6 21587.7 10229.8
1999 200923.7 166989 21672.1 12262.6
2000 220410.6 183864.7 23506.2 13039.7
2001 245315 200011 30535 14769
2002 280884 221569.7 35783.8 23530.5
2003 333809.3 262832.6 39382.3 31594.4
2004 398524.5 314618 45654.4 38252.1
2005 480293.5 373879.4 58429.3 47984.8
2006 596207.1 463144.1 71314.9 61748.1
2007 746159.4 574814.4 90101.1 81243.9
2008 1007213.5 781957.1 113983.2 111273.2
2009 1405864.6 1116477 158847.9 130540.1
2010 1677344.7 1254200 212065.2 211079.5
2011 2079523.5 1535600 260325.9 283597.6
2012 2369130.6 1740360 305651 323119.9
2013 2615203.6 1964667 315873.2 334663.9
2014 2916233.9 2189448 353306.5 373479
2015 3223202.6 2403723 399841.8 419637.6
2016 3546268.6 2648857 439892.3 457519.6
2017 3956599.1 2967485 488615.6 500498.8
2018 4393525.5 3308059 534168.5 551298
2019 4892114.39 3694560 595936.91 601617.59
2020 4847645.3 3815079 479715.67 552850.58
2021 4657066.28 3830560 379390.64 447115.82
Time series data
Observations of a single variable or of multiple variables over time.
Examples: GDP, inflation, stock prices, annual exchange rates,
agricultural sales, …
Such data are usually serially correlated (observations
are often not independent over time), which requires more
advanced econometric techniques.
The order of the observations contains important information.
Frequency: daily, weekly, monthly, quarterly, annually, …
Typical characteristics: trends and seasonality
Typical applications: applied macroeconomics and finance
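To make the point about serial correlation concrete, the sketch below computes the correlation between each year's value and the previous year's value, using the first value column of Table 1.3 for 2010 to 2021. The choice of years and the helper name `correlation` are assumptions made for illustration; a lag-1 correlation is only a simple descriptive check, not a formal test.

```python
import math

# First value column of Table 1.3 (total), 2010-2021, in billion VND.
total = [1677344.7, 2079523.5, 2369130.6, 2615203.6, 2916233.9, 3223202.6,
         3546268.6, 3956599.1, 4393525.5, 4892114.39, 4847645.3, 4657066.28]


def correlation(a, b):
    """Sample (Pearson) correlation between two equally long lists."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    sab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    saa = sum((ai - a_bar) ** 2 for ai in a)
    sbb = sum((bi - b_bar) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)


# Pair every year (from 2011 on) with the year before it.
lag1_autocorr = correlation(total[:-1], total[1:])
print(f"lag-1 autocorrelation: {lag1_autocorr:.3f}")
```

A value close to 1 indicates that neighbouring observations move together, which is typical of trending series and is why such observations cannot be treated as independent draws.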
Pooled cross sections
Two or more cross-sectional samples combined in
one data set.
The cross sections are sampled independently of each other.
Such data are often used for assessing policy
changes.
Example:
• Measure the effect of Hanoi's expansion on house
prices
• Random sample of house prices for the year 2007
• A new random sample of house prices for the year 2009
• Compare before/after (2007: before expansion, 2009: after
expansion)
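The before/after comparison can be sketched as follows. The prices are HYPOTHETICAL numbers invented for illustration (they are not survey data), and the names `prices_2007`, `prices_2009` and `y2009` are assumptions. The sketch shows that the difference in sample means equals the OLS coefficient on a year dummy in the pooled data set.

```python
# HYPOTHETICAL house prices (million VND per square metre); not real data.
prices_2007 = [12.0, 15.0, 11.0, 18.0, 14.0]        # sample before expansion
prices_2009 = [19.0, 21.0, 17.0, 25.0, 18.0, 20.0]  # new sample after expansion

mean_2007 = sum(prices_2007) / len(prices_2007)
mean_2009 = sum(prices_2009) / len(prices_2009)
mean_diff = mean_2009 - mean_2007

# Pool the two cross sections and add a dummy equal to 1 for 2009 observations.
price = prices_2007 + prices_2009
y2009 = [0] * len(prices_2007) + [1] * len(prices_2009)


def ols_slope(x, y):
    """Slope from an OLS regression of y on x with an intercept."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


dummy_coef = ols_slope(y2009, price)
print(mean_diff, dummy_coef)
```

The raw difference mixes the effect of the expansion with anything else that changed between the two years, so it is a starting point rather than a causal estimate.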
Table 1.4: Pooled cross sections on housing prices (Wooldridge, 2014)
[Table not reproduced; its rows are grouped into "before reform" and "after reform" observations]
Table 1.5: Two-year panel data on provincial development statistics
Panel or longitudinal data
The same cross-sectional units are followed over time.
Such data have both a cross-sectional and a time series
dimension.
Panel data enable researchers to eliminate time-invariant unobservables.
Panel data can be used for models with lagged dependent variables.
Example: Factors affecting provinces' economic growth
• Data on each province are observed in two or more years
• Time-invariant unobserved province characteristics (that may affect
economic growth) can be modeled and removed
• The effect of government policy on growth may exhibit a time lag
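One common way to remove a time-invariant unobservable with two periods of data is first-differencing, sketched below on simulated data. All numbers are assumptions for illustration: 2000 hypothetical units, a true slope of 1.0, and an unobserved unit effect that is correlated with the regressor.

```python
import random


def ols_slope(x, y):
    """Slope from an OLS regression of y on x with an intercept."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


rng = random.Random(1)
TRUE_BETA = 1.0  # ASSUMPTION: true effect of x on y
x1, x2, y1, y2 = [], [], [], []
for _ in range(2000):
    effect = rng.gauss(0, 1)          # time-invariant, unobserved unit effect
    x_t1 = effect + rng.gauss(0, 1)   # regressor is correlated with the effect
    x_t2 = effect + rng.gauss(0, 1)
    x1.append(x_t1)
    x2.append(x_t2)
    y1.append(TRUE_BETA * x_t1 + 2.0 * effect + rng.gauss(0, 0.5))
    y2.append(TRUE_BETA * x_t2 + 2.0 * effect + rng.gauss(0, 0.5))

# Pooled OLS ignores the unit effect, so the slope absorbs part of it.
pooled_slope = ols_slope(x1 + x2, y1 + y2)

# First differences: the unit effect is the same in both years and cancels.
dx = [later - earlier for earlier, later in zip(x1, x2)]
dy = [later - earlier for earlier, later in zip(y1, y2)]
fd_slope = ols_slope(dx, dy)

print(f"pooled: {pooled_slope:.2f}, first-difference: {fd_slope:.2f}")
```

In this setup the pooled slope overstates the true effect, while the first-difference slope should be close to it.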
1.5. Causality and the notion of ceteris paribus
Ceteris paribus is a Latin phrase meaning that other (relevant)
factors are equal or held constant.
The notion of the causal effect of X on Y:
"How does Y change if X changes while all other factors are held constant?"
Most economic questions are ceteris paribus questions.
In analyzing consumer behaviour (microeconomics), we have the law of demand:
"All other factors held constant, the higher the unit price of a good, the fewer
the number of units demanded by consumers and, consequently, sold by
firms" (Samuelson and Marks, 2009).
The goal of econometric analysis is often to infer that one variable (like education)
has a causal effect on another (such as worker productivity). If other factors are not held
fixed, then we cannot know the causal effect of education on productivity.
How can an experiment be constructed to infer the causal effect?
Causal effect of fertilizer on crop yield
"By how much will rice output increase if one increases the amount
of fertilizer applied to the field?"
It must be assumed that all other factors that affect rice yield, such
as land quality, temperature, rainfall, diseases, etc., are held fixed.
Experiment:
Select several one-acre plots of land, randomly assign different amounts
of fertilizer to the different plots, and then compare the output.
In this case, the experiment works because the amount of fertilizer
applied is unrelated to other factors affecting rice yields.
In other words, random assignment separates the effect of
fertilizer from the other factors that affect rice yields.
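The logic of the fertilizer experiment can be checked on simulated data. In the sketch below, all numbers are assumptions for illustration: the true effect of one unit of fertilizer is set to 0.8, land quality also affects yield but is not used in the regression, and the fertilizer dose is assigned at random.

```python
import random


def ols_slope(x, y):
    """Slope from an OLS regression of y on x with an intercept."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


rng = random.Random(2)
TRUE_EFFECT = 0.8  # ASSUMPTION: extra yield per unit of fertilizer
fert, rice_yield = [], []
for _ in range(4000):                  # 4000 hypothetical plots
    land_quality = rng.gauss(0, 1)     # affects yield, never observed
    dose = rng.choice([0, 1, 2, 3])    # randomly assigned fertilizer amount
    fert.append(dose)
    rice_yield.append(3.0 + TRUE_EFFECT * dose
                      + 1.5 * land_quality + rng.gauss(0, 0.5))

# Land quality is omitted, yet the slope should be close to the true effect,
# because random assignment makes the dose unrelated to land quality.
slope = ols_slope(fert, rice_yield)
print(f"estimated effect: {slope:.2f}")
```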
Experiment and ethical issue
Causal effect of education on productivity
In order to estimate the causal effect of education on productivity, all other factors
that influence wages, such as experience, innate ability, family background, etc., must be
held fixed.
Problem without random assignment: in nonexperimental or observational data,
education level is likely to be related to
unobservables, such as innate ability. People with higher ability, for example, tend
to have higher levels of education.
An experiment can make sure that education is unrelated to other factors that
affect wages. E.g., choose a group of children, randomly assign different levels of
education to them, and finally compare their wage outcomes.
Is this experiment unethical?
With nonexperimental data, discovering causality is very challenging,
but conducting such an experiment is infeasible because of ethical issues.
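The problem of education being related to unobserved ability can be illustrated by simulation. The sketch below is built entirely on assumed numbers: a true return to education of 0.5, an ability effect of 1.0 on wages, and observed education that rises with ability. It compares the slope from the "observational" data with the slope obtained when education is (hypothetically) assigned at random.

```python
import random


def ols_slope(x, y):
    """Slope from an OLS regression of y on x with an intercept."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


rng = random.Random(3)
TRUE_RETURN = 0.5  # ASSUMPTION: true wage effect of one more year of education
n = 5000
ability = [rng.gauss(0, 1) for _ in range(n)]  # unobserved by the analyst

# Observational world: people with higher ability obtain more education.
educ_obs = [12 + 2 * a + rng.gauss(0, 2) for a in ability]
# Experimental world: education is assigned independently of ability.
educ_rand = [12 + rng.gauss(0, 8 ** 0.5) for _ in range(n)]


def simulate_wage(educ):
    """Wage depends on education AND ability; ability is left out later."""
    return [1.0 + TRUE_RETURN * e + 1.0 * a + rng.gauss(0, 1)
            for e, a in zip(educ, ability)]


naive_slope = ols_slope(educ_obs, simulate_wage(educ_obs))
randomized_slope = ols_slope(educ_rand, simulate_wage(educ_rand))
print(f"observational: {naive_slope:.2f}, randomized: {randomized_slope:.2f}")
```

Under these assumptions the observational slope overstates the return to education because it also picks up the effect of ability, while the randomized slope should be close to 0.5.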
Exercises 1
Compare and contrast economic and econometric models.
What is your comment on this statement? "An econometric model is
always derived from a formal economic model."
Why does observational data often fail to guarantee the "ceteris paribus"
assumption?
In a study on the effect of fertilizer on rice productivity, more fertilizer is
used on less fertile plots, but we do not have data on land fertility. If we
found a positive link between fertilizer and rice yields, could we
convincingly conclude that fertilizer makes rice production more
productive?
Say you have to conduct an experiment on whether violent video games
cause school violence among students. Is it feasible? Why?
Can you infer a causal effect of violent video games on school violence if
your research is based on observational data? Explain why.
Name factors other than violent video games that can affect
school violence. Which of these factors are measurable, and which are
unmeasurable?