
Chapter 2: Basic forecasting tools
Instructor: Truong Buu Chau
Email: [email protected]

Faculty of Mathematics and Statistics

November 6, 2018

Truong Buu Chau C03043 - Chapter 2 November 6, 2018 1 / 70

Contents

1 Graphics
2 Numerical summaries
3 Some simple forecasting methods
4 Box-Cox transformations
5 Residual diagnostics
6 Evaluating forecast accuracy
7 Prediction intervals


1. Graphics
Time series in R: ’ts’ objects and ’ts’ function
A time series is stored in a ’ts’ object in R:
A list of numbers
Information about times those numbers were recorded
Example
Year Observation
2012 123
2013 39
2014 78
2015 52
2016 110

y <- ts(c(123,39,78,52,110), start=2012)


For observations that are more frequent than once per year,
add a ’frequency’ argument.
Example: monthly data stored as a numerical vector ’z’:

y <- ts(z, frequency=12, start=c(2003, 1))
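
As a small sketch of how a 'ts' object carries its time information, the following base-R snippet builds a monthly series and inspects it. The vector z holds made-up values for illustration only; it is not data from the slides.

```r
# Hypothetical monthly data: 24 made-up observations (not from the slides)
z <- c(12, 15, 14, 18, 21, 25, 27, 26, 22, 19, 15, 13,
       13, 16, 16, 19, 23, 26, 29, 27, 23, 20, 16, 14)

# Monthly series starting in January 2003
y <- ts(z, frequency = 12, start = c(2003, 1))

frequency(y)   # number of observations per year
start(y)       # first time point as c(year, period)
end(y)         # last time point as c(year, period)

# Extract a sub-period: July 2003 to June 2004
y_sub <- window(y, start = c(2003, 7), end = c(2004, 6))
length(y_sub)
```

The same window() call is used later in these slides to split a series into training and test sets.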


ts objects and ts function

ts(data, frequency, start)

Type of data   frequency             start example
Annual         1                     1995
Quarterly      4                     c(1995,2)
Monthly        12                    c(1995,9)
Daily          7 or 365.25           1 or c(1995,234)
Weekly         52.18                 c(1995,23)
Hourly         24 or 168 or 8,766    1
Half-hourly    48 or 336 or 17,532


Australian GDP
ausgdp <- ts(x, frequency=4, start=c(1971,3))

Class: “ts”
Print and plotting methods available.
ausgdp

##      Qtr1 Qtr2 Qtr3 Qtr4
## 1971           4612 4651
## 1972 4645 4615 4645 4722
## 1973 4780 4830 4887 4933
## 1974 4921 4875 4867 4905
## 1975 4938 4934 4942 4979
## 1976 5028 5079 5112 5127
## 1977 5130 5101 5072 5069
## 1978 5100 5166 5244 5312
## 1979 5349 5370 5388 5396

autoplot(ausgdp)


> library(fpp2)

This loads:
some data for use in examples and exercises
forecast package (for forecasting functions)
ggplot2 package (for graphics functions)
fma package (for lots of time series data)
expsmooth package (for more time series data)

For time series data, the obvious graph to start with is a time
plot. That is, the observations are plotted against the time of
observation, with consecutive observations joined by straight
lines.

autoplot(melsyd[,"Economy.Class"])

Weekly economy passenger load on Ansett Airlines.

The time plot immediately reveals some interesting features.
There was a period in 1989 when no passengers were carried; this was due to an industrial dispute.
There was a period of reduced load in 1992. This was due to a trial in which some economy class seats were replaced by business class seats.
A large increase in passenger load occurred in the second half of 1991.
There are some large dips in load around the start of each year. These are due to holiday effects.
There is a long-term fluctuation in the level of the series which increases during 1987, decreases in 1989, and increases again through 1990 and 1991.
There are some periods of missing observations.

Your turn
Create plots of the following time series: dole,
bricksq, lynx, goog
Use help() to find out about the data in each series.
For the last plot, modify the axis labels and title.


Time series patterns

A trend exists when there is a long-term increase or
decrease in the data. It does not have to be linear.
Sometimes we will refer to a trend as "changing
direction", when it might go from an increasing trend to
a decreasing trend.


A seasonal pattern occurs when a time series is
affected by seasonal factors (e.g., the quarter of the
year, the month, or day of the week).
Seasonality is always of a fixed and known frequency.

Trend

autoplot(a10) + ylab("$ million") + xlab("Year") +
  ggtitle("Antidiabetic drug sales")

Monthly sales of antidiabetic drugs in Australia.

The monthly sales of antidiabetic drugs above show seasonality which is induced partly by the change in the cost of the drugs at the end of the calendar year.
There is a clear and increasing trend.
There is also a strong seasonal pattern that increases in size as the level of the series increases.
The sudden drop at the start of each year is caused by a government subsidisation scheme that makes it cost-effective for patients to stockpile drugs at the end of the calendar year.
Any forecasts of this series would need to capture the seasonal pattern, and the fact that the trend is changing slowly.
Seasonal plots

ggseasonplot(a10, year.labels=TRUE,
year.labels.left=TRUE) + ylab("$ million") +
ggtitle("Seasonal plot: antidiabetic drug sales")

Seasonal plot of monthly antidiabetic drug sales in Australia.


Data plotted against the individual "seasons" in which the data were observed. (In this case a "season" is a month.)
Something like a time plot except that the data from each season are overlapped.
Enables the underlying seasonal pattern to be seen more clearly, and also allows any substantial departures from the seasonal pattern to be easily identified.

Seasonal polar plots

ggseasonplot(a10, polar=TRUE) + ylab("$ million")

Polar seasonal plot of monthly antidiabetic drug sales in Australia.
Seasonal subseries plots

ggsubseriesplot(a10) + ylab("$ million") +
  ggtitle("Subseries plot: antidiabetic drug sales")

Seasonal subseries plot of monthly antidiabetic drug sales in Australia.

Data for each season collected together in time plot as separate time series.
Enables the underlying seasonal pattern to be seen clearly, and changes in seasonality over time to be visualized.
The horizontal lines indicate the means for each month.

Quarterly Australian Beer Production

beer <- window(ausbeer,start=1992)

autoplot(beer)


ggseasonplot(beer,year.labels=TRUE)


ggsubseriesplot(beer)


Your turn
The arrivals data set comprises quarterly international
arrivals (in thousands) to Australia from Japan, New
Zealand, UK and the US.

Use autoplot() and ggseasonplot() to compare the differences between the arrivals from these four countries.
Can you identify any unusual observations?


Time series patterns

A cycle occurs when the data exhibit rises and falls
that are not of a fixed frequency.
These fluctuations are usually due to economic
conditions, and are often related to the "business cycle".
The duration of these fluctuations is usually at least 2
years.

Time series patterns: differences between seasonal and cyclic patterns
A seasonal pattern has constant length; a cyclic pattern has variable length.
The average length of a cycle is longer than the length of a seasonal pattern.
The magnitude of a cycle is more variable than the magnitude of a seasonal pattern.

The timing of peaks and troughs is predictable with seasonal data, but unpredictable in the long term with cyclic data.


autoplot(ustreas) +
ggtitle("US Treasury Bill Contracts") +
xlab("Day") + ylab("price")

[Figure: US Treasury Bill Contracts; x-axis: Day (0 to 100), y-axis: price]

Scatterplots

autoplot(elecdemand[,c("Demand","Temperature")], facets=TRUE) +
  ylab(" ") + xlab("Year: 2014") +
  ggtitle("Half-hourly electricity demand: Victoria, Australia")

The relationship between demand and temperature.

The graphs discussed so far are useful for visualising individual time series. It is also useful
to explore relationships between time series.

qplot(Temperature, Demand, data=as.data.frame(elecdemand)) +
  ylab("Demand (GW)") + xlab("Temperature (Celsius)")

The relationship between demand and temperature.


Your turn
Can you spot any seasonality, cyclicity and trend? What do
you learn about the series?

hsales
usdeaths
bricksq
sunspotarea
gasoline


autoplot(hsales) +
  ggtitle("Sales of new one-family houses, USA") +
  xlab("Year") + ylab("Total sales")


autoplot(bricksq) +
ggtitle("Australian clay brick production") +
xlab("Year") + ylab("million units")


2. Numerical summaries
Numerical summaries (summary statistics) are a collection
of measures that try to describe as much as possible about
the data set in as few numbers as possible. A summary
number for a data set is called a statistic.
Univariate statistics
Mean (Average)
Median
Variance
Standard deviation
Percentiles
Interquartile range (IQR)
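
A minimal sketch of these univariate statistics in base R, applied to the five observations from the earlier ’ts’ example (123, 39, 78, 52, 110). The percentile choice (25th and 75th) is only an illustration.

```r
# The five annual observations from the first 'ts' example
x <- c(123, 39, 78, 52, 110)

x_mean   <- mean(x)                     # mean (average)
x_median <- median(x)                   # median
x_var    <- var(x)                      # sample variance (divisor n - 1)
x_sd     <- sd(x)                       # standard deviation = sqrt(variance)
x_pct    <- quantile(x, c(0.25, 0.75))  # 25th and 75th percentiles
x_iqr    <- IQR(x)                      # interquartile range = 75th - 25th

c(mean = x_mean, median = x_median, var = x_var, sd = x_sd, IQR = x_iqr)
```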

Bivariate statistics

Covariance
$$\mathrm{Cov}_{XY} = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y}) \qquad (1)$$
Correlation
$$r_{XY} = \frac{\mathrm{Cov}_{XY}}{S_X S_Y} = \frac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i - \bar{X})^2}\,\sqrt{\sum_{i=1}^{n}(Y_i - \bar{Y})^2}} \qquad (2)$$

The value of $r_{XY}$ always lies in the interval −1 to +1.

Covariance and correlation: measure extent of linear relationship between two variables (Y and X).
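
As a quick check of the covariance and correlation definitions, the sketch below computes both by hand and compares them with R's built-in cov() and cor(). The two vectors are made-up illustrative values.

```r
# Made-up paired observations (illustrative only)
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)
n <- length(x)

# Covariance: sum of cross-products of deviations, divided by n - 1
cov_xy <- sum((x - mean(x)) * (y - mean(y))) / (n - 1)

# Correlation: covariance scaled by the two sample standard deviations
r_xy <- cov_xy / (sd(x) * sd(y))

# Both should agree with the built-in functions
all.equal(cov_xy, cov(x, y))
all.equal(r_xy, cor(x, y))
```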
Autocovariance and autocorrelation: measure the linear relationship between a time series $y_t$ and its own values lagged by $k$ periods.
$$c_k = \frac{1}{T}\sum_{t=k+1}^{T}(y_t - \bar{y})(y_{t-k} - \bar{y}) \qquad (3)$$
$$r_k = \frac{\sum_{t=k+1}^{T}(y_t - \bar{y})(y_{t-k} - \bar{y})}{\sum_{t=1}^{T}(y_t - \bar{y})^2} \qquad (4)$$
where $T$ is the length of the time series.
We measure the relationship between:
$y_t$ and $y_{t-1}$
$y_t$ and $y_{t-2}$
$y_t$ and $y_{t-3}$
etc.
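
The autocorrelation formula can be sketched directly in base R and compared with acf(). The helper name autocorr and the data are hypothetical, introduced only for this illustration.

```r
# Made-up series (illustrative only)
y <- c(5, 7, 6, 9, 8, 11, 10, 13, 12, 15)

# Hypothetical helper: lag-k autocorrelation r_k as defined in equation (4)
autocorr <- function(y, k) {
  T_len <- length(y)
  y_bar <- mean(y)
  # Numerator: cross-products of y_t and y_{t-k} for t = k+1, ..., T
  num <- sum((y[(k + 1):T_len] - y_bar) * (y[1:(T_len - k)] - y_bar))
  # Denominator: total sum of squared deviations
  den <- sum((y - y_bar)^2)
  num / den
}

r_manual  <- sapply(1:3, function(k) autocorr(y, k))
# acf() returns lag 0 first, so drop the first element
r_builtin <- acf(y, lag.max = 3, plot = FALSE)$acf[-1]

all.equal(r_manual, r_builtin)
```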
Autocorrelation
Example: Beer production
Results for first 9 lags for beer data:
r1      r2      r3      r4     r5      r6      r7      r8     r9
-0.102  -0.657  -0.060  0.869  -0.089  -0.635  -0.054  0.832  -0.108

r4 is higher than for the other lags. This is due to the seasonal pattern in the data: the peaks tend to be 4 quarters apart and the troughs tend to be 4 quarters apart.
r2 is more negative than for the other lags because troughs tend to be 2 quarters behind peaks.

White noise
Time series that show no autocorrelation are called white noise.

wn <- ts(rnorm(36))
autoplot(wn)


Sample autocorrelations for the series wn:
r1     r2    r3     r4    r5    r6     r7     r8    r9    r10
-0.23  0.04  -0.20  0.03  0.19  -0.01  -0.23  0.07  0.01  -0.01

[Figure: ACF plot of series wn; x-axis: Lag (1 to 15), y-axis: ACF]
We expect each autocorrelation to be close to zero.

Sampling distribution of $r_k$ for white noise data is asymptotically $N(0, 1/T)$.
We expect 95% of all $r_k$ for white noise to lie within $\pm 1.96/\sqrt{T}$. If this is not the case, the series is probably not white noise.
Common to plot lines at $\pm 1.96/\sqrt{T}$ when plotting ACF. These are the critical values.
In the wn example, all autocorrelation coefficients lie within these limits, confirming that the data are white noise. (More precisely, the data cannot be distinguished from white noise.)
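
The white noise check can be sketched in base R as follows. The seed is arbitrary, so the simulated values will not match the figures on the slides.

```r
set.seed(1)                 # arbitrary seed, only to make the sketch reproducible
T_len <- 36
wn <- ts(rnorm(T_len))      # simulated white noise, as in the slide example

# Sample autocorrelations for lags 1 to 15 (drop lag 0)
r <- acf(wn, lag.max = 15, plot = FALSE)$acf[-1]

# Critical values at +/- 1.96 / sqrt(T)
bound <- 1.96 / sqrt(T_len)

# Which lags fall inside the limits, and what share of them
inside <- abs(r) <= bound
mean(inside)
```

For genuine white noise, about 95% of the lags are expected to fall inside the limits, so an occasional spike outside them is not by itself evidence against white noise.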

Your turn
You can compute the daily changes in the Google stock
price using

dgoog <- diff(goog)

Does ’dgoog’ look like white noise?


3. Some simple forecasting methods
How would you forecast these data?


Average method
Forecast of all future values is equal to mean of historical data $\{y_1, \dots, y_T\}$.
Forecasts: $\hat{y}_{T+h|T} = \bar{y} = (y_1 + \cdots + y_T)/T$


Naïve method
Forecasts equal to last observed value.
Forecasts: $\hat{y}_{T+h|T} = y_T$.
Consequence of efficient market hypothesis.

Seasonal naïve method
Forecasts equal to last value from same season.
Forecasts: $\hat{y}_{T+h|T} = y_{T+h-m(k+1)}$, where $m$ = seasonal period and $k$ is the integer part of $(h-1)/m$.


Drift method
Forecasts equal to last value plus average change.
Forecasts:
$$\hat{y}_{T+h|T} = y_T + \frac{h}{T-1}\sum_{t=2}^{T}(y_t - y_{t-1}) = y_T + \frac{h}{T-1}(y_T - y_1).$$
Equivalent to extrapolating a line drawn between first and
last observations.
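
The four point-forecast formulas can be sketched by hand in base R. The quarterly series below is made up for illustration; in practice the forecast package functions named later in these slides (meanf, naive, snaive, rwf) would normally be used instead.

```r
# Made-up quarterly series (illustrative only): 3 years of data
y <- ts(c(10, 14, 12, 18,
          11, 15, 13, 19,
          12, 16, 14, 20), frequency = 4, start = c(2015, 1))

T_len <- length(y)
m     <- frequency(y)   # seasonal period
h     <- 1:6            # forecast horizons

# Average method: mean of all historical data
mean_fc <- rep(mean(y), length(h))

# Naive method: last observed value
naive_fc <- rep(y[T_len], length(h))

# Seasonal naive method: last value from the same season
k <- (h - 1) %/% m                       # integer part of (h - 1)/m
snaive_fc <- y[T_len + h - m * (k + 1)]

# Drift method: last value plus h times the average change
drift_fc <- y[T_len] + h * (y[T_len] - y[1]) / (T_len - 1)

cbind(h, mean_fc, naive_fc, snaive_fc, drift_fc)
```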
[Figure: Forecasts for quarterly beer production; x-axis: Year, y-axis: Megalitres; forecast methods shown: Mean, Naive, Seasonal naive]


Mean: meanf(y, h=20)

Naïve: naive(y, h=20)
Seasonal naïve: snaive(y, h=20)
Drift: rwf(y, drift=TRUE, h=20)


Your turn

Use these four functions to produce forecasts for goog and auscafe.
Plot the results using autoplot().


4. Box-Cox transformations
Variance stabilization
If the data show different variation at different levels of the series, then a transformation can be useful. Denote original observations as $y_1, \dots, y_n$ and transformed observations as $w_1, \dots, w_n$.
Mathematical transformations for stabilizing variation (listed in order of increasing strength):
Square root   $w_t = \sqrt{y_t}$
Cube root     $w_t = \sqrt[3]{y_t}$
Logarithm     $w_t = \log(y_t)$


Logarithms, in particular, are useful because they are more interpretable: changes
in a log value are relative (percent) changes on the original scale.

Each of these transformations is close to a member of the family of Box-Cox transformations:
$$w_t = \begin{cases} \log(y_t), & \lambda = 0; \\ (y_t^{\lambda} - 1)/\lambda, & \lambda \neq 0. \end{cases}$$


λ = 1: (No substantive transformation)
λ = 1/2: (Square root plus linear transformation)
λ = 0: (Natural logarithm)
λ = −1: (Inverse plus 1)


Back-transformation
We must reverse the transformation (or back-transform) to obtain forecasts on the original scale. The reverse Box-Cox transformations are given by
$$y_t = \begin{cases} \exp(w_t), & \lambda = 0; \\ (\lambda w_t + 1)^{1/\lambda}, & \lambda \neq 0. \end{cases}$$
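
A minimal base-R sketch of the transformation and its inverse, checking that the round trip recovers the original data. The function names box_cox and inv_box_cox are hypothetical, written just for this illustration; the forecast package provides BoxCox() and InvBoxCox() for real use.

```r
# Hypothetical helper: Box-Cox transformation for a given lambda
box_cox <- function(y, lambda) {
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}

# Hypothetical helper: reverse (back-) transformation
inv_box_cox <- function(w, lambda) {
  if (lambda == 0) exp(w) else (lambda * w + 1)^(1 / lambda)
}

# Made-up positive data (illustrative only)
y <- c(1, 2, 5, 10, 50)

# Round trip for the four lambda values listed on the slide
for (lambda in c(1, 0.5, 0, -1)) {
  w      <- box_cox(y, lambda)
  y_back <- inv_box_cox(w, lambda)
  stopifnot(isTRUE(all.equal(y, y_back)))
}

box_cox(y, 1)   # lambda = 1 only shifts the data down by 1
```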


5. Residual diagnostics
Fitted values
$\hat{y}_{t|t-1}$ is the forecast of $y_t$ based on observations $y_1, \dots, y_{t-1}$.
We call these “fitted values”.
Sometimes drop the subscript: $\hat{y}_t \equiv \hat{y}_{t|t-1}$.
Often not true forecasts since parameters are estimated on all data.
For example:
$\hat{y}_t = \bar{y}$ for average method.
$\hat{y}_t = y_{t-1} + (y_T - y_1)/(T-1)$ for drift method.

Forecasting residuals
Residuals in forecasting: difference between observed value and its fitted value: $e_t = y_t - \hat{y}_{t|t-1}$.


Assumptions
1 $\{e_t\}$ are uncorrelated. If they aren’t, then there is information left in the residuals that should be used in computing forecasts.
2 $\{e_t\}$ have mean zero. If they don’t, then forecasts are biased.

Useful properties (for prediction intervals)
3 $\{e_t\}$ have constant variance.
4 $\{e_t\}$ are normally distributed.

ACF of residuals
We assume that the residuals are white noise
(uncorrelated, mean zero, constant variance). If they
aren’t, then there is information left in the residuals
that should be used in computing forecasts.
So a standard residual diagnostic is to check the ACF of
the residuals of a forecasting method.
We expect these to look like white noise.
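
As a rough sketch of this diagnostic, the snippet below computes the residuals of the naïve method by hand on a made-up trending series, then checks their mean and ACF. The data are illustrative only.

```r
# Made-up series with an upward trend (illustrative only)
y <- c(50, 52, 51, 55, 54, 58, 57, 60, 62, 61, 65, 66)

# Naive fitted values: the previous observation (none for t = 1)
fitted_naive <- c(NA, y[-length(y)])

# Residuals e_t = y_t - fitted value, dropping the undefined first one
res <- (y - fitted_naive)[-1]

# Mean of residuals: a value far from zero suggests biased forecasts
mean(res)

# ACF of residuals against the +/- 1.96 / sqrt(T) limits
r     <- acf(res, lag.max = 5, plot = FALSE)$acf[-1]
bound <- 1.96 / sqrt(length(res))
any(abs(r) > bound)   # TRUE would flag autocorrelation left in the residuals
```

Here the residual mean is clearly positive because the series trends upward, which hints that the naïve method is biased for this series and that a method allowing for the trend (such as drift) may be more suitable.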


6. Evaluating forecast accuracy
Training and test sets

A model which fits the training data well will not necessarily
forecast well.
A perfect fit can always be obtained by using a model with
enough parameters.
Over-fitting a model to data is just as bad as failing to
identify a systematic pattern in the data.
The test set must not be used for any aspect of model
development or calculation of forecasts.
Forecast accuracy is based only on the test set.

Forecast "error": the difference between an observed value and its forecast.
$$e_{T+h} = y_{T+h} - \hat{y}_{T+h|T},$$
where the training data is given by $\{y_1, \dots, y_T\}$.

Unlike residuals, forecast errors on the test set involve multi-step forecasts.
These are true forecast errors as the test data is not used in computing $\hat{y}_{T+h|T}$.


Measures of forecast accuracy

[Figure: Forecasts for quarterly beer production; x-axis: Year, y-axis: Megalitres; forecast methods shown: Mean, Naive, SeasonalNaive]

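$y_{T+h}$ = $(T+h)$th observation, $h = 1, \dots, H$
$\hat{y}_{T+h|T}$ = its forecast based on data up to time $T$.
$e_{T+h} = y_{T+h} - \hat{y}_{T+h|T}$

$\mathrm{MAE} = \mathrm{mean}(|e_{T+h}|)$
$\mathrm{MSE} = \mathrm{mean}(e_{T+h}^2)$
$\mathrm{RMSE} = \sqrt{\mathrm{mean}(e_{T+h}^2)}$
$\mathrm{MAPE} = 100\,\mathrm{mean}(|e_{T+h}|/|y_{T+h}|)$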


MAE, MSE, RMSE are all scale dependent.

MAPE is scale independent but is only sensible if $y_t \gg 0$ for all $t$, and $y$ has a natural zero.
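
These accuracy measures can be computed by hand in a few lines of base R. The test-set values and forecasts below are made up purely to illustrate the arithmetic.

```r
# Made-up test-set observations and their forecasts (illustrative only)
actual <- c(420, 380, 400, 470)
fc     <- c(410, 395, 405, 450)

# Forecast errors e_{T+h} = y_{T+h} - forecast
e <- actual - fc

mae  <- mean(abs(e))                       # mean absolute error
mse  <- mean(e^2)                          # mean squared error
rmse <- sqrt(mse)                          # root mean squared error
mape <- 100 * mean(abs(e) / abs(actual))   # mean absolute percentage error

c(MAE = mae, MSE = mse, RMSE = rmse, MAPE = mape)
```

MAE, MSE and RMSE are in the units of the data (or their square), whereas MAPE is a percentage, which is why it can be compared across series with different scales.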
Mean Absolute Scaled Error

$$\mathrm{MASE} = \mathrm{mean}(|e_{T+h}|/Q)$$

where $Q$ is a stable measure of the scale of the time series $\{y_t\}$.

Proposed by Hyndman and Koehler (IJF, 2006).

For non-seasonal time series,
$$Q = (T-1)^{-1}\sum_{t=2}^{T}|y_t - y_{t-1}|$$
works well. Then MASE is equivalent to MAE relative to a naïve method.

For seasonal time series,
$$Q = (T-m)^{-1}\sum_{t=m+1}^{T}|y_t - y_{t-m}|$$
works well. Then MASE is equivalent to MAE relative to a seasonal naïve method.
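
A small sketch of MASE in base R, covering both scaling choices through the seasonal period m (m = 1 gives the non-seasonal version). The function name mase and all data values are hypothetical, used only for illustration.

```r
# Hypothetical helper: MASE of test-set errors, scaled by the in-sample
# mean absolute error of the (seasonal) naive method on the training data
mase <- function(errors, train, m = 1) {
  T_len <- length(train)
  # Q = (T - m)^(-1) * sum over t = m+1..T of |y_t - y_{t-m}|
  q <- sum(abs(train[(m + 1):T_len] - train[1:(T_len - m)])) / (T_len - m)
  mean(abs(errors) / q)
}

# Made-up quarterly training data and test-set forecast errors
train  <- c(10, 14, 12, 18, 11, 15, 13, 19)
errors <- c(1, -2, 0.5, -1.5)

mase_nonseasonal <- mase(errors, train, m = 1)  # scaled by naive method
mase_seasonal    <- mase(errors, train, m = 4)  # scaled by seasonal naive method

c(nonseasonal = mase_nonseasonal, seasonal = mase_seasonal)
```

A value below 1 means the forecasts have a smaller average absolute error than the in-sample one-step errors of the chosen naïve benchmark.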

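library(fpp2)  # provides ausbeer and the forecasting functions used below

# Training set: 1992 Q1 to 2007 Q4; test set: 2008 Q1 onwards
beer2 <- window(ausbeer, start=1992, end=c(2007,4))
beer3 <- window(ausbeer, start=2008)

# Fit each simple method on the training set, forecasting 10 quarters ahead
beerfit1 <- meanf(beer2, h=10)   # mean method
beerfit2 <- rwf(beer2, h=10)     # random walk without drift, i.e. the naive method
beerfit3 <- snaive(beer2, h=10)  # seasonal naive method

# Compare each set of forecasts against the held-out test set
accuracy(beerfit1, beer3)
accuracy(beerfit2, beer3)
accuracy(beerfit3, beer3)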

                        RMSE   MAE    MAPE   MASE
Mean method             38.45  34.83   8.28  2.44
Naïve method            62.69  57.40  14.18  4.01
Seasonal naïve method   14.31  13.40   3.17  0.94


Are the following statements true or false?

1 Good forecast methods should have normally distributed
residuals.
2 A model with small residuals will give good forecasts.
3 The best measure of forecast accuracy is MAPE.
4 If your model doesn’t forecast well, you should make it
more complicated.
5 Always choose the model with the best forecast
accuracy as measured on the test set.


7. Prediction intervals

A forecast $\hat{y}_{T+h|T}$ is (usually) the mean of the conditional distribution $y_{T+h} \mid y_1, \dots, y_T$.
A prediction interval gives a region within which we expect $y_{T+h}$ to lie with a specified probability.
Assuming forecast errors are normally distributed, then a 95% PI is
$$\hat{y}_{T+h|T} \pm 1.96\,\hat{\sigma}_h$$
where $\hat{\sigma}_h$ is the standard deviation of the h-step forecast distribution.

When $h = 1$, $\hat{\sigma}_h$ can be estimated from the residuals.


Assume residuals are normal, uncorrelated, sd = $\hat{\sigma}$:

Mean forecasts: $\hat{\sigma}_h = \hat{\sigma}\sqrt{1 + 1/T}$
Naïve forecasts: $\hat{\sigma}_h = \hat{\sigma}\sqrt{h}$
Seasonal naïve forecasts: $\hat{\sigma}_h = \hat{\sigma}\sqrt{k + 1}$
Drift forecasts: $\hat{\sigma}_h = \hat{\sigma}\sqrt{h(1 + h/T)}$

where $k$ is the integer part of $(h-1)/m$.

Note that when $h = 1$ and $T$ is large, these all give the same approximate value $\hat{\sigma}$.
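
As an illustration of the naïve case, the sketch below builds 95% prediction intervals by hand in base R. The data are made up, and estimating the residual standard deviation with sd() is an assumption of this sketch; the forecast package may estimate it slightly differently.

```r
# Made-up series (illustrative only)
y <- c(50, 52, 51, 55, 54, 58, 57, 60, 62, 61, 65, 66)
T_len <- length(y)

# Residuals of the naive method are the one-step changes
res <- diff(y)

# Assumption of this sketch: estimate sigma by the sample sd of the residuals
sigma <- sd(res)

h <- 1:4
sigma_h <- sigma * sqrt(h)            # naive forecasts: sigma_h = sigma * sqrt(h)

point <- rep(y[T_len], length(h))     # naive point forecasts
lower <- point - 1.96 * sigma_h       # lower 95% limit
upper <- point + 1.96 * sigma_h       # upper 95% limit

cbind(h, point, lower, upper)
```

The intervals widen with the horizon h, reflecting the growing uncertainty of forecasts further into the future.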

Truong Buu Chau C03043 - Chapter 2 November 6, 2018 70 / 70

Science Form1 Chapter 1
88% (17)
Science Form1 Chapter 1
56 pages
4540 17 PDF
No ratings yet
4540 17 PDF
274 pages
Itilv3 Sample Question
No ratings yet
Itilv3 Sample Question
49 pages
2-tsgraphics
No ratings yet
2-tsgraphics
85 pages
2 Tsgraphics
No ratings yet
2 Tsgraphics
73 pages
Introduction To Rs Time Series Facilities
No ratings yet
Introduction To Rs Time Series Facilities
31 pages
2 - The Forecaster's Toolbox-ClassNotes
No ratings yet
2 - The Forecaster's Toolbox-ClassNotes
25 pages
Demgn801 Business Analytics 76 150
No ratings yet
Demgn801 Business Analytics 76 150
75 pages
ARIMA Models - Part 1: 8.0 - Introduction
No ratings yet
ARIMA Models - Part 1: 8.0 - Introduction
20 pages
By Chris Chatfield, Published in 2004 by Chapman & Hall/CRC in The Texts in Statistical Science Series
No ratings yet
By Chris Chatfield, Published in 2004 by Chapman & Hall/CRC in The Texts in Statistical Science Series
19 pages
Timeseries - Analysis
No ratings yet
Timeseries - Analysis
37 pages
Ps1sol 6218
No ratings yet
Ps1sol 6218
25 pages
9 Arima
No ratings yet
9 Arima
200 pages
FM - Resumes
No ratings yet
FM - Resumes
18 pages
09 - Forecasting
No ratings yet
09 - Forecasting
76 pages
tsa - Time Series Analysis
No ratings yet
tsa - Time Series Analysis
45 pages
Stochastics Processes R-Studio
No ratings yet