Practicals Data

The document details the process of analyzing temperature data using R, including installing necessary packages, converting data to time series format, and performing various statistical tests to check for stationarity. It also covers the implementation of ARIMA models to forecast temperature trends and evaluates model residuals for normality. The analysis indicates an increasing trend in temperature data with seasonality and outliers present.


Practicals

kusi Appiah Horass

2024-03-26

Loading Packages
library(forecast)

## Warning: package 'forecast' was built under R version 4.1.3

## Registered S3 method overwritten by 'quantmod':


## method from
## as.zoo.data.frame zoo

library(tseries)

## Warning: package 'tseries' was built under R version 4.1.3

library(ggplot2)

## Warning: package 'ggplot2' was built under R version 4.1.3

library(seastests)

## Warning: package 'seastests' was built under R version 4.1.3

library(FinTS)

## Loading required package: zoo

## Warning: package 'zoo' was built under R version 4.1.3

##
## Attaching package: 'zoo'

## The following objects are masked from 'package:base':


##
## as.Date, as.Date.numeric

##
## Attaching package: 'FinTS'

## The following object is masked from 'package:forecast':


##
## Acf

Changing Temperature data to time series data


library(readxl)
## Warning: package 'readxl' was built under R version 4.1.3

Practicals_data <-
read_excel("C:/Users/SOFO/Downloads/Practicals_data.xls")
View(Practicals_data)
attach(Practicals_data)
Temperature

##  [1] 26.78 26.49 27.12 26.34 26.81 26.97 26.71 26.77 27.22 27.03 26.39 26.55
## [13] 27.15 26.64 26.27 26.53 26.79 26.48 26.53 26.76 27.02 27.08 27.32 27.34
## [25] 26.81 26.74 27.90 27.00 27.02 27.17 26.68 26.91 27.55 27.31 27.46 27.17
## [37] 27.41 27.59 27.26 27.14 27.43 27.86 27.23 27.15 27.43 27.61 27.81 27.43
## [49] 27.68 27.50 27.68 27.11 30.64 30.19 29.79 31.03 30.76 30.33 30.12

Temp = ts(Temperature, start = 1960, end = 2018, frequency = 1)


Temp

## Time Series:
## Start = 1960
## End = 2018
## Frequency = 1
##  [1] 26.78 26.49 27.12 26.34 26.81 26.97 26.71 26.77 27.22 27.03 26.39 26.55
## [13] 27.15 26.64 26.27 26.53 26.79 26.48 26.53 26.76 27.02 27.08 27.32 27.34
## [25] 26.81 26.74 27.90 27.00 27.02 27.17 26.68 26.91 27.55 27.31 27.46 27.17
## [37] 27.41 27.59 27.26 27.14 27.43 27.86 27.23 27.15 27.43 27.61 27.81 27.43
## [49] 27.68 27.50 27.68 27.11 30.64 30.19 29.79 31.03 30.76 30.33 30.12

Temp2 = ts(Temperature, start = c(1960,1), frequency = 12)


Temp2

##        Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov   Dec
## 1960 26.78 26.49 27.12 26.34 26.81 26.97 26.71 26.77 27.22 27.03 26.39 26.55
## 1961 27.15 26.64 26.27 26.53 26.79 26.48 26.53 26.76 27.02 27.08 27.32 27.34
## 1962 26.81 26.74 27.90 27.00 27.02 27.17 26.68 26.91 27.55 27.31 27.46 27.17
## 1963 27.41 27.59 27.26 27.14 27.43 27.86 27.23 27.15 27.43 27.61 27.81 27.43
## 1964 27.68 27.50 27.68 27.11 30.64 30.19 29.79 31.03 30.76 30.33 30.12
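
The two ts() calls above give the same 59 values two different time indexes. The sketch below uses placeholder values (the integers 1 to 59, an assumption made only for illustration) to show where each index ends: the annual version runs to 2018, while the monthly version stops in November 1964.

```r
# Placeholder values standing in for the 59 observations (illustration only).
n_obs <- 59

# Annual index: one observation per year starting in 1960.
annual <- ts(seq_len(n_obs), start = 1960, frequency = 1)

# Monthly index: twelve observations per year starting in January 1960.
monthly <- ts(seq_len(n_obs), start = c(1960, 1), frequency = 12)

annual_end <- end(annual)    # (year, period) of the last annual observation
monthly_end <- end(monthly)  # (year, month) of the last monthly observation

print(annual_end)
print(monthly_end)
```

The choice of frequency changes only the time index, not the data, so any seasonality test run on the monthly version treats consecutive years as consecutive months.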

Plotting Temperature time series data


plot(Temp, xlab = "Years", ylab = "Temperature",
     main = "Time Series Plot Of Temperature Time Series Data")

The plot shows an increasing trend, an outlier (a sharp jump near the end of the series), and possible seasonality.


autoplot(Temp, col = "blue", lwd = 1.5,
         main = "Time Series Plot Of Temperature Time Series Data") +
  xlab("Years") + ylab("Temperature")

Again, the plot shows an increasing trend, an outlier, and possible seasonality.

Testing whether the Temperature time series is stationary

adf.test(Temp)

##
## Augmented Dickey-Fuller Test
##
## data: Temp
## Dickey-Fuller = -1.1471, Lag order = 3, p-value = 0.9075
## alternative hypothesis: stationary

pp.test(Temp)

##
## Phillips-Perron Unit Root Test
##
## data: Temp
## Dickey-Fuller Z(alpha) = -15.405, Truncation lag parameter = 3, p-value = 0.1914
## alternative hypothesis: stationary

kpss.test(Temp)

## Warning in kpss.test(Temp): p-value smaller than printed p-value


##
## KPSS Test for Level Stationarity
##
## data: Temp
## KPSS Level = 0.96866, Truncation lag parameter = 3, p-value = 0.01

Acf(Temp)

The ACF decays exponentially.


Pacf(Temp)
The PACF cuts off after lag 1.
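
An exponentially decaying ACF together with a PACF that cuts off after lag 1 is the textbook signature of an AR(1) process. As a rough illustration, the sketch below computes the theoretical ACF and PACF of an AR(1) with a hypothetical coefficient of 0.9 (this value is an assumption, not an estimate from the temperature data).

```r
# Hypothetical AR(1) coefficient, chosen only for illustration.
phi <- 0.9

# Theoretical autocorrelations at lags 0..5: these equal phi^k.
theo_acf <- ARMAacf(ar = phi, lag.max = 5)

# Theoretical partial autocorrelations at lags 1..5:
# phi at lag 1 and zero afterwards.
theo_pacf <- ARMAacf(ar = phi, lag.max = 5, pacf = TRUE)

print(round(theo_acf, 4))
print(round(theo_pacf, 4))
```

Sample ACF and PACF plots from real data are noisier than these theoretical values, so the pattern is a guide to a candidate order rather than a proof.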
welch(Temp2,freq = 12)

## Test used: Kruskall Wallis


##
## Test statistic: 1.45
## P-value: 0.2335604

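From the unit root tests, the Temperature time series is not stationary: the ADF test (p = 0.9075) and the PP test (p = 0.1914) fail to reject a unit root, and the KPSS test (p < 0.01) rejects level stationarity. The seasonality test on Temp2 (p = 0.2336) gives no evidence of seasonality.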
ndiffs(Temp)

## [1] 1

Temp_1stD = diff(Temp)
Temp_1stD

## Time Series:
## Start = 1961
## End = 2018
## Frequency = 1
##  [1] -0.29  0.63 -0.78  0.47  0.16 -0.26  0.06  0.45 -0.19 -0.64  0.16  0.60
## [13] -0.51 -0.37  0.26  0.26 -0.31  0.05  0.23  0.26  0.06  0.24  0.02 -0.53
## [25] -0.07  1.16 -0.90  0.02  0.15 -0.49  0.23  0.64 -0.24  0.15 -0.29  0.24
## [37]  0.18 -0.33 -0.12  0.29  0.43 -0.63 -0.08  0.28  0.18  0.20 -0.38  0.25
## [49] -0.18  0.18 -0.57  3.53 -0.45 -0.40  1.24 -0.27 -0.43 -0.21
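
First differencing replaces each value with its change from the previous year. The sketch below reproduces the first four differences by hand from the first five temperature values printed earlier and checks them against diff(); the variable names are illustrative.

```r
# First five temperature values from the series printed above.
temp_head <- c(26.78, 26.49, 27.12, 26.34, 26.81)

# Manual first difference: x[t] - x[t-1].
manual_diff <- temp_head[-1] - temp_head[-length(temp_head)]

# Built-in equivalent.
builtin_diff <- diff(temp_head)

print(round(manual_diff, 2))
print(isTRUE(all.equal(manual_diff, builtin_diff)))
```

The differenced series is one observation shorter than the original, which is why Temp_1stD starts in 1961.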

acf(Temp_1stD)

Pacf(Temp_1stD)
m1 = Arima(Temp, order = c(1,1,0))
m1

## Series: Temp
## ARIMA(1,1,0)
##
## Coefficients:
## ar1
## -0.3023
## s.e. 0.1241
##
## sigma^2 = 0.3637: log likelihood = -52.51
## AIC=109.02 AICc=109.24 BIC=113.14

m2 = Arima(Temp, order = c(0,1,1))


m2

## Series: Temp
## ARIMA(0,1,1)
##
## Coefficients:
## ma1
## -0.3911
## s.e. 0.1140
##
## sigma^2 = 0.3478: log likelihood = -51.25
## AIC=106.5 AICc=106.72 BIC=110.62
m3 = Arima(Temp, order = c(1,2,0))
m3

## Series: Temp
## ARIMA(1,2,0)
##
## Coefficients:
## ar1
## -0.5470
## s.e. 0.1096
##
## sigma^2 = 0.74: log likelihood = -71.97
## AIC=147.94 AICc=148.16 BIC=152.02

# include.mean has no effect on a differenced series (d > 0), so this fit matches m1
m4 = Arima(Temp, order = c(1,1,0), include.mean = TRUE)


m4

## Series: Temp
## ARIMA(1,1,0)
##
## Coefficients:
## ar1
## -0.3023
## s.e. 0.1241
##
## sigma^2 = 0.3637: log likelihood = -52.51
## AIC=109.02 AICc=109.24 BIC=113.14

# For a differenced series, include.constant = TRUE adds a drift term (same fit as m6)
m5 = Arima(Temp, order = c(1,1,0), include.constant = TRUE)


m5

## Series: Temp
## ARIMA(1,1,0) with drift
##
## Coefficients:
## ar1 drift
## -0.3156 0.0602
## s.e. 0.1237 0.0594
##
## sigma^2 = 0.3638: log likelihood = -52.01
## AIC=110.01 AICc=110.46 BIC=116.19

m6 = Arima(Temp, order = c(1,1,0), include.drift = TRUE)


m6

## Series: Temp
## ARIMA(1,1,0) with drift
##
## Coefficients:
## ar1 drift
## -0.3156 0.0602
## s.e. 0.1237 0.0594
##
## sigma^2 = 0.3638: log likelihood = -52.01
## AIC=110.01 AICc=110.46 BIC=116.19

m7 = Arima(Temp, order = c(1,1,0), method = "ML")


m7

## Series: Temp
## ARIMA(1,1,0)
##
## Coefficients:
## ar1
## -0.3023
## s.e. 0.1241
##
## sigma^2 = 0.3637: log likelihood = -52.51
## AIC=109.02 AICc=109.24 BIC=113.14

m8 = Arima(Temp, order = c(1,1,0), method = "CSS")


m8

## Series: Temp
## ARIMA(1,1,0)
##
## Coefficients:
## ar1
## -0.3064
## s.e. 0.1249
##
## sigma^2 = 0.3623: log likelihood = -52.86

m9 = Arima(Temp, order = c(1,1,0), method = "CSS-ML")


m9

## Series: Temp
## ARIMA(1,1,0)
##
## Coefficients:
## ar1
## -0.3023
## s.e. 0.1241
##
## sigma^2 = 0.3637: log likelihood = -52.51
## AIC=109.02 AICc=109.24 BIC=113.14

m10 = auto.arima(Temp, stepwise = FALSE, approximation = FALSE)


m10

## Series: Temp
## ARIMA(2,1,0)
##
## Coefficients:
## ar1 ar2
## -0.3926 -0.2928
## s.e. 0.1250 0.1248
##
## sigma^2 = 0.3373: log likelihood = -49.9
## AIC=105.8 AICc=106.24 BIC=111.98
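
The candidate models above can be compared side by side on AIC. The following is a minimal sketch using base R's arima() rather than forecast::Arima(), on the 59 temperature values printed earlier; the object names are illustrative and only three of the candidate orders are included.

```r
# Temperature values copied from the output earlier in this document.
temp <- ts(c(
  26.78, 26.49, 27.12, 26.34, 26.81, 26.97, 26.71, 26.77, 27.22, 27.03, 26.39, 26.55,
  27.15, 26.64, 26.27, 26.53, 26.79, 26.48, 26.53, 26.76, 27.02, 27.08, 27.32, 27.34,
  26.81, 26.74, 27.90, 27.00, 27.02, 27.17, 26.68, 26.91, 27.55, 27.31, 27.46, 27.17,
  27.41, 27.59, 27.26, 27.14, 27.43, 27.86, 27.23, 27.15, 27.43, 27.61, 27.81, 27.43,
  27.68, 27.50, 27.68, 27.11, 30.64, 30.19, 29.79, 31.03, 30.76, 30.33, 30.12
), start = 1960)

# Candidate (p, d, q) orders taken from the fits above.
orders <- list(
  "ARIMA(1,1,0)" = c(1, 1, 0),
  "ARIMA(0,1,1)" = c(0, 1, 1),
  "ARIMA(2,1,0)" = c(2, 1, 0)
)

# Fit each model and collect its AIC.
aic_values <- sapply(orders, function(ord) arima(temp, order = ord)$aic)

print(round(aic_values, 2))

# Name of the model with the smallest AIC.
best_model <- names(which.min(aic_values))
print(best_model)
```

AIC is only comparable between models fitted to the same data with the same order of differencing, so a d = 2 model such as m3 is better judged separately.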

Checking the residuals of the selected model (checkresiduals from the forecast package)


checkresiduals(m10)

##
## Ljung-Box test
##
## data: Residuals from ARIMA(2,1,0)
## Q* = 2.6862, df = 8, p-value = 0.9525
##
## Model df: 2. Total lags used: 10
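
The Ljung-Box test reported above can also be run directly with base R. The sketch below does this on a simulated ARIMA(2,1,0) series; the seed, the coefficients and the series length are assumptions chosen for illustration, not values from the temperature data.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Simulated ARIMA(2,1,0) series with hypothetical coefficients.
x <- arima.sim(model = list(order = c(2, 1, 0), ar = c(-0.4, -0.3)), n = 100)

# Fit the same order that was simulated.
fit <- arima(x, order = c(2, 1, 0))

# Ljung-Box test on the residuals: 10 lags, minus 2 estimated AR
# coefficients, gives 8 degrees of freedom.
lb <- Box.test(residuals(fit), lag = 10, type = "Ljung-Box", fitdf = 2)
print(lb)
```

A large p-value means the test finds no evidence of remaining autocorrelation in the residuals, which is the reading of the p-value of 0.9525 reported for m10.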

Test for Normality


shapiro.test(Temp)

##
## Shapiro-Wilk normality test
##
## data: Temp
## W = 0.71696, p-value = 2.355e-09
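
The test above is applied to the raw series. A common alternative is to apply it to the residuals of the fitted model, since normality assumptions in ARIMA modelling concern the errors. A minimal sketch on simulated data follows; the seed and coefficients are assumptions chosen for illustration.

```r
set.seed(2)  # arbitrary seed, for reproducibility only

# Simulated ARIMA(2,1,0) series with hypothetical coefficients.
x <- arima.sim(model = list(order = c(2, 1, 0), ar = c(-0.4, -0.3)), n = 100)
fit <- arima(x, order = c(2, 1, 0))

# Shapiro-Wilk test on the raw series and on the model residuals.
sw_raw <- shapiro.test(as.numeric(x))
sw_res <- shapiro.test(as.numeric(residuals(fit)))

print(sw_raw$p.value)
print(sw_res$p.value)
```

A trending series can fail the test on its raw values even when the model errors are normal, so the two results need not agree.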
Simulation of Models
AR(p) Process
AR(1)
When no seed is set, the simulated values change on every run.
ar_1 = arima.sim(model = list(order = c(1,0,0), ar = 0.9), n = 50)
ar_1

## Time Series:
## Start = 1
## End = 50
## Frequency = 1
## [1] 3.6240468 3.1619944 2.7262616 1.4617361 0.1594062 -
0.9990428
## [7] 0.5679011 0.9937917 2.9393337 4.1729345 3.6752258
1.5602923
## [13] 0.3348198 1.3264967 2.3730533 0.5665732 1.3322968
2.6560589
## [19] 1.6441461 2.0083359 1.8210696 1.5865592 0.8130239 -
0.2376462
## [25] -0.3750907 -0.5328531 -0.9282527 0.7978363 0.6974899
1.2933855
## [31] 1.5471802 0.5906372 2.5771617 1.7228949 1.7359278
1.2451136
## [37] 2.2345616 1.1835775 3.3623689 1.1836543 0.9794413
1.1100810
## [43] 1.9725491 1.2638786 2.4679455 3.9608947 2.8370739
3.6160478
## [49] 3.0430844 2.2972822

ar_11 = arima.sim(model = list(order = c(1,0,0), ar = 0.9), n =50)


ar_11

## Time Series:
## Start = 1
## End = 50
## Frequency = 1
## [1] -2.07745229 -5.60444701 -5.93985183 -4.05947355 -3.53785206 -
4.55698747
## [7] -3.61119227 -3.25885168 -3.94758501 -3.18181166 -3.93596061 -
1.29998166
## [13] -1.12935406 -0.96655326 0.05037913 0.07708215 -0.38259663 -
1.24786363
## [19] -1.87579980 -0.49023954 1.27825139 1.71838019 1.91797668
1.99374333
## [25] 1.56751293 1.11629500 2.03621292 1.53445771 3.31016581
1.60252581
## [31] 1.49800681 0.95611768 1.74560006 1.06917817 2.00181333
1.13355331
## [37] 1.16296254 1.88760475 2.96797843 1.23129843 1.57862179
0.15186925
## [43] -0.61770980 -0.84314143 -1.23726260 -0.82102553 -1.88162012 -
1.68363074
## [49] -2.93895795 -1.86882311

From the outputs, we can see that the values of ar_11 differ from those of ar_1.

Setting a seed so that the simulated values are reproducible.


set.seed(123)
ar1 = arima.sim(model = list(order = c(1,0,0), ar = 0.9), n = 50)
ar1

## Time Series:
## Start = 1
## End = 50
## Frequency = 1
## [1] 1.40975469 1.48472079 1.71588819 1.04197592 0.60457094 -
0.47446153
## [7] -1.49880661 -1.04539730 -0.49264780 -0.39037879 0.57092656
2.56391859
## [13] 1.81649556 -0.67432287 0.39884794 -0.35023761 -1.00322247
0.12267115
## [19] -0.17436897 -1.37764979 -1.05858133 -1.09161456 -0.97668892 -
0.49373963
## [25] -0.81502569 -0.08914658 -0.30071848 0.06113533 1.15186081
1.47185622
## [31] 0.99873901 2.04767273 2.83640931 3.10116534 3.02978054
2.09889641
## [37] 3.24965922 2.32443371 4.27932333 5.38400163 4.60990110
3.12249009
## [43] 2.09983452 2.14673478 1.68536942 1.16928988 0.10074232
0.04564037
## [49] -0.74382814 -2.33738726

set.seed(123)
ar1n52 = arima.sim(model = list(order = c(1,0,0), ar = 0.9),n = 52)
ar1n52

## Time Series:
## Start = 1
## End = 52
## Frequency = 1
## [1] 1.40975469 1.48472079 1.71588819 1.04197592 0.60457094 -
0.47446153
## [7] -1.49880661 -1.04539730 -0.49264780 -0.39037879 0.57092656
2.56391859
## [13] 1.81649556 -0.67432287 0.39884794 -0.35023761 -1.00322247
0.12267115
## [19] -0.17436897 -1.37764979 -1.05858133 -1.09161456 -0.97668892 -
0.49373963
## [25] -0.81502569 -0.08914658 -0.30071848 0.06113533 1.15186081
1.47185622
## [31] 0.99873901 2.04767273 2.83640931 3.10116534 3.02978054
2.09889641
## [37] 3.24965922 2.32443371 4.27932333 5.38400163 4.60990110
3.12249009
## [43] 2.09983452 2.14673478 1.68536942 1.16928988 0.10074232
0.04564037
## [49] -0.74382814 -2.33738726 -2.48387506 -1.31649094
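
The effect of the seed can be checked programmatically instead of by comparing printed values. The sketch below wraps the simulation in a small helper (simulate_ar1 is a hypothetical name introduced here) and compares runs with the same and with different seeds.

```r
# Hypothetical helper: set the seed, then simulate an AR(1) with ar = 0.9.
simulate_ar1 <- function(seed, n = 50) {
  set.seed(seed)
  arima.sim(model = list(order = c(1, 0, 0), ar = 0.9), n = n)
}

run_a <- simulate_ar1(123)
run_b <- simulate_ar1(123)  # same seed as run_a
run_c <- simulate_ar1(456)  # different seed (456 is arbitrary)

same_seed_identical <- identical(run_a, run_b)
diff_seed_identical <- identical(run_a, run_c)

print(same_seed_identical)
print(diff_seed_identical)
```

Calling set.seed() immediately before each simulation is what makes the runs match; a single call at the top of a script fixes the whole sequence of draws, not each simulation separately.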

MA(q) Process
Setting a seed so that the simulated values are reproducible
set.seed(123)
ma1 = arima.sim(model = list(order = c(0,0,1), ma = 0.5), n = 52)
ma1

## Time Series:
## Start = 1
## End = 52
## Frequency = 1
## [1] -0.510415313 1.443619569 0.849862548 0.164541931
1.779708854
## [6] 1.318448699 -1.034603132 -1.319383469 -0.789088396
1.001250812
## [11] 0.971854726 0.580678364 0.311068441 -0.500499777
1.508992569
## [16] 1.391307047 -1.717691918 -0.281952677 -0.122113457 -
1.304219410
## [21] -0.751886768 -1.134991906 -1.241893453 -0.989484882 -
1.999212945
## [26] -0.005559611 0.572266640 -1.061450378 0.684746453
1.053371682
## [31] -0.081839372 0.747589920 1.325696318 1.260647825
1.099430795
## [36] 0.898237781 0.215047116 -0.336918519 -0.533452333 -
0.884942479
## [41] -0.555270767 -1.369354991 1.536257790 2.292439981 -
0.519127584
## [46] -0.964439127 -0.668097771 0.546637442 0.306613493
0.211633981
## [51] 0.098112502 -0.057143835

set.seed(123)
ma1n100 = arima.sim(model = list(order = c(0,0,1), ma = 0.5), n = 100)
ma1n100

## Time Series:
## Start = 1
## End = 100
## Frequency = 1
## [1] -0.510415313 1.443619569 0.849862548 0.164541931
1.779708854
## [6] 1.318448699 -1.034603132 -1.319383469 -0.789088396
1.001250812
## [11] 0.971854726 0.580678364 0.311068441 -0.500499777
1.508992569
## [16] 1.391307047 -1.717691918 -0.281952677 -0.122113457 -
1.304219410
## [21] -0.751886768 -1.134991906 -1.241893453 -0.989484882 -
1.999212945
## [26] -0.005559611 0.572266640 -1.061450378 0.684746453
1.053371682
## [31] -0.081839372 0.747589920 1.325696318 1.260647825
1.099430795
## [36] 0.898237781 0.215047116 -0.336918519 -0.533452333 -
0.884942479
## [41] -0.555270767 -1.369354991 1.536257790 2.292439981 -
0.519127584
## [46] -0.964439127 -0.668097771 0.546637442 0.306613493
0.211633981
## [51] 0.098112502 -0.057143835 1.347167055 0.458530156
1.403585112
## [56] -0.790517502 -0.189762652 0.416161119 0.277868691
0.487610267
## [61] -0.312503712 -0.584369110 -1.185179075 -1.581078918 -
0.232366972
## [66] 0.599974099 0.277109116 0.948769581 2.511218420
0.534011177
## [71] -2.554684459 -0.148845913 -0.206331500 -1.042608998
0.681567061
## [76] 0.228012678 -1.363104216 -0.429055376 -0.048239623 -
0.063681495
## [81] 0.388162494 -0.178019831 0.459046533 0.101701712
0.221538683
## [86] 1.262729995 0.983600997 -0.108340840 0.985841826
1.567907665
## [91] 1.045148887 0.512930215 -0.508540208 1.046699411
0.080066637
## [96] 1.887203199 2.626277123 0.530604954 -1.144271080 -
1.223617014

Find the best model


m1 = Arima(ar1, order = c(1,0,0), method = "ML")
m1

## Series: ar1
## ARIMA(1,0,0) with non-zero mean
##
## Coefficients:
## ar1 mean
## 0.8671 0.5915
## s.e. 0.0704 0.8526
##
## sigma^2 = 0.8226: log likelihood = -65.74
## AIC=137.48 AICc=138 BIC=143.22

# Without mean
m2 = Arima(ar1, order = c(1,0,0), include.mean = FALSE, method = "ML")
m2

## Series: ar1
## ARIMA(1,0,0) with zero mean
##
## Coefficients:
## ar1
## 0.8859
## s.e. 0.0619
##
## sigma^2 = 0.81: log likelihood = -65.94
## AIC=135.88 AICc=136.14 BIC=139.71

m3 = Arima(ar1, order = c(1,0,0), include.mean = FALSE, method = "CSS")
m3

## Series: ar1
## ARIMA(1,0,0) with zero mean
##
## Coefficients:
## ar1
## 0.8943
## s.e. 0.0663
##
## sigma^2 = 0.801: log likelihood = -65.4

m4 = Arima(ar1, order = c(1,0,0), include.mean = FALSE, method = "CSS-ML")
m4

## Series: ar1
## ARIMA(1,0,0) with zero mean
##
## Coefficients:
## ar1
## 0.8859
## s.e. 0.0619
##
## sigma^2 = 0.81: log likelihood = -65.94
## AIC=135.88 AICc=136.14 BIC=139.71
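The best model is the one whose estimate is closest to the true value used in the simulation (ar = 0.9). Here that is m3: CSS gives 0.8943, while ML and CSS-ML both give 0.8859. This is the result for one simulated series, so on its own it does not show that CSS always beats ML or CSS-ML.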
Refitting ar1 (the series was simulated with n = 50, so the estimates match those above)
m1 = Arima(ar1, order = c(1,0,0), include.mean = FALSE, method = "ML")
m1

## Series: ar1
## ARIMA(1,0,0) with zero mean
##
## Coefficients:
## ar1
## 0.8859
## s.e. 0.0619
##
## sigma^2 = 0.81: log likelihood = -65.94
## AIC=135.88 AICc=136.14 BIC=139.71

m2 = Arima(ar1, order = c(1,0,0), include.mean = FALSE, method = "CSS")
m2

## Series: ar1
## ARIMA(1,0,0) with zero mean
##
## Coefficients:
## ar1
## 0.8943
## s.e. 0.0663
##
## sigma^2 = 0.801: log likelihood = -65.4

m3 = Arima(ar1, order = c(1,0,0), include.mean = FALSE, method = "CSS-ML")
m3

## Series: ar1
## ARIMA(1,0,0) with zero mean
##
## Coefficients:
## ar1
## 0.8859
## s.e. 0.0619
##
## sigma^2 = 0.81: log likelihood = -65.94
## AIC=135.88 AICc=136.14 BIC=139.71

Again, the best model is the one whose estimate is closest to the true value (ar = 0.9). In this set of fits that is m2 (CSS, 0.8943), compared with m1 (ML) and m3 (CSS-ML), which both give 0.8859. As before, one simulated series does not show that CSS always gives the best model.
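
The comparison of estimation methods can be tabulated by measuring how far each estimate lies from the true coefficient. The sketch below uses base R's arima() instead of forecast::Arima(), with the same seed and simulation settings as ar1; the object names are illustrative.

```r
set.seed(123)
true_ar <- 0.9  # coefficient used in the simulation

# Same simulation settings as ar1 above.
x <- arima.sim(model = list(order = c(1, 0, 0), ar = true_ar), n = 50)

# Fit a zero-mean AR(1) with each estimation method and keep the ar1 estimate.
methods <- c("ML", "CSS", "CSS-ML")
estimates <- sapply(methods, function(m) {
  fit <- arima(x, order = c(1, 0, 0), include.mean = FALSE, method = m)
  unname(coef(fit)["ar1"])
})

# Absolute distance of each estimate from the true coefficient.
abs_error <- abs(estimates - true_ar)
print(round(cbind(estimate = estimates, abs_error = abs_error), 4))

closest <- names(which.min(abs_error))
print(closest)
```

Repeating this over many independently simulated series would be needed before drawing any general conclusion about which method is closest on average.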
Mixed (ARMA) and higher-order models
set.seed(123)
arma = arima.sim(model = list( order = c(1,0,1), ar = 0.9, ma = 0.5),
n = 100)
arma

## Time Series:
## Start = 1
## End = 100
## Frequency = 1
## [1] 2.190717222 2.459255767 1.900826479 1.126374721 -
0.171441826
## [6] -1.735376562 -1.794205877 -1.014811190 -0.636220955
0.376170721
## [11] 2.849772069 3.098806039 0.234240976 0.061970965 -
0.150557632
## [16] -1.178110866 -0.378732718 -0.112846769 -1.464666307 -
1.747255053
## [21] -1.620769170 -1.522373749 -0.981973880 -1.061796323 -
0.496570158
## [26] -0.345211430 -0.089151604 1.182493552 2.047845194
1.734719834
## [31] 2.547089677 3.860288374 4.519408424 4.580397797
3.613817809
## [36] 4.299135438 3.949288531 5.441562878 7.523683713
7.301920295
## [41] 5.427457186 3.661094454 3.196665436 2.758748868
2.011985443
## [46] 0.685397031 0.096020320 -0.721000044 -2.709294211 -
3.652562278
## [51] -2.558422702 -2.418429090 -1.856295340 -2.984566353 -
3.550613037
## [56] -2.703925512 -1.872675997 -1.429155522 -1.874107881 -
2.856754443
## [61] -4.020059962 -4.012471764 -4.499875904 -5.014183064 -
5.014135672
## [66] -2.796906195 -2.247234475 -2.113099406 -1.706135330 -
2.458398006
## [71] -2.764794609 -1.079418332 0.202302983 0.449057633
0.002271499
## [76] -2.262451289 -1.931492557 -2.633314766 -2.360355814
0.154757092
## [81] -0.350059993 -0.335216239 -0.212999937 -1.894942847 -
4.006188296
## [86] -5.964439467 -6.699670129 -7.756911962 -7.024181785 -
3.877696280
## [91] -3.726902658 -3.209988782 -1.726078239 -0.836746716 -
1.595347363
## [96] -2.059453538 -2.193629822 -1.551474975 -1.487271467 -
0.547790311
set.seed(123)
ar2 = arima.sim(model = list(order = c(2,0,0),ar = c(0.1,0.5)), n =
100)
ar2

## Time Series:
## Start = 1
## End = 100
## Frequency = 1
## [1] -1.466924827 -2.447190283 -0.140394397 -1.084261463 -
1.316760282
## [6] 0.580008161 -0.173915103 -0.022458913 0.805922218
0.947496253
## [11] 1.319291816 1.294317562 1.342995318 0.719546602
0.437489655
## [16] 0.023051266 -0.473657025 -0.243757348 -1.526600599
1.894417232
## [21] 0.634103422 -0.112489625 -0.097082087 -0.532608375
0.678163237
## [26] -0.281856930 0.564214440 -0.113053776 0.227931385
1.334868534
## [31] 0.021681560 2.186073028 -1.319304721 1.545719791 -
0.381226138
## [36] 0.950678851 0.284094299 0.001425402 -0.191017694 -
1.036964451
## [41] -1.270996519 -0.342053236 -0.221493804 -0.140171772
0.797503389
## [46] 2.059749139 0.113695442 -1.267924762 0.935793769 -
1.249583767
## [51] -0.345070108 0.366272475 -0.420680814 -1.079649556 -
0.137001883
## [56] -0.692416329 -0.131978388 0.025874398 -0.434061786
0.613907569
## [61] -0.376126698 0.601123079 0.968887972 0.832631827
0.241775583
## [66] 1.589301090 1.273321757 1.470379680 1.022430581
0.209526822
## [71] 1.892820421 -0.306214134 3.103121790 1.689815738
1.484842110
## [76] -0.033028820 0.028711609 0.243240460 -0.208012028 -
0.246723572
## [81] -1.080296938 -0.276419205 -1.352694859 -1.941421025 -
1.250716052
## [86] -0.176785509 -1.218383540 0.397733214 -2.187301157 -
0.075425474
## [91] -0.581785922 0.205262033 -0.164690563 -0.554544048 -
0.987504033
## [96] -1.400151218 -0.516120541 -1.699162277 -0.918533942 -
1.197526725
set.seed(123)
ma2 = arima.sim(model = list(order = c(0,0,2), ma = c(0.5,0.8)), n =
100)
ma2

## Time Series:
## Start = 1
## End = 100
## Frequency = 1
## [1] 0.995239052 0.665720557 1.411508582 1.836115568
1.421878888
## [6] 0.337448858 -0.950650504 -1.801137384 0.451768531
0.615325150
## [11] 1.559943802 0.598919503 -0.179882616 1.597538742
0.946634139
## [16] -0.288161408 0.116327706 -1.695407182 -0.743134689 -
1.130119894
## [21] -1.989250870 -1.416273385 -1.810288441 -2.582325928 -
0.505591025
## [26] -0.777088009 -0.391220742 0.807444947 0.142862132
0.921212565
## [31] 1.088761297 1.089639132 1.976748354 1.801937585
1.555502646
## [36] 0.765959319 0.106215604 -0.582981701 -1.129712610 -
0.859647568
## [41] -1.925120574 1.369923967 1.280122900 1.216037188
0.001930472
## [46] -1.566584638 0.224329573 -0.066710790 0.835606075
0.031417248
## [51] 0.145510976 1.324329651 0.424233791 2.498466939 -
0.971134291
## [56] 1.023413831 -0.822841125 0.745559690 0.586693662 -
0.139750457
## [61] -0.280657524 -1.587037837 -1.847644825 -1.047227278 -
0.257458882
## [66] 0.519932029 1.307337404 2.553621801 1.271825151 -
0.914616710
## [71] -0.541670846 -2.053666601 -0.238018178 0.114206451 -
0.322394215
## [76] -0.542647120 -0.656873782 -1.024813792 0.081361288
0.277049404
## [81] -0.173408483 0.767270854 -0.194826313 0.737039922
1.086340746
## [86] 1.249026569 0.769130370 1.333987018 1.307162397
1.964194982
## [91] 1.307733300 -0.069822641 1.237684799 -0.422258224
2.975725158
## [96] 2.146069453 2.280471348 0.081817421 -1.412177301 -
0.919456293
set.seed(123)
arma2 = arima.sim(model = list(order = c(2,0,2), ar = c(0.1,0.3), ma =
c(0.4,0.5)), n = 100)
arma2

## Time Series:
## Start = 1
## End = 100
## Frequency = 1
## [1] 1.16532635 -0.20584833 0.49264735 -1.18804739 -0.87727285 -
1.32564160
## [7] -2.04285228 -1.85025818 -2.22747949 -3.07898003 -1.12555176 -
1.39110791
## [13] -1.13467049 0.34444728 0.05296530 0.61105238 1.06732401
1.37869613
## [19] 2.07846412 2.07779468 2.07148300 1.33446218 0.70112260 -
0.06336101
## [25] -0.79587603 -0.77463148 -2.01214271 1.12523507 0.95172690
0.87729741
## [31] 0.12510054 -0.91366430 0.33802429 -0.24500756 0.68685998
0.02627985
## [37] 0.28105608 1.37317029 0.52186855 2.57460129 -0.64102936
1.43162538
## [43] -0.46582293 0.94069546 0.48226590 0.08793835 -0.19084342 -
1.39572290
## [49] -1.84265039 -1.23745745 -0.64281524 -0.05146630 0.96958285
2.52701218
## [55] 1.33371251 -0.58906409 0.17776274 -1.62043275 -0.57753411 -
0.14811569
## [61] -0.40662057 -0.90693800 -0.66205008 -1.01501523 -0.25925717 -
0.01228989
## [67] -0.29267192 0.65579858 -0.17028768 0.74548642 1.14277086
1.37773109
## [73] 0.87716488 1.73706154 1.72691673 2.21401245 1.69431871
0.57542070
## [79] 1.79469357 -0.01785608 3.16417785 2.41847499 2.66211124
0.63735789
## [85] -0.37645594 -0.38692759 -0.65077122 -0.49893290 -1.45910620 -
0.89503694
## [91] -1.80596040 -2.45352471 -2.22699612 -1.02582199 -1.16854261
0.41272298
## [97] -1.97166095 -0.47208209 -0.95046543 0.24446409

Best Model for arma2


m1 = Arima(arma2, order = c(2,0,2), include.mean = FALSE, method =
"ML")
m1

## Series: arma2
## ARIMA(2,0,2) with zero mean
##
## Coefficients:
## ar1 ar2 ma1 ma2
## 0.3227 0.1785 0.1862 0.4912
## s.e. 0.1992 0.1902 0.1817 0.1229
##
## sigma^2 = 0.7813: log likelihood = -128.16
## AIC=266.33 AICc=266.97 BIC=279.36

m2 = Arima(arma2, order = c(2,0,2), include.mean = FALSE, method = "CSS")
m2

## Series: arma2
## ARIMA(2,0,2) with zero mean
##
## Coefficients:
## ar1 ar2 ma1 ma2
## 0.3028 0.2105 0.2083 0.4780
## s.e. 0.1840 0.1774 0.1665 0.1254
##
## sigma^2 = 0.7774: log likelihood = -128.28

m3 = Arima(arma2, order = c(2,0,2), include.mean = FALSE, method = "CSS-ML")
m3

## Series: arma2
## ARIMA(2,0,2) with zero mean
##
## Coefficients:
## ar1 ar2 ma1 ma2
## 0.3227 0.1785 0.1862 0.4912
## s.e. 0.1992 0.1902 0.1817 0.1229
##
## sigma^2 = 0.7813: log likelihood = -128.16
## AIC=266.33 AICc=266.97 BIC=279.36

Cross Validation
library(readxl)
Practicals_data <-
read_excel("C:/Users/SOFO/Downloads/Practicals_data.xls")
View(Practicals_data)
attach(Practicals_data)

## The following objects are masked from Practicals_data (pos = 3):


##
## Maize, Precipitation, Temperature

Maize
##  [1]  9536  9283  9039  8537 12062 16026 11640 11089 11047 10643 10758 10358
## [13] 10525 11431 10741 10438 10703 10634 10615  8682 10161  9276  4300  9613
## [25] 10086 11845 10907 13907 12610 11889 15260 12040 15092 14933 15020 15153
## [37] 15285 14850 14557 14578 13150 14900 16272 15794 15613 14994 15437 17371
## [49] 16969 18874 16458 18712 17240 17291 19218 19500 20112 19474 20697

maize_ts = ts(Maize, start = c(1960,1), frequency = 12)


maize_ts

##        Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov   Dec
## 1960  9536  9283  9039  8537 12062 16026 11640 11089 11047 10643 10758 10358
## 1961 10525 11431 10741 10438 10703 10634 10615  8682 10161  9276  4300  9613
## 1962 10086 11845 10907 13907 12610 11889 15260 12040 15092 14933 15020 15153
## 1963 15285 14850 14557 14578 13150 14900 16272 15794 15613 14994 15437 17371
## 1964 16969 18874 16458 18712 17240 17291 19218 19500 20112 19474 20697

Splitting the data into training and test sets


traindata = window(maize_ts, start = c(1960,1), end = c(1964,1))
traindata

##        Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov   Dec
## 1960  9536  9283  9039  8537 12062 16026 11640 11089 11047 10643 10758 10358
## 1961 10525 11431 10741 10438 10703 10634 10615  8682 10161  9276  4300  9613
## 1962 10086 11845 10907 13907 12610 11889 15260 12040 15092 14933 15020 15153
## 1963 15285 14850 14557 14578 13150 14900 16272 15794 15613 14994 15437 17371
## 1964 16969

testdata = window(maize_ts, start = c(1964,2))


testdata

## Feb Mar Apr May Jun Jul Aug Sep Oct Nov
## 1964 18874 16458 18712 17240 17291 19218 19500 20112 19474 20697

length(testdata)
## [1] 10
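
The split above can be sanity-checked by confirming that the two windows do not overlap and together cover the whole series. The sketch below uses placeholder values (the integers 1 to 59, an assumption for illustration) with the same time index as maize_ts.

```r
# Placeholder values with the same monthly index as maize_ts (illustration only).
series <- ts(seq_len(59), start = c(1960, 1), frequency = 12)

# Same cut points as above: train up to Jan 1964, test from Feb 1964.
train <- window(series, end = c(1964, 1))
test <- window(series, start = c(1964, 2))

print(length(train))
print(length(test))

# The two parts should rebuild the full series with no gap or overlap.
rebuilt <- c(as.numeric(train), as.numeric(test))
print(identical(rebuilt, as.numeric(series)))
```

Keeping the test window at the end of the series matters for time series: the model is evaluated only on observations that come after everything it was fitted to.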

Best model
welch(traindata,freq = 12)

## Test used: Kruskall Wallis


##
## Test statistic: 1.5
## P-value: 0.2339556

The p-value (0.234) is above 0.05, so we conclude that there is no evidence of seasonality in the series.


m1 = auto.arima(traindata, stepwise = FALSE, approximation = FALSE)
m1

## Series: traindata
## ARIMA(0,1,1)
##
## Coefficients:
## ma1
## -0.4713
## s.e. 0.1384
##
## sigma^2 = 2979222: log likelihood = -425.5
## AIC=855 AICc=855.27 BIC=858.75

m2 = Arima(traindata, order = c(1,1,0))


m2

## Series: traindata
## ARIMA(1,1,0)
##
## Coefficients:
## ar1
## -0.3361
## s.e. 0.1340
##
## sigma^2 = 3156199: log likelihood = -426.82
## AIC=857.64 AICc=857.91 BIC=861.38

Since m1 has the smaller AIC (855 against 857.64 for m2), m1 is the best-fitting model on the training data.


Forecasting with each model to find the best forecasting model
f1 = forecast(m1, h = length(testdata))
f1

## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Feb 1964 16712.91 14500.90 18924.93 13329.93 20095.90
## Mar 1964 16712.91 14210.75 19215.08 12886.18 20539.65
## Apr 1964 16712.91 13950.91 19474.92 12488.79 20937.04
## May 1964 16712.91 13713.50 19712.33 12125.70 21300.13
## Jun 1964 16712.91 13493.54 19932.29 11789.31 21636.52
## Jul 1964 16712.91 13287.69 20138.14 11474.48 21951.35
## Aug 1964 16712.91 13093.52 20332.31 11177.52 22248.31
## Sep 1964 16712.91 12909.25 20516.58 10895.71 22530.12
## Oct 1964 16712.91 12733.50 20692.33 10626.93 22798.90
## Nov 1964 16712.91 12565.20 20860.63 10369.53 23056.30

f2 = forecast(m2, h = length(testdata))
f2

## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Feb 1964 17104.13 14827.37 19380.90 13622.120 20586.14
## Mar 1964 17058.71 14325.93 19791.49 12879.282 21238.13
## Apr 1964 17073.98 13818.77 20329.18 12095.566 22052.39
## May 1964 17068.84 13404.66 20733.03 11464.960 22672.73
## Jun 1964 17070.57 13026.47 21114.67 10885.653 23255.49
## Jul 1964 17069.99 12682.52 21457.46 10359.932 23780.05
## Aug 1964 17070.18 12363.14 21777.23 9871.385 24268.98
## Sep 1964 17070.12 12064.24 22076.00 9414.287 24725.95
## Oct 1964 17070.14 11782.17 22358.12 8982.881 25157.40
## Nov 1964 17070.13 11514.41 22625.86 8573.383 25566.88

Checking the Accuracy of the forecast model


a1 = accuracy(f1,testdata)
a1

##                     ME     RMSE      MAE        MPE     MAPE      MASE
## Training set  278.8484 1690.450 1125.740 -0.1651948 10.61235 0.4401930
## Test set     2044.6850 2417.379 2095.668 10.4641743 10.77395 0.8194595
##                      ACF1 Theil's U
## Training set -0.003953328        NA
## Test set      0.310518088  1.582129

a2 = accuracy(f2,testdata)
a2

##                     ME     RMSE      MAE       MPE      MAPE      MASE
## Training set  205.9387 1739.935 1147.436 -0.613419 10.889739 0.4486766
## Test set     1684.9202 2120.330 1805.062  8.537876  9.267864 0.7058250
##                    ACF1 Theil's U
## Training set -0.1002318        NA
## Test set      0.3168790  1.390824

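On the test set, m2 has the lower RMSE (2120.33 against 2417.38), MAE (1805.06 against 2095.67) and MAPE (9.27 against 10.77), so we conclude that the best model for forecasting is m2.

The model with the best in-sample fit (lowest AIC) is therefore not always the best forecasting model.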
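
The whole hold-out comparison can be condensed into one table that puts in-sample AIC next to out-of-sample error. The following is a minimal sketch using base R's arima() and predict() in place of forecast::Arima() and accuracy(), on the maize values printed earlier; the object names are illustrative.

```r
# Maize values copied from the output earlier in this document.
maize <- ts(c(
   9536,  9283,  9039,  8537, 12062, 16026, 11640, 11089, 11047, 10643, 10758, 10358,
  10525, 11431, 10741, 10438, 10703, 10634, 10615,  8682, 10161,  9276,  4300,  9613,
  10086, 11845, 10907, 13907, 12610, 11889, 15260, 12040, 15092, 14933, 15020, 15153,
  15285, 14850, 14557, 14578, 13150, 14900, 16272, 15794, 15613, 14994, 15437, 17371,
  16969, 18874, 16458, 18712, 17240, 17291, 19218, 19500, 20112, 19474, 20697
), start = c(1960, 1), frequency = 12)

# Same split as above.
train <- window(maize, end = c(1964, 1))
test <- window(maize, start = c(1964, 2))
h <- length(test)

# The two candidate orders compared above.
orders <- list("ARIMA(0,1,1)" = c(0, 1, 1), "ARIMA(1,1,0)" = c(1, 1, 0))

# For each order: fit on train, forecast h steps, score against test.
scores <- sapply(orders, function(ord) {
  fit <- arima(train, order = ord)
  pred <- predict(fit, n.ahead = h)$pred
  err <- as.numeric(test) - as.numeric(pred)
  c(AIC = fit$aic, RMSE = sqrt(mean(err^2)), MAE = mean(abs(err)))
})

print(round(scores, 2))
```

Reading across the rows shows directly whether the model preferred by AIC is also the one preferred by test-set RMSE and MAE.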
