
Master’s Degree in Economics and Finance

Department of Economics

Final Thesis

A Univariate Volatility Modelling combining the GARCH model with


the Volatility Index VIX

Supervisor: Ch.mo Prof. Giovanni Angelini


Assistant Supervisor: Ch.mo Prof. Monica Billio

Graduand:
Karim Hasouna,
N° 859207

Academic Year 2019-2020


Acknowledgements

This thesis would not have been possible without the support and inspiration
of the people mentioned here - I warmly thank everyone who has been part of
this journey with me and helped me complete this thesis. A heartfelt thank
you to my Professor, Giovanni Angelini, for his availability and promptness
in helping me with the writing of the thesis, and for giving me the idea
during his "Risk Measurement" course. I thank my friends and my family for
their continuous support and encouragement throughout my university studies.
None of this would have been possible without all of you.
Thank you.
Karim Hasouna

"We all know that art is not truth. Art is a lie that makes us realize truth, at
least the truth that is given us to understand. The artist must know the manner
whereby to convince others of the truthfulness of his lies". Pablo Picasso. 1923
Contents

1 Introduction and brief history 7

2 Chapter 1: The structure of the model 10


2.1 Time Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.2 Volatility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.3 Volatility Clusters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.4 Why does volatility change? . . . . . . . . . . . . . . . . . . . . . . . 16
2.5 Univariate Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.6 Moving Average models . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.7 EWMA model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.8 The ARCH/GARCH Model . . . . . . . . . . . . . . . . . . . . . . . . 21
2.8.1 ARCH Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.8.2 The GARCH model . . . . . . . . . . . . . . . . . . . . . . . . 24
2.9 Stochastic Volatility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.10 The VIX, the Index of the implied volatility . . . . . . . . . . . . 26
2.10.1 VXO . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.10.2 VIX as a Variance Swaps . . . . . . . . . . . . . . . . . . . . 30
2.10.3 Interpretation of the VIX . . . . . . . . . . . . . . . . . . . . 39
2.10.4 Alternatives of the VIX . . . . . . . . . . . . . . . . . . . . . 40
2.11 Realized Volatility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.12 Intro of the model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

3 Chapter 2: Empirical Research 43


3.1 Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.2 Distribution of Returns . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.3 Descriptive Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.4 Correlation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
3.5 Squared Residuals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.6 The Heteroskedasticity . . . . . . . . . . . . . . . . . . . . . . . . . . 62
3.7 The empirical EWMA . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
3.8 The structure of the GARCH VIX model . . . . . . . . . . . . . . 71
3.9 The Maximum Likelihood . . . . . . . . . . . . . . . . . . . . . . . . 72
3.10 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
3.11 Diagnosing volatility models . . . . . . . . . . . . . . . . . . . . . . 77
3.12 Backtesting and VaR . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

4 Concluding remarks 85

5 Bibliography 88

A Appendix 92
A.1 A Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
A.2 B Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94

Abstract

In this thesis I examine the characteristics of volatility in the financial
markets. In particular, volatility is extracted both from historical
volatility, via time series of past market prices, and from derivative
instruments providing an implied volatility. The first part explores the
causes of volatility, especially volatility clustering, and explains the
behavioural reactions of stockholders. It is a well-known fact that GARCH
models and many others are accurate and useful for estimating the conditional
variance. However, looking at historical returns alone may not be enough to
fit the model to the data. Our purpose is to create a non-linear univariate
model to evaluate the financial markets using the Generalised Autoregressive
Conditional Heteroskedasticity (GARCH) model with the CBOE Volatility Index
(VIX) as an exogenous variable. The exogenous variable VIX is independent of
the GARCH model but is included in the new model we want to build. Using the
daily rates of return of 10 major indices, we want to determine whether the
new model, created by adding an exogenous variable, is better than the plain
GARCH model. Therefore, the empirical analysis studies volatility by
implementing the GARCH with the exogenous implied volatility, which is
forward looking, being derived from the market price of a market-traded
derivative. The VIX is constructed as a variance swap, based on the S&P 500
Index, the core index for U.S. equities, and estimates expected volatility by
aggregating the weighted prices of S&P 500 put and call plain vanilla options
over a wide range of strike prices. By empirically examining the time series
of different world indices we hope to produce a more complete understanding
of the utility of the VIX within GARCH models.

JEL Codes: C580


1 Introduction and brief history

Volatility is defined as the standard deviation of the trading price series
over time and is essentially a measure of deviation from the expected value.
It is a symmetric measure, which considers both positive and negative
deviations. We can calculate volatility from historical volatility, which
measures a time series of past market prices, or from implied volatility,
using a derivative instrument.

The objective of this thesis is to study a method for modelling the
volatility of some indices' returns. A peculiar characteristic of stock
volatility is that it is not directly observable, but it can be estimated
from historical data or using an option pricing model. From the returns it
is not possible to observe the daily volatility, because there is only one
observation in a trading day, but it is possible to estimate the daily
volatility using the intraday data of the stock. Obviously, high-frequency
intraday returns contain only very limited information about the overnight
volatility. The returns can be analysed over different time horizons such as
years, months, days, hours or minutes, but each one has some problems and
choosing the appropriate one depends on the model. There are various
characteristics that are commonly seen in asset returns. The most important
is volatility clustering, where volatility may be high for certain time
periods and low for others; volatility also follows other properties, such
as evolving over time in a continuous fashion without diverging to infinity.
We notice that there is also a leverage effect, meaning that volatility
reacts differently to a big price increase than to a big price drop.

In option markets, if we consider that prices are governed by an econometric
model such as an option pricing model using the Black-Scholes formula, then
we can use the price to obtain the implied volatility. The implied volatility
is derived under the assumption that the price of the underlying asset
follows a Geometric Brownian Motion, and it might differ from the actual
volatility. The volatility index (VIX) of a market has become a financial
instrument. The VIX volatility index compiled by the Chicago Board Options
Exchange (CBOE) is an implied volatility, and futures on it started to trade
on March 26, 2004.¹

¹ See, Ruey S. Tsay (2010).[1]

One characteristic of financial time series is that they suffer from
heteroskedasticity, but we can address this problem using the ARCH
(Autoregressive Conditional Heteroskedasticity) model, which captures
heteroskedasticity by conditioning the variance. The implementation of the
ARCH model is the GARCH (generalized ARCH), providing a better model, more
complex but also more parsimonious. The GARCH model is a non-linear
statistical model for time series that describes the variance of the current
term or innovation as a function of the actual sizes of the previous time
periods' error terms. It is widely used in modelling and forecasting
volatility. The GARCH model depends only on information available in the
time series itself and not on any exogenous information.² The idea of the new
model is to use the GARCH model with

the implied volatility as an exogenous variable. The VIX is used with the
GARCH because the volatility index, seen as a general fear index for a stock
market, carries relevant information due to globalization and
interdependence. Intuitively it is reasonable to consider the VIX as an
additional variable in our forecasting exercise. This is because the VIX is
defined as a benchmark of expected short-term market volatility and provides
a forward-looking measure of volatility. The research on, and analysis of,
the relationship between the volatility of prices and their returns causes
many debates among econometrics, finance and statistics scholars. The
uncertainty of the future price is hard to forecast, and the best forecast of
the future comes from analysing the past time series. However, an analysis of
the past time series alone is not enough to forecast the future price,
because there are many other variables that influence the market, such as
the irrationality of financial market participants, macro and micro economic
data, and the financial market as a whole.

² See, Vega Ezpeleta (2015).[2]
2 Chapter 1: The structure of the model

2.1 Time Series

Financial data are usually presented as a set of observations y₁, y₂, ...,
yᵢ, ..., yₙ, where i is a time index. This kind of data is usually referred
to as a time series.³

³ See, Pastore (2018).[3]

The analysis of time series is essential when we fit a model that explains
how the observed data behave. The time series is defined

as the realization of a finite stochastic process, therefore a sequence of

observations of some quantity or quantities taken over time. When we

observe a time series, the fluctuations appear random, but often with

the same type of stochastic behaviour from one time period to the next.

One of the most useful methods for obtaining parsimony in a time se-

ries model is to assume stationarity. Weak stationarity means that mean,

variance, and covariance are unchanged by time shifts. Thus, the mean

and variance do not change with time and the correlation between two

observations depends only on the lag, the time distance between them.

A time series {y_t} is said to be strictly stationary if the joint
distribution of (y_{t_1}, ..., y_{t_k}) is identical to that of
(y_{t_1 + t}, ..., y_{t_k + t}) for all t, where k is an arbitrary positive
integer and (t_1, ..., t_k) is a collection of k positive integers. In other
words, strict stationarity requires that the joint distribution of
(y_{t_1}, ..., y_{t_k}) is invariant under time shifts. This is a very strong
condition that is hard to verify empirically. So, it is usually enough to
have a weakly stationary time series, where both the mean of y_t and the
covariance between y_t and y_{t-ι} are time invariant, with ι an arbitrary
integer. Therefore E(y_t) = µ, which is a constant, and
Cov(y_t, y_{t-ι}) = λ_ι.

When we analyse a time series, we can never be absolutely certain

that it is stationary but there are many different tests to evaluate if the

time series is stationary or not. Moreover, models are fitted to time series
data either to better understand the data or to predict future points in the
series. Usually, a good approach is to use the ARIMA (autoregressive
integrated moving average) model, applying the differencing step and
adjusting the parameters to describe a stationary and ergodic time series.
The model can be extended with seasonality and/or other variants.
Stationarity helps in evaluating the time series because the statistical
properties of the process do not change over time, and consequently it is
much easier to model and investigate. In any case, different models have
constraints or tests to evaluate and confirm that the time series is
stationary.

Many statistical models for time series assume that a random sample comes
from a normal distribution. It is well known from empirical evidence that
financial time series follow a leptokurtic shape (the series randomly display
outliers), therefore using the normal distribution could be an erroneous fit.
It is thus necessary to investigate how the distribution of the data differs
from a normal distribution and to evaluate it.⁴

⁴ See, Ruppert D., Matteson D. (2015).[4]

2.2 Volatility

Financial markets can move quite dramatically both up and down

and the stock prices may appear too volatile to be justified by changes in

fundamentals. Volatility, as a concept as well as a phenomenon, remains a
hot topic in modern financial markets and academic research. Justified
volatility can form the basis for efficient price discovery, as in the
relationship between risk and return, while volatility dependence implies
predictability, which is welcomed by traders and medium-term investors.
Equilibrium prices, obtained from asset pricing models (CAPM), are affected
by changes in volatility, investment management relies upon mean-variance
theory, and derivatives valuation depends upon reliable volatility forecasts.
Stockholders, like portfolio managers, risk arbitrageurs and corporate
treasurers, closely watch volatility trends, as changes in prices could have
a major impact on their investment and risk management decisions.
Information on conditional volatility is asymmetric, and early evidence
unveils that bad news in the futures market increases volatility in the cash
markets more than good news. This asymmetry is documented as the "leverage
effect". The leverage hypothesis is not the only force behind asymmetries;
many others may well contribute to their rise, such as noise trading,
high-frequency trading, flash crashes and irrational behaviour.

Typically, the standard deviation of the returns between daily close prices
of instruments is used as a measure of volatility.⁵

⁵ See, J. M. Bland, D. G. Altman (1996).[23]

In the mathematical approach, the standard deviation is simply the square root of

the variance. The square root of the variance is easier to interpret be-

cause it is restored to the original units of the data, while the variance is
no longer in the same unit of measurement as the original data. Therefore,
the standard deviation is a measure of the amount of variation or dispersion
of a set of values. A high standard deviation indicates that the values are
spread out over a wide range, increasing the risk, while a low standard
deviation means that the values tend to be close to the mean of the set. The
standard deviation is often used as a measure of the risk associated with the
fluctuations of the price of a given asset or, more globally, the risk of a
portfolio of assets. Portfolio managers evaluate risk carefully because it is
an important factor in determining how to efficiently manage a portfolio of
investments. Analysing the risk determines the variation in returns on an
asset or portfolio and gives stockholders a mathematical basis for investment
decisions. According to the capital asset pricing model, if the risk
increases, the expected return on an investment should increase as well, an
increase known as the risk premium. So, if the investment carries a higher
level of risk or uncertainty, stockholders expect a higher return. In
substance, the uncertainty is compensated by the return. When evaluating
investments, stockholders should estimate both the expected return and the
uncertainty of future returns. Standard deviation provides a quantified
estimate of the uncertainty of future returns.

Let X be a random variable with mean value µ:

$$E[X] = \mu \tag{1}$$

Here the operator E denotes the average or expected value of X. Then the
standard deviation of X is the quantity:

$$\sigma = \sqrt{E[(X-\mu)^2]} \tag{2}$$
$$= \sqrt{E[X^2] + E[-2\mu X] + E[\mu^2]} \tag{3}$$
$$= \sqrt{E[X^2] - 2\mu E[X] + \mu^2} \tag{4}$$
$$= \sqrt{E[X^2] - 2\mu^2 + \mu^2} \tag{5}$$
$$= \sqrt{E[X^2] - \mu^2} \tag{6}$$
$$= \sqrt{E[X^2] - (E[X])^2} \tag{7}$$

In other words, the standard deviation σ (sigma) is the square root

of the variance of X. The standard deviation of a univariate probability

distribution is the same as that of a random variable having that dis-

tribution. Anyway, not all random variables have a standard deviation,

since these expected values need not exist.
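To make the definition concrete, the following minimal Python sketch (not
part of the original thesis) computes the standard deviation of a return
series directly from equation (7); the return values are purely illustrative.

```python
import numpy as np

# Illustrative daily returns (hypothetical values, not real market data)
returns = np.array([0.002, -0.011, 0.005, 0.013, -0.007, 0.001])

mu = returns.mean()                                  # E[X]
sigma = np.sqrt((returns ** 2).mean() - mu ** 2)     # sqrt(E[X^2] - (E[X])^2), eq. (7)

# np.std uses the same population formula by default (ddof=0)
assert np.isclose(sigma, returns.std())
print(f"mean = {mu:.6f}, volatility (std) = {sigma:.6f}")
```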

2.3 Volatility Clusters

As documented in Bollerslev (1987),⁶ the general conclusion to emerge from
most studies is that price changes and rates of return are approximately
uncorrelated over time and well described by a unimodal symmetric
distribution with fatter tails than the normal. However, even though the time
series are serially uncorrelated, they are not independent. As noted by
Mandelbrot⁷ and by Hinich and Patterson⁸,

"..., large changes tend to be followed by large changes, of either sign,
and small changes tend to be followed by small changes,..."

This behaviour might very well explain the rejection of an independent
increments process for daily stock returns in Hinich and Patterson. The model
of this thesis works when there is volatility clustering, that is, when there
are periods of higher, and of lower, variation within each series. Volatility
clustering does not indicate a lack of stationarity but rather can be viewed
as a type of dependence in the conditional variance of each series. As such,
the variance of daily returns can be high one month and show low variance
the next. This occurs to such a degree that it makes an i.i.d. (independent
and identically distributed) model of log-prices or asset returns
unconvincing.

⁶ See, Bollerslev (1987).[5]
⁷ See, Mandelbrot (1963).[6]
⁸ See, Hinich and Patterson (1985).[7]

The market responds
to new information with large price movements, and these high-volatility
environments tend to endure for a while after the first shock. In other
words, when a market suffers a volatile shock, more volatility should be
expected. Volatility clustering is a non-parametric property, meaning that
there is no requirement that the data being analyzed meet certain
assumptions, or parameters.⁹ A standard way to remove volatility clusters is
by modeling returns with a GARCH(1,1) specification, a model developed in
Bollerslev (1986).¹⁰ The conditional error distribution is normal, but with
conditional variance equal to a linear function of past squared errors.
Thus, there is a tendency for extreme values to be followed by other extreme
values, but of unpredictable sign.

⁹ See, Rama Cont (2005).[8]
¹⁰ See, Bollerslev (1986).[9]

2.4 Why does volatility change?

The models focus on providing a statistical description of the
time-variation of volatility, but they do not go into depth on why volatility
varies over time. A number of explanations have been proffered to explain
this phenomenon, although, treated individually, none are completely
satisfactory.

• News Announcements: The arrival of unexpected news forces stockholders to
update their beliefs, changing their investment strategies or modifying the
weights of their asset portfolios. Many institutional investors rebalance
their portfolios, and periods of high volatility correspond to stockholders
dynamically solving for new asset prices.

Additionally, news-induced periods of high volatility are generally

short and the apparent resolution of uncertainty is far too quick to

explain the time-variation of volatility seen in asset prices.

• Leverage: When firms are financed using both debt and equity, only

the equity will reflect the volatility of the firms cash flows. However,

as the price of equity falls, a smaller quantity must reflect the same

volatility of the firm’s cash flows and so negative returns should lead

to increases in equity volatility.

• Volatility Feedback: Volatility feedback is motivated by a model where

the volatility of an asset is priced. When the price of an asset falls,

the volatility must increase to reflect the increased expected return

(in the future) of this asset, and an increase in volatility requires an

even lower price to generate a sufficient return to compensate an

investor for holding a volatile asset.

• Illiquidity: Intuitively, if the market is oversold, a small negative

shock will cause a small decrease in demand. However, since there

are few participants willing to buy (sell), this shock has a large ef-

fect on prices. If the market is overbought, a small positive shock

will cause a small decrease in demand but also here there will be a

large effect on prices if there are few participants.

• State Uncertainty: When the state is uncertain, like the actual situa-

tion due to Covid-19, slight changes in beliefs may cause large shifts

in portfolio holdings which in turn feedback into beliefs about the

state.

The actual cause of the time-variation in volatility is likely a combination
of these and some not presented here.¹¹

¹¹ See, Sheppard (2020).[10]

2.5 Univariate Models

In univariate models the volatility is a scalar (h_t) because the model works
with a single variable, while in multivariate models it is represented by a
symmetric, positive semi-definite square matrix with dimension equal to the
number of variables analyzed. Although a univariate time series data set is
usually given as a single column of numbers, time is in fact an implicit
variable in the time series. Market volatility is a latent variable, meaning
that it is not directly observable, unlike the market price. If prices
fluctuate a lot, we know volatility is high, but we cannot ascertain
precisely how high. Therefore volatility must be forecast by a statistical
model, a process that inevitably entails making strong assumptions. Indeed,
volatility modelling is quite demanding, and often seems to be as much an
art as a science because of challenges posed by issues such as
non-normalities, volatility clusters and structural breaks.

The presence of volatility clusters suggests that it may be more effi-
cient to use only the most recent observations to forecast volatility, or

assign a higher weight to the most recent observations.

The most commonly used models are:

1. Moving average (MA).

2. Exponentially weighted moving average (EWMA).

3. GARCH and its extension models.

4. Stochastic volatility.

5. Implied volatility.

6. Realized volatility.

Our model is a hybrid model that combines the GARCH with the implied
volatility, using the VIX Index as an exogenous variable. We usually assume
the mean return is zero. While this is obviously not correct, the daily mean
is orders of magnitude smaller than the volatility and can therefore usually
be safely ignored for the purpose of volatility forecasting. Conditional
volatility, σ_t, is typically, but not always, obtained from the application
of a statistical procedure to a sample of previous return observations,
making up the estimation window. Such methodologies provide conditional
volatility forecasts, represented by:

σ_t | past returns and a model = σ(y_{t-1}, ..., y_{t-W_E})

where various methods are used to specify the function σ(·).¹²

¹² See, Danielsson (2011).[11]
2.6 Moving Average models

The most obvious and easy way to forecast volatility is simply to cal-

culate the sample standard error from a sample of returns. Over time,

we would keep the sample size constant, and every day add the newest

return to the sample and drop the oldest. This method is called the Mov-

ing Average (MA) model. The observations are equally weighted, which

is problematic when financial returns exhibit volatility clusters, since

the most recent data are more indicative of whether we are in a high-

volatility or low-volatility cluster.
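As a rough illustration (not from the thesis), a moving-average volatility
forecast can be obtained with a fixed rolling window of past returns; the
window length W_E = 20 used below is an arbitrary choice.

```python
import numpy as np
import pandas as pd

# Hypothetical daily returns
rng = np.random.default_rng(0)
returns = pd.Series(rng.normal(0, 0.01, 500))

WE = 20  # estimation window length (arbitrary, for illustration only)

# MA forecast for day t: sample std of the previous WE returns (equal weights)
sigma_ma = returns.rolling(window=WE).std().shift(1)

print(sigma_ma.tail())
```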

2.7 EWMA model

The MA model can be improved by exponentially weighting returns,

so that the most recent returns have the biggest weight in forecasting

volatility. The best known such model is the exponentially weighted

moving average (EWMA) model.

$$\hat{\sigma}^2_t = (1-\lambda)\,y^2_{t-1} + \lambda\,\hat{\sigma}^2_{t-1} \tag{8}$$

where 0 < λ < 1 is the decay factor and σ̂²_t is the conditional volatility
forecast on day t. This model is not optimal because it uses a single decay
factor λ that is constant and identical for all assets. A constant λ is
obviously not realistic, but the model can be implemented easily and extended
to multivariate forms.¹³

¹³ See, Ruppert D., Matteson D. (2015).[4]
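A minimal sketch of the EWMA recursion in equation (8), assuming the returns
are already de-meaned; the value λ = 0.94 (the RiskMetrics convention) and
the initialisation with the sample variance are assumptions on my part, not
choices made in the thesis.

```python
import numpy as np

def ewma_variance(returns, lam=0.94):
    """Recursive EWMA conditional variance, eq. (8):
    sigma2[t] = (1 - lam) * y[t-1]**2 + lam * sigma2[t-1]."""
    sigma2 = np.empty(len(returns))
    sigma2[0] = returns.var()          # assumed initialisation
    for t in range(1, len(returns)):
        sigma2[t] = (1 - lam) * returns[t - 1] ** 2 + lam * sigma2[t - 1]
    return sigma2

rng = np.random.default_rng(1)
y = rng.normal(0, 0.01, 1000)          # hypothetical de-meaned returns
sigma = np.sqrt(ewma_variance(y))      # conditional volatility forecasts
print(sigma[-5:])
```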
2.8 The ARCH/GARCH Model

Our model uses the GARCH (Generalised Autoregressive Conditional
Heteroskedasticity) model with the implementation of the VIX. First of all,
I explain what the GARCH model is and why it is used. The model is useful in
the presence of volatility clustering, that is, when the market goes through
periods of high volatility and other periods when volatility is low. The
variance of this model is not constant and relies on the conditional
volatility, defined as the volatility in a given time period, conditional on
what happened before. These models are based on using optimal exponential
weighting of historical returns to obtain a volatility forecast. The first
such model was the Autoregressive Conditional Heteroskedasticity (ARCH)
model, proposed by Engle (1982),¹⁴ but the generalized ARCH model (GARCH) by
Bollerslev (1986)[9] is the common denominator for most volatility models.

¹⁴ See, Engle (1982).[13]

The GARCH framework builds on the notion of volatility depen-

dence to measure the impact of last period’s forecast error and volatility

in determining current volatility. Returns on day t are a function of re-

turns on previous days, where older returns have a lower weight than

more recent returns. The parameters of the model are typically esti-

mated with maximum likelihood. We want to study the statistical properties
of returns given the information available at time t-1 and create a model of
how the statistical properties of returns evolve over time.

The principle of the model is the conditional volatility of the random
variables Y_t, and it is useful to separate the estimation of the mean from
the estimation of the volatility in order to use a more efficient model. So,
E(Y_t) is equal to zero, i.e. the returns are de-meaned. The return on day t
can be written as

$$Y_t = \sigma_t \varepsilon_t, \qquad \varepsilon_t \sim N(\mu, \sigma^2_t) \tag{9}$$

In this case, the distribution of ε_t is normal, but there is no need to make
any further assumptions about the distribution because it is possible to
change the distribution to one that fits the model better, such as the
Student-t.

2.8.1 ARCH Model

The ARCH model is an Autoregressive Conditional Heteroskedasticity model;
therefore, the variance changes over time in a time series.

$$\sigma^2_t = \omega + \sum_{i=1}^{p} \alpha_i Y^2_{t-i} \tag{10}$$

where p is the number of lags. Setting the lag to one in the formula

will result in the ARCH(1) model which states that the conditional vari-

ance of today’s return is equal to a constant, plus yesterday’s return squared;

that is:

$$\sigma^2_t = \omega + \alpha Y^2_{t-1} \tag{11}$$

The unconditional volatility of the ARCH(1) model is given by:

$$\sigma^2 = \frac{\omega}{1-\alpha} \tag{12}$$

The most common distributional assumption for the residuals is
ε_t ∼ N(0, 1). In this case, returns are conditionally normal. However, the
unconditional distribution of the returns will be fat-tailed, which is easily
demonstrated by showing that the unconditional excess kurtosis exceeds zero:

$$\mathrm{Kurtosis} = \frac{E(Y^4)}{(E(Y^2))^2} \tag{13}$$

where, in the end, the unconditional kurtosis is:

$$\mathrm{Kurtosis} = \frac{3(1-\alpha^2)}{1-3\alpha^2} > 3 \quad \text{if } 3\alpha^2 < 1. \tag{14}$$

There are two main restrictions that are often imposed on the param-

eters of the ARCH model:

To ensure positive volatility forecasts for the ARCH model:

$$\omega > 0, \quad \alpha_i > 0 \quad \forall i = 1, \dots, p \tag{15}$$

To ensure covariance stationarity, so that the unconditional volatility is
defined, impose:

$$\sum_{i=1}^{p} \alpha_i < 1. \tag{16}$$
It is only the nonnegativity constraint that always has to be imposed

and, depending on the final application, we may or may not want to

impose covariance stationarity. In case of the ARCH(1) model, if α ≥

1 the unconditional volatility is no longer defined, as is clear from the

unconditional volatility of ARCH(1).

One of the biggest problems with the ARCH model concerns the long

lag lengths required to capture the impact of historical returns on cur-

rent volatility. By including lagged volatility during ARCH model cre-

ation, it has the potential to incorporate the impact of historical returns.

The result is a GARCH model.
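To illustrate the ARCH(1) recursion of equation (11), its unconditional
variance ω/(1−α) from equation (12) and the excess kurtosis of equation (14),
here is a small simulation sketch (the parameter values are purely
illustrative and not estimates from the thesis):

```python
import numpy as np

omega, alpha = 1e-5, 0.30              # illustrative ARCH(1) parameters
n = 100_000
rng = np.random.default_rng(2)

y = np.zeros(n)
sigma2 = np.zeros(n)
sigma2[0] = omega / (1 - alpha)        # start at the unconditional variance, eq. (12)
y[0] = np.sqrt(sigma2[0]) * rng.standard_normal()

for t in range(1, n):
    sigma2[t] = omega + alpha * y[t - 1] ** 2          # eq. (11)
    y[t] = np.sqrt(sigma2[t]) * rng.standard_normal()

print("sample variance:", y.var())
print("theoretical unconditional variance:", omega / (1 - alpha))
print("sample kurtosis:", (y ** 4).mean() / (y ** 2).mean() ** 2)
print("theoretical kurtosis:", 3 * (1 - alpha ** 2) / (1 - 3 * alpha ** 2))
```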

2.8.2 The GARCH model

Therefore, the GARCH(p,q) model is:

$$\sigma^2_t = \omega + \sum_{i=1}^{p} \alpha_i Y^2_{t-i} + \sum_{j=1}^{q} \beta_j \sigma^2_{t-j} \tag{17}$$

where p, q are the numbers of lags. Setting both lags to one gives the
GARCH(1,1) model. The most common version of the GARCH is the GARCH(1,1),
and Akgiray (1989)¹⁵ demonstrated that a GARCH(1,1) model is sufficient to
capture all volatility clustering.

The unconditional volatility of the GARCH(1,1) is given by:

$$\sigma^2 = E(\omega + \alpha Y^2_{t-1} + \beta\sigma^2_{t-1}) = \omega + \alpha\sigma^2 + \beta\sigma^2 \tag{18}$$


¹⁵ See, Akgiray (1989).[14]

where

$$\sigma^2 = \omega + \alpha\sigma^2 + \beta\sigma^2 \tag{19}$$

So,

$$\sigma^2 = \frac{\omega}{1-\alpha-\beta} \tag{20}$$

There are two main restrictions that are often imposed on the parameters of
the GARCH model:

- to ensure positive volatility forecasts: ω, α, β > 0
- to ensure covariance stationarity: α + β < 1

Therefore, unconditional variance is infinite when α + β = 1 and un-

defined when α + β > 1. We should not impose the constraint when all

we need is a forecast of conditional volatility, but it is necessary to pre-

dict unconditional volatility.
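A minimal sketch of the GARCH(1,1) variance filter of equation (17) with
p = q = 1; the parameter values are illustrative only. In practice ω, α and β
are estimated by maximum likelihood, for instance with dedicated routines in
MATLAB, R or the Python `arch` package, which is an option rather than the
procedure used in this thesis.

```python
import numpy as np

def garch11_variance(y, omega, alpha, beta):
    """GARCH(1,1) conditional variance filter, eq. (17) with p = q = 1."""
    sigma2 = np.empty(len(y))
    sigma2[0] = omega / (1 - alpha - beta)       # unconditional variance, eq. (20)
    for t in range(1, len(y)):
        sigma2[t] = omega + alpha * y[t - 1] ** 2 + beta * sigma2[t - 1]
    return sigma2

# Hypothetical de-meaned returns and illustrative parameter values
rng = np.random.default_rng(3)
y = rng.normal(0, 0.012, 1418)
sigma2 = garch11_variance(y, omega=2e-6, alpha=0.08, beta=0.90)
print("long-run volatility:", np.sqrt(2e-6 / (1 - 0.08 - 0.90)))
```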

2.9 Stochastic Volatility

The stochastic volatility models are those in which the variance of

a stochastic process is itself randomly distributed. The name derives

from the model’s treatment of the underlying security’s volatility as a

random process, governed by state variables such as the price level of

the underlying security, the tendency of volatility to revert to some long-

run mean value, and the variance of the volatility process itself, among

others. The volatility process is a function of an exogenous shock as well

as past volatilities, so the process σ_t is itself random, with an innovation
term that is not known at time t. However, these models cannot explain
long-observed features of the implied volatility surface such as the
volatility smile and skew, which indicate that implied volatility does tend
to vary with respect to strike price and expiry.¹⁶

¹⁶ See, Gatheral (2006).[12]

2.10 The VIX, the Index of the implied volatility

In 1993, CBOE Global Markets, Incorporated introduced the CBOE Volatility
Index (VIX), which was originally designed to measure the market's
expectation of 30-day volatility implied by at-the-money S&P 100 index
option prices. In 2003, CBOE, together with Goldman Sachs, updated the VIX
index to reflect a new way to measure expected volatility, one that continues
to be widely used by financial theorists, risk managers and volatility
traders alike. The new VIX index is based on the S&P 500 Index, the core
index for U.S. equities, and estimates expected volatility by aggregating
the weighted prices of S&P 500 puts and calls over a wide range of strike
prices. In 2014, Cboe enhanced the VIX index to include series of S&P 500
Weeklys. This allows the VIX index to be calculated with the S&P 500 index
option series that most precisely match the 30-day target timeframe for
expected volatility that the VIX index is intended to represent. This
extensive data set provides investors with a useful perspective of how
option prices have behaved in response to a variety of market conditions.
The VIX is a volatility index comprised of options rather than stocks,

with the price of each option reflecting the market’s expectation of fu-

ture volatility. Like conventional indices, the VIX Index calculation em-

ploys rules for selecting component options and a formula to calculate

index values. Some different rules and procedures apply when calculat-

ing the VIX index value to be used for the final settlement value of VIX

futures and options.

More specifically, the VIX index is intended to provide an instantaneous

measure of how much the market expects the S&P 500 Index will fluctu-

ate in the 30 days from the time of each tick of the VIX Index.

Intraday VIX Index values are based on snapshots of SPX option bid/ask

quotes every 15 seconds and are intended to provide an indication of the

fair market price of expected volatility at particular points in time. VIX

is the most popular indicator of volatility and has been regarded as the
world's premier barometer of investors' sentiment and market volatility.¹⁷

¹⁷ See, CBOE VIX (2019).[15]

Implied volatility, as measured by the VIX, reflects expectations, hence its
market name of "the fear gauge". In general, the VIX starts to rise during
times of financial stress and lessens as investors become complacent. Thus,
the greater the fear, the higher the volatility index would be.
2.10.1 VXO

The first VIX was renamed VXO; this old VIX index was based on the
Black-Scholes-Merton implied volatility of S&P 100 options. To construct the
old VIX, two puts and two calls for the strikes immediately above and below
the current index level are chosen. Nearby maturities (greater than eight
days) and second nearby maturities are chosen to achieve a complete set of
eight options. By inverting the BSM pricing formula using current market
prices, an implied volatility is found for each of the eight options. An
iterative search procedure can be used to find the implied σ. These
volatilities are then averaged, first the puts and the calls, then the high
and low strikes. Finally, an interpolation between maturities is done to
compute a 30 calendar day (22 trading day) implied volatility. Because the
BSM model assumes the index follows a geometric Brownian motion with
constant volatility, when in fact it does not, the old VIX will only
approximate the true risk-neutral implied volatility over the coming month.
In reality the price process is likely more complicated than geometric
Brownian motion.¹⁸

¹⁸ See, McAleer (2007).[24]

The Black-Scholes-Merton model and its extensions assume that the
probability distribution of the underlying asset at any given future time is
lognormal. This assumption is not the one made by traders. They assume that
the probability distribution of an equity price has a heavier left tail and
a less heavy right tail than the lognormal distribution. Traders use
volatility smiles to allow for non-lognormality. The volatility smile defines
the relationship between the implied volatility of an option and its strike
price.¹⁹

¹⁹ See, Hull (2018).[16]

The implied volatilities used to create the VXO are as follows:

Exercise price    Nearby contract (1)                Second nearby contract (2)
                  Call            Put                Call            Put
X_l (< S)         σ_{c,1}^{X_l}   σ_{p,1}^{X_l}      σ_{c,2}^{X_l}   σ_{p,2}^{X_l}
X_u (> S)         σ_{c,1}^{X_u}   σ_{p,1}^{X_u}      σ_{c,2}^{X_u}   σ_{p,2}^{X_u}

where X_l is the strike below the price of the current index, S is the price
of the index (for maturities greater than 8 days) and X_u is the strike above
the price of the current index. The table reports their corresponding option
implied volatilities.

The first step is to average the put and call implied volatilities for each
strike and maturity, reducing the number of volatilities to 4. Compute:

$$\sigma_1^{X_l} = \frac{\sigma_{c,1}^{X_l} + \sigma_{p,1}^{X_l}}{2}; \quad \sigma_1^{X_u} = \frac{\sigma_{c,1}^{X_u} + \sigma_{p,1}^{X_u}}{2}; \quad \sigma_2^{X_l} = \frac{\sigma_{c,2}^{X_l} + \sigma_{p,2}^{X_l}}{2}; \quad \sigma_2^{X_u} = \frac{\sigma_{c,2}^{X_u} + \sigma_{p,2}^{X_u}}{2} \tag{21}$$

Now average the implied volatilities above and below the index level as
follows:

$$\sigma_1 = \sigma_1^{X_l}\left(\frac{X_u - S}{X_u - X_l}\right) + \sigma_1^{X_u}\left(\frac{S - X_l}{X_u - X_l}\right); \qquad \sigma_2 = \sigma_2^{X_l}\left(\frac{X_u - S}{X_u - X_l}\right) + \sigma_2^{X_u}\left(\frac{S - X_l}{X_u - X_l}\right) \tag{22}$$

The final step in calculating the VXO is to interpolate between the two
maturities to create a 30 calendar day (22 trading day) implied volatility
index:

$$VXO = VIX_{old} = \sigma_1\left(\frac{N_{t_2} - 22}{N_{t_2} - N_{t_1}}\right) + \sigma_2\left(\frac{22 - N_{t_1}}{N_{t_2} - N_{t_1}}\right) \tag{23}$$

where N_{t_1} and N_{t_2} are the number of trading days to maturity of the
two contracts.²⁰

²⁰ See, Hao Zhou and Matthew Chesnes (2003).[19]
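A small sketch of the VXO computation in equations (21)-(23); the eight
implied volatilities, the index level, the strikes and the numbers of trading
days N_{t_1}, N_{t_2} are all hypothetical values chosen only for
illustration.

```python
# Hypothetical BSM implied volatilities (call/put, maturity 1/2, strike low/up)
iv = {
    ("call", 1, "low"): 0.21, ("put", 1, "low"): 0.23,
    ("call", 1, "up"):  0.19, ("put", 1, "up"):  0.20,
    ("call", 2, "low"): 0.22, ("put", 2, "low"): 0.24,
    ("call", 2, "up"):  0.20, ("put", 2, "up"):  0.21,
}
S, X_l, X_u = 100.0, 95.0, 105.0     # index level and surrounding strikes
N_t1, N_t2 = 15, 43                  # trading days to the two maturities

# Eq. (21): average put and call implied vols for each strike/maturity
sig = {(m, k): (iv[("call", m, k)] + iv[("put", m, k)]) / 2
       for m in (1, 2) for k in ("low", "up")}

# Eq. (22): interpolate between the strikes around the index level
w_low, w_up = (X_u - S) / (X_u - X_l), (S - X_l) / (X_u - X_l)
sigma1 = sig[(1, "low")] * w_low + sig[(1, "up")] * w_up
sigma2 = sig[(2, "low")] * w_low + sig[(2, "up")] * w_up

# Eq. (23): interpolate between maturities to a 22-trading-day horizon
vxo = sigma1 * (N_t2 - 22) / (N_t2 - N_t1) + sigma2 * (22 - N_t1) / (N_t2 - N_t1)
print(f"VXO = {vxo:.4f}")
```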

2.10.2 VIX as a Variance Swap

The new VIX is constructed as a variance swap. Three important

changes were made to update and improve the VIX. The new VIX is calculated
using a wide range of strike prices in order to incorporate information from
the volatility skew; the VXO used only at-the-money options. The new VIX uses
a newly developed formula to derive expected volatility directly from the
prices of a weighted strip of options; the VXO extracted implied volatility
from an option-pricing model (the BSM model).

The new VIX uses options on the S&P 500 Index, which is the primary U.S.
stock market benchmark. So, the new VIX provides a more precise and robust
measure of expected market volatility and creates a viable

cal and simpler because it uses a formula that derives the market expec-

tation of volatility directly from index option prices rather than an algo-

rithm that involves backing implied volatility out of an option-pricing

model.

We compute the VIX starting from a Markov process. A Markov process is a
particular type of stochastic process where only the current value of a
variable is relevant for forecasting the future. The past history of the
variable, and the way in which the present has emerged from the past, is
irrelevant. Predictions for the future are uncertain and must be expressed
in terms of probability distributions. The Markov property implies that the
probability distribution of the price at any particular future time does not
depend on the particular path followed by the price in the past. We consider
a particular type of Markov stochastic process, the Wiener process, with a
mean change of zero and a variance rate of 1 per year. Expressed formally, a
variable z follows a Wiener process if it has the following two properties:


Property 1. The change ∆z = ε√∆t, where ε has a standard normal distribution
φ(0,1).

Property 2. The values of ∆z for any two different short intervals of time,
∆t, are independent.

It follows from the first property that ∆z itself has a normal distribution
with mean = 0 and standard deviation = √∆t.

The second property implies that z follows a Markov process.

Consider the change in the value of z during a relatively long period of
time, T. This can be denoted by z(T) - z(0). It can be regarded as the sum of
the changes in z in N small time intervals of length ∆t, where:

$$N = \frac{T}{\Delta t}$$

Thus,

$$z(T) - z(0) = \sum_{i=1}^{N} \varepsilon_i \sqrt{\Delta t} \tag{24}$$

where the ε_i (i = 1, 2, ..., N) are distributed φ(0,1). We know from the
second property of Wiener processes that the ε_i are independent of each
other. It follows that z(T) - z(0) is normally distributed, with mean equal
to zero and standard deviation √T.

The differential notation d indicates the change ∆ taken in the limit as
∆t → 0.

The mean change per unit time for a stochastic process is known as the drift
rate and the variance per unit time is known as the variance rate. So, a
generalized Wiener process for a variable x can be defined in terms of dz as

$$dx = a\,dt + b\,dz \tag{25}$$

where a and b are constants.

The a dt term implies that x has an expected drift rate of a per unit of
time. The b dz term can be regarded as adding noise or variability to the
path followed by x. The amount of this noise or variability is b times a
Wiener process. A Wiener process has a variance rate per unit time of 1.0. It
follows that b times a Wiener process has a variance rate per unit time of
b².²¹

²¹ See, Reto R. Gallati (2003).[22]

In a small time interval ∆t, the change ∆x in the value of x is given by

$$\Delta x = a\,\Delta t + b\,\varepsilon\sqrt{\Delta t} \tag{26}$$

where, as before, ε has a standard normal distribution φ(0,1). Thus ∆x has a
normal distribution with mean = a∆t and standard deviation = b√∆t.

Similar arguments to those given for a Wiener process show that the change
in the value of x in any time interval T is normally distributed with mean
= aT and standard deviation = b√T.

To summarize, the generalized Wiener process has an expected drift rate of a
and a variance rate of b².
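A brief simulation sketch (not in the original text) of the generalized
Wiener process of equation (25), discretised as in equation (26); the drift a
and noise coefficient b are arbitrary illustrative values.

```python
import numpy as np

a, b = 0.05, 0.20        # illustrative drift rate and noise coefficient
T, n = 1.0, 252          # one year split into n steps
dt = T / n
rng = np.random.default_rng(4)

x = np.zeros(n + 1)
for t in range(n):
    # Eq. (26): delta_x = a*dt + b*eps*sqrt(dt), eps ~ N(0,1)
    x[t + 1] = x[t] + a * dt + b * rng.standard_normal() * np.sqrt(dt)

# Over [0, T] the change x(T) - x(0) is distributed N(a*T, b^2*T)
print("x(T) - x(0) =", x[-1] - x[0])
```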

Now, we discuss the stochastic process usually assumed for the price

of a non-dividend-paying stock. The stock price follows a generalized


Wiener process, but the assumption of a constant expected drift rate is
inappropriate and needs to be replaced by the assumption that the expected
return is constant. If S is the stock price at time t, then the expected
drift rate in S should be assumed to be µS for some constant parameter µ.
This means that in a short interval of time, ∆t, the expected increase in S
is µS∆t. The parameter µ is the expected rate of return on the stock. The
uncertainty of the process, the standard deviation, should be proportional
to the stock price, and therefore [16]:

$$\frac{dS}{S} = \mu\,dt + \sigma\,dz \tag{27}$$

The variable µ is the stock's expected rate of return. The variable σ is the
volatility of the stock price. The model represents the stock price process
in the real world.

Applying Itô's formula, we get:

$$d(\ln S_t) = \left(\mu - \frac{\sigma^2}{2}\right)dt + \sigma\,dZ_t \tag{28}$$

By subtracting these two equations we obtain

$$\frac{dS_t}{S_t} - d(\ln S_t) = \frac{\sigma^2}{2}\,dt \tag{29}$$

Integrating between time 0 and time T, the realized average variance rate,
V̄, between time 0 and time T is given by

$$\bar{V} = \frac{1}{T}\int_0^T \sigma^2\,dt = \frac{2}{T}\left(\int_0^T \frac{dS_t}{S_t} - \ln\frac{S_T}{S_0}\right) \tag{30}$$

Taking expectations in a risk-neutral world,

$$\hat{E}(\bar{V}) = \frac{2}{T}\ln\frac{F_0}{S_0} - \frac{2}{T}\hat{E}\left[\ln\frac{S_T}{S_0}\right] \tag{31}$$

where F_0 is the forward price of the asset for a contract maturing at
time T.

Consider

$$\int_{K=0}^{S^*} \frac{1}{K^2}\max(K - S_T, 0)\,dK \tag{32}$$

for some value S* of S. When S* < S_T this integral is zero. When S* > S_T it
is

$$\int_{K=S_T}^{S^*} \frac{1}{K^2}(K - S_T)\,dK = \ln\frac{S^*}{S_T} + \frac{S_T}{S^*} - 1 \tag{33}$$

Consider next

$$\int_{K=S^*}^{\infty} \frac{1}{K^2}\max(S_T - K, 0)\,dK \tag{34}$$

When S* > S_T this is zero. When S* < S_T it is

$$\int_{K=S^*}^{S_T} \frac{1}{K^2}(S_T - K)\,dK = \ln\frac{S^*}{S_T} + \frac{S_T}{S^*} - 1 \tag{35}$$

From these results it follows that

$$\int_{K=0}^{S^*} \frac{1}{K^2}\max(K-S_T,0)\,dK + \int_{K=S^*}^{\infty} \frac{1}{K^2}\max(S_T-K,0)\,dK = \ln\frac{S^*}{S_T} + \frac{S_T}{S^*} - 1 \tag{36}$$

for all values of S*, so that

$$\ln\frac{S_T}{S^*} = \frac{S_T}{S^*} - 1 - \int_{K=0}^{S^*}\frac{1}{K^2}\max(K-S_T,0)\,dK - \int_{K=S^*}^{\infty}\frac{1}{K^2}\max(S_T-K,0)\,dK \tag{37}$$

This shows that a payoff of ln(S_T/S*) can be replicated using options. This
result can be used in conjunction with equation (30) to provide a replicating
portfolio for V̄.

Taking expectations in a risk-neutral world,

$$\hat{E}\left[\ln\frac{S_T}{S^*}\right] = \frac{F_0}{S^*} - 1 - e^{RT}\int_{K=0}^{S^*}\frac{1}{K^2}P(K)\,dK - e^{RT}\int_{K=S^*}^{\infty}\frac{1}{K^2}C(K)\,dK \tag{38}$$

where C(K) and P(K) are the prices of European call and put options with
strike price K and maturity T, and R is the risk-free interest rate for a
maturity of T. Combining equations (31) and (38), and noting that

$$\hat{E}\left[\ln\frac{S_T}{S_0}\right] = \ln\frac{S^*}{S_0} + \hat{E}\left[\ln\frac{S_T}{S^*}\right] \tag{39}$$

we obtain

$$\hat{E}(\bar{V}) = \frac{2}{T}\ln\frac{F_0}{S_0} - \frac{2}{T}\ln\frac{S^*}{S_0} - \frac{2}{T}\left[\frac{F_0}{S^*} - 1\right] + \frac{2}{T}\left[\int_{K=0}^{S^*}\frac{1}{K^2}e^{RT}P(K)\,dK + \int_{K=S^*}^{\infty}\frac{1}{K^2}e^{RT}C(K)\,dK\right] \tag{40}$$

which reduces to

$$\hat{E}(\bar{V}) = \frac{2}{T}\ln\frac{F_0}{S^*} - \frac{2}{T}\left[\frac{F_0}{S^*} - 1\right] + \frac{2}{T}\left[\int_{K=0}^{S^*}\frac{1}{K^2}e^{RT}P(K)\,dK + \int_{K=S^*}^{\infty}\frac{1}{K^2}e^{RT}C(K)\,dK\right] \tag{41}$$

This result is the variance swap formula from which the VIX formula is
derived. Variance swaps offer straightforward and direct exposure to the
volatility of an underlying such as a stock or index. They are swap contracts
where the parties agree to exchange a pre-agreed variance level for the
actual amount of variance realised over a period.

The strike of a variance swap, not to be confused with the strike of an
option, represents the level of volatility bought or sold and is set at trade
inception. The strike is set according to prevailing market levels so that
the swap initially has zero value. If the subsequent realised volatility is
above the level set by the strike, the buyer of a variance swap will be in
profit; and if realised volatility is below, the buyer will be in loss. A
buyer of a variance swap is therefore long volatility. Similarly, a seller of
a variance swap is short volatility and profits if the level of variance sold
(the variance swap strike) exceeds that realised.

The ln function can be approximated by the first two terms of a series
expansion:

$$\ln\frac{F_0}{S^*} \approx \left(\frac{F_0}{S^*} - 1\right) - \frac{1}{2}\left(\frac{F_0}{S^*} - 1\right)^2 \tag{42}$$

This means that the risk-neutral expected cumulative variance is calculated
as

$$\hat{E}(\bar{V}) = \sigma^2 = VIX^2 = \frac{2}{T}\sum_i \frac{\Delta K_i}{K_i^2}\,e^{RT}\,Q(K_i) - \frac{1}{T}\left[\frac{F}{K_0} - 1\right]^2 \tag{43}$$

Where:

σ × 100 = VIX

T = time to expiration

F = forward index level derived from index option prices: F = K_0 + e^{RT}(C_0 − P_0)

K_0 = first strike below the forward index level F

K_i = strike price of the i-th out-of-the-money option: a call if K_i > K_0, a put if K_i < K_0, and both the put and the call if K_i = K_0

∆K_i = interval between strike prices, half the difference between the strikes on either side of K_i: ∆K_i = (K_{i+1} − K_{i−1})/2

R = risk-free interest rate to expiration

Q(K_i) = the midpoint of the bid-ask spread for each option with strike K_i.

So, the constant 30-day volatility index VIX is:

$$VIX = 100 \times \sqrt{\left\{T_1\sigma_1^2\left(\frac{N_{T_2} - N_{30}}{N_{T_2} - N_{T_1}}\right) + T_2\sigma_2^2\left(\frac{N_{30} - N_{T_1}}{N_{T_2} - N_{T_1}}\right)\right\}\times\frac{N_{365}}{N_{30}}} \tag{44}$$

where N_{T_1} and N_{T_2} are the numbers of minutes to expiration of the
near-term and next-term options, and N_{30} and N_{365} are the numbers of
minutes in 30 days and in a 365-day year.
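A simplified sketch of equation (43) for a single expiration, using a
hypothetical strip of out-of-the-money option mid-quotes; the strikes,
quotes, forward level and rate are invented for illustration, and the
interpolation across the two maturities in equation (44) is omitted.

```python
import numpy as np

R, T = 0.01, 30 / 365                      # risk-free rate and time to expiration
F = 3000.0                                 # hypothetical forward index level
strikes = np.array([2800, 2850, 2900, 2950, 3000, 3050, 3100, 3150])
quotes  = np.array([ 3.1,  4.6,  7.0, 11.2, 18.5, 12.0,  7.5,  4.4])  # OTM mid-quotes Q(K)

K0 = strikes[strikes <= F].max()           # first strike at or below the forward level

# Delta K_i: half the distance between neighbouring strikes (one-sided at the edges)
dK = np.gradient(strikes.astype(float))

# Eq. (43): sigma^2 = (2/T) * sum(dK/K^2 * e^{RT} * Q(K)) - (1/T) * (F/K0 - 1)^2
sigma2 = (2 / T) * np.sum(dK / strikes**2 * np.exp(R * T) * quotes) \
         - (1 / T) * (F / K0 - 1) ** 2
print("VIX-style value:", 100 * np.sqrt(sigma2))
```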

2.10.3 Interpretation of the VIX

The VIX is quoted in percentage points and represents the expected range of
movement in the S&P 500 index over the next year, at a 68% confidence level
(one standard deviation of the normal probability distribution). For example,
if the VIX is 30, this represents an expected annualized change, with a 68%
probability, of less than 30% up or down. The expected volatility range for a
single month can be calculated by dividing the VIX by √12, which would imply
a range of +/- 8.67% over the next 30-day period. Similarly, the expected
volatility for a week would be 30 divided by √52, or +/- 4.16%. The VIX uses
calendar-day annualization, so the daily conversion of 30% is 30 divided by
√365, or +/- 1.57% per day. The calendar-day approach does not account for
the number of days on which the financial markets are open in a calendar
year. In the financial markets, trading days typically amount to 252 days out
of a given calendar year.
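The conversions above can be reproduced with a few lines of arithmetic (a
simple sketch, not part of the thesis; the trading-day variant with 252 days
is added only as a comparison suggested by the last sentence):

```python
import math

vix = 30.0                                   # annualized expected volatility in %
monthly = vix / math.sqrt(12)                # ~8.66% expected range over ~30 days
weekly = vix / math.sqrt(52)                 # ~4.16% expected range over a week
daily_calendar = vix / math.sqrt(365)        # ~1.57% per calendar day
daily_trading = vix / math.sqrt(252)         # ~1.89% per trading day
print(monthly, weekly, daily_calendar, daily_trading)
```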

2.10.4 Alternatives of the VIX

The VIX is not the only volatility index; there are many different
instruments calculated according to a variance-swap-style calculation (VIX
for the S&P 500, VXN for the Nasdaq, VSTOXX for the Euro Stoxx 50, VDAX for
the DAX and VSMI for the SMI). They represent the theoretical level of a
rolling 1-month (30 calendar day) maturity variance swap, based on traded
option prices. In fact, theoretical variance swap levels are first calculated
for listed option maturities, and then the 30-day index level is
interpolated. So, each index represents the risk-neutral expected variance
of the underlying over the next month.

2.11 Realized Volatility

Realized volatility measures what actually happened in the past and

is based on taking intraday data, sampled at regular intervals (e.g., ev-

ery 10 minutes), and using the data to obtain the covariance matrix.

The main advantage is that it is purely data driven and there is no re-

liance on parametric models. The downside is that intraday data need

to be available; such data are often difficult to obtain, hard to use, not

very clean and frequently very expensive. In addition, it is necessary

to deal with diurnal patterns in volume and volatility when using real-

ized volatility (i.e., address systematic changes in observed trading vol-

ume and volatility throughout the day). Moreover, the particular trading

platform in use is likely to impose its own patterns on the data. All these

issues complicate the implementation of realized volatility models.

2.12 Intro of the model

The objective is to compute a model that combines the Generalised
Autoregressive Conditional Heteroskedasticity (GARCH) model with the
Volatility Index (VIX) as an exogenous variable. We build a model that uses
both the historical volatility and the implied volatility through their
respective models. The GARCH model is useful when there is volatility
clustering and works on the time series of returns. The VIX is useful during
periods of high uncertainty, when the value of the index increases rapidly.
The formula of the VIX uses S&P 500 option prices with, on average, a 30-day
maturity.

Now we compute the GARCH model adding the VIX as an exogenous variable, and
we evaluate whether this model improves the performance in the empirical
analysis. This model is a plain vanilla GARCH(1,1) model with the VIX(1) as
an exogenous variable in the variance equation. The model is given by

$$\sigma^2_t = \omega + \alpha Y^2_{t-1} + \beta\sigma^2_{t-1} + \delta\,VIX_{t-1} \tag{45}$$

Conditions for a non-negative conditional variance are the same as for the
GARCH, with the addition of the parameter δ > 0 that captures the effect of
the exogenous variable VIX. Therefore,

$$\alpha \geq 0; \tag{46}$$
$$\beta \geq 0; \tag{47}$$
$$\alpha + \beta < 1; \tag{48}$$
$$\omega + \delta > 0 \tag{49}$$

So, as in the GARCH model, all parameters must be positive and the variance
cannot be negative. The new model is an evolution of the analysis of
conditional heteroskedasticity, using an independent variable that
incorporates the reaction of implied volatility, and it could be useful to
forecast volatility clustering. The GARCH model alone may not be enough to
fit a time series because it analyses the historical variance but does not
consider derivative instruments. A GARCH-VIX approach could anticipate
volatility by using the options market as an exogenous indicator. We compare
the model with the EWMA and the GARCH model to determine whether the new
model is more accurate than the others already available. An empirical
analysis can help to determine the accuracy of the model.
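A minimal sketch of the variance recursion in equation (45). The parameter
values are illustrative (in the empirical part they are estimated by maximum
likelihood), and the rescaling of the VIX to a daily variance,
(VIX/100)²/252, is an assumption made here for the sketch rather than a
convention stated at this point of the thesis.

```python
import numpy as np

def garchx_variance(y, vix, omega, alpha, beta, delta):
    """GARCH(1,1) variance with the lagged VIX as exogenous variable, eq. (45):
    sigma2[t] = omega + alpha*y[t-1]**2 + beta*sigma2[t-1] + delta*vix[t-1]."""
    sigma2 = np.empty(len(y))
    sigma2[0] = y.var()                      # assumed initialisation
    for t in range(1, len(y)):
        sigma2[t] = (omega + alpha * y[t - 1] ** 2
                     + beta * sigma2[t - 1] + delta * vix[t - 1])
    return sigma2

# Hypothetical inputs: de-meaned index returns and a rescaled VIX series
rng = np.random.default_rng(5)
y = rng.normal(0, 0.012, 1418)
vix_daily_var = (rng.uniform(12, 35, 1418) / 100) ** 2 / 252  # assumed rescaling

sigma2 = garchx_variance(y, vix_daily_var,
                         omega=1e-6, alpha=0.05, beta=0.85, delta=0.05)
print(np.sqrt(sigma2[-5:]))
```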

3 Chapter 2: Empirical Research

3.1 Data

As a demonstration of the reliability of the model, we consider the daily
log return series of the closing prices of 10 major global indices, using
Bloomberg as the data source. I downloaded the futures contracts of the
following 10 equity indices:

DAX 30 (Germany)

CAC 40 (France)

FTSE 100 (England)

FTSE MIB (Italy)

S&P 500 (USA)

NASDAQ (USA - Tech)

HANG SENG (Hong Kong)

FTSE CHINA A50 (China)

TOPIX (Japan)

SPI 100 (Australia)

Figure 1: Time series of the Indices (x-axis: dates; y-axis: % benefits).

In selecting the data, I diversified across different markets around the
world in order to test the model on different time series, looking to
diversify and reduce the correlation among them. Every index is composed of
different sectors and weights, and a direct comparison among them could bend
the truth, entailing wrong assumptions. We downloaded futures on the indices
because they are highly liquid, ensuring smaller transaction costs due to
bid/ask spreads and efficient asset pricing.

The period analyzed starts on 10 November 2014 and ends on 26 June 2020,
with 1418 observations. I managed the data by computing the logarithmic
daily returns of every time series and aligned the series with the VLOOKUP
function of Excel. I also used MATLAB and RStudio to manage and analyze the
empirical research. So, for a better interpretation of the data graphs, the
x axis shows values starting from 0, corresponding to 10 November 2014; 500
corresponds to 10 October 2016; 1000 corresponds to 29 October 2018; and the
final observation, 1418, corresponds to 26 June 2020.
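A sketch of the data preparation step described above, assuming the closing
prices have already been exported from Bloomberg to a CSV file (the file
name and column layout are hypothetical); here pandas plays the alignment
role of VLOOKUP by joining the series on their dates.

```python
import numpy as np
import pandas as pd

# Hypothetical file with a date column and one closing-price column per index
prices = pd.read_csv("futures_closing_prices.csv", index_col="Date", parse_dates=True)

# Keep only dates on which all ten indices have a quote (the VLOOKUP step)
prices = prices.dropna()

# Logarithmic daily returns: r_t = ln(P_t) - ln(P_{t-1})
log_returns = np.log(prices).diff().dropna()

print(log_returns.shape)      # roughly (1418, 10) for the sample described above
print(log_returns.head())
```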

Figure 2: Logarithmic daily returns of the indices.

As we can see from Figure 2, the returns of the indices are plotted over
time and the volatility tends to vary. The differenced daily log return
series show a higher degree of volatility in the last 6 months relative to
the entire time series analyzed. This happened because of Covid-19, a
pandemic that, as a black swan, destabilized the global markets, increasing
the uncertainty and, therefore, the volatility. Transforming the data makes
it possible to remove the trend of the time series and make the series
stationary. The graphs show that the indices exhibit different ranges of
daily returns and different levels of volatility, but the volatility also
differs within every single time series, showing possible volatility
clustering. Usually the returns of a time series are assumed to follow a
normal distribution, but the empirical analysis shows that assuming
normality is wrong, and it is almost impossible to find data that roughly
fit a bell-curve shape. A common characteristic of asset returns is that
they have a fat-tailed distribution, exhibit positive or negative skewness
and suffer from leptokurtosis.

3.2 Distribution of Returns

To determine the distribution, there are many statistical tests useful for
checking whether the returns follow a normal distribution or not. In any
case, there are many parametric and non-parametric distributions that can be
used to fit the returns. We computed the Jarque-Bera test, which tests for
normality of the distribution. It is a goodness-of-fit test of whether
sample data have skewness and kurtosis matching a normal distribution. A
normal distribution has a skewness of 0 and a kurtosis of 3. If the test
statistic is far from zero, it signals that the data do not have a normal
distribution. The Jarque-Bera statistic asymptotically has a chi-squared
distribution with 2 degrees of freedom, so the statistic can be used to test
the hypothesis that the data come from a normal distribution. The null
hypothesis is a joint hypothesis of the skewness being zero and the excess
kurtosis being zero. To avoid the risk of falsely rejecting, or failing to
reject, H0, it is important to consider the p-value and the significance
level alpha of the statistic. For example, if the p-value of a test statistic
is 0.059, then there is a probability of 5.9% of falsely rejecting H0,
because the p-value is higher than 0.05.

Table 1: Jarque-Bera test

X-squared df p-value
DAX 30 5313.1 2 0
CAC 40 10652 2 0
FTSE 100 8339.6 2 0
FTSE MIB 19397 2 0
S&P 500 21717 2 0
NASDAQ 8910.4 2 0
HANG SENG 1291.3 2 0
CHINA 50 10884 2 0
TOPIX 3381.6 2 0
SPI 100 12911 2 0

As we can see, the test gives strong evidence that the distribution is not normal: the p-values are small enough to reject the null hypothesis for every index. We can then investigate what the empirical distribution looks like and how it differs from a normal one. First, I compute the histogram of each series; then I calculate the normal density function with the sample mean and sample standard deviation of the series (dashed line); finally, I compute the kernel density estimate from the sample (solid line).
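One panel of figure 3 could be reproduced with a sketch along these lines (object and column names are hypothetical):

```r
# Hedged sketch: histogram with fitted normal density (dashed) and
# kernel density estimate (solid) for one index.
r  <- log_returns$DAX30
mu <- mean(r); s <- sd(r)
hist(r, breaks = 50, freq = FALSE, main = "DAX30 returns", xlab = "")
curve(dnorm(x, mean = mu, sd = s), add = TRUE, lty = 2)  # normal density
lines(density(r))                                        # empirical density
```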

Figure 3: Distribution analysis (histograms of the daily returns with the fitted normal density, dashed line, and the empirical kernel density, solid line, for the DAX 30, CAC 40, FTSE 100, FTSE MIB, S&P 500, NASDAQ, HANG SENG, CHINA 50, TOPIX and SPI 100)

The histograms present the returns of the indices and show clear outliers. The dashed line represents the normal distribution, while the solid line represents the empirical distribution, which is quite different from the normal. The figures show that a normal distribution does not fit the sample estimate of the density of returns for the major indices. With the Shapiro test, every p-value was below the 0.05 significance level, so the null hypothesis of normality was rejected for every series. The t-distribution has played an extremely important role in classical statistics because of its use in testing and confidence intervals when the data are modelled as normally distributed. More recently, t-distributions have gained added importance as models for heavy-tailed phenomena such as financial market data. However, fitting a Student-t distribution reliably might require at least 3000 observations, while this sample contains only 1418.

3.3 Descriptive Statistics

Another approach to evaluating the time series is simply to build a table of descriptive statistics: the mean, standard deviation, maximum and minimum, skewness and kurtosis. Moreover, I report the Ljung-Box test with its p-value and, in the ACF rows, the first five lags of the autocorrelation function. The Ljung-Box test is a statistical test of whether any of a group of autocorrelations of a time series are different from zero. Instead of testing randomness at each distinct lag, it tests the overall randomness based on a number of lags, and is therefore a portmanteau test. The null hypothesis of the Ljung-Box test is not rejected when the data are independently distributed, and the critical region for rejection of the hypothesis of randomness is determined from a chi-squared distribution with h degrees of freedom (lags = degrees of freedom).

The following table 2 shows the output for each time series.
Table 2: Descriptive statistics and Ljung-box test

DAX 30 CAC 40 FTSE 100 FTSE MIB S&P 500 NASDAQ HANG SENG CHINA 50 TOPIX SPI 100

mean 2.21E-04 1.30E-04 -4.69E-05 5.54E-05 2.78E-04 6.11E-04 1.43E-04 4.22E-04 1.69E-04 -2.18E-05

std 0.0131 0.013 0.0111 0.0152 0.0117 0.0132 0.012 0.0181 0.0125 0.0114

min -0.1175 -0.1322 -0.1008 -0.169 -0.1095 -0.1148 -0.0768 -0.1598 -0.0845 -0.1027

max 0.1006 0.0824 0.0829 0.0817 0.0935 0.0927 0.0557 0.1611 0.0883 0.0725

skewness -0.5952 -1.2946 -0.9968 -1.7525 -0.9001 -0.8143 -0.5607 -0.4358 -0.4123 -1.1113

kurtosis 12.4079 16.1753 14.7122 20.777 22.0872 15.1721 7.5384 16.5448 10.5202 17.6142

Ljung-Box h 0.0000 0.0000 0.0000 0.0000 1.0000 1.0000 1.0000 1.0000 0.0000 1.0000

P-value 0.2273 0.4217 0.8822 0.3527 1.41E-11 2.01E-14 0.0028 8.32E-05 0.122 1.44E-08

Stats 6.9114 4.9526 1.7524 5.5482 59.688 73.4015 18.0867 26.1569 8.6912 45.0106

ACF lags : : : : : : : : : :

lag1 0.0060 -0.0006 -0.0248 -0.0299 -0.1584 -0.1927 -0.0821 -0.0932 -0.0031 -0.1593

lag2 0.0679 0.0529 -0.0024 0.0462 0.0782 0.0653 0.0446 -0.0502 0.0187 0.0457

lag3 0.0137 0.0189 0.0205 0.0233 0.0232 0.0367 0.0265 -0.0005 -0.0583 0.0208

lag4 0.0056 -0.0003 -0.0137 -0.0067 -0.085 -0.0737 -0.0565 0.0724 -0.0341 -0.0268

lag5 -0.0001 -0.0183 0.0023 0.017 0.0561 0.0599 0.0105 -0.0446 -0.0347 0.0558

As we can see in the table, every time series shows negative skewness, meaning a longer left tail than a normal distribution and an asymmetric shape. The kurtosis is high, so the data appear to follow a leptokurtic (fat-tailed) distribution, driven by extreme outliers. The series are similar in skewness and kurtosis, reflecting a similar structure of the data. The Ljung-Box (lbq) test reports the hypothesis decision h, the p-value, the statistic in terms of the chi-squared distribution and five lags taken as the h degrees of freedom. The test indicates no serial dependence in the DAX 30, CAC 40, FTSE 100, FTSE MIB and TOPIX indices: for these, the null hypothesis is not rejected, the p-value is higher than the significance level alpha of 0.05 and the statistic is below the critical value of 11.07 from the chi-squared table. Conversely, we reject the null hypothesis for the S&P 500, NASDAQ, HANG SENG, CHINA 50 and SPI 100 indices, where the results are statistically significant because the observed p-value is below the pre-specified significance level: the statistics exceed 11.07 and the first five lags of the ACF show autocorrelation. In these cases, additional assumptions are needed when building a model.
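The entries of table 2 for a single series could be reproduced with a sketch of this kind (hypothetical object names; skewness and kurtosis from the moments package):

```r
# Hedged sketch: descriptive statistics and Ljung-Box test for one series.
library(moments)
r <- log_returns$DAX30
c(mean = mean(r), std = sd(r), min = min(r), max = max(r),
  skewness = skewness(r), kurtosis = kurtosis(r))
Box.test(r, lag = 5, type = "Ljung-Box")    # H0: no autocorrelation up to lag 5
acf(r, lag.max = 5, plot = FALSE)           # sample autocorrelations up to lag 5
```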

3.4 Correlation

It has been frequently observed that the USA markets lead the other developed markets in Europe and Asia, and that at times the leader becomes the follower. Across markets, some assets' returns behave like other assets' returns, or move in completely opposite directions. The possible interdependence between two financial time series is usually studied through the correlation of their returns. Correlation, however, is only a symmetric measure of dependence between stochastic variables; to capture the degree of asymmetry in the dependence of one series on another, and hence to determine who leads and who follows, measures of causality would be needed. Here I calculated the correlation matrix among the indices: there is evidence of high correlation between the European indices, which also show a substantial correlation with the USA markets.

Table 3: Correlation Matrix

DAX30 CAC40 FTSE100 FTSEMIB SP500


DAX 30 1 0.924282 0.820164 0.837192 0.582559
CAC 40 0.924282 1 0.849198 0.861904 0.617962
FTSE 100 0.820164 0.849198 1 0.743503 0.588729
FTSE MIB 0.837192 0.861904 0.743503 1 0.566213
S&P 500 0.582559 0.617962 0.588729 0.566213 1
NASDAQ 0.537395 0.565518 0.534707 0.501671 0.918291
HANG SENG 0.448886 0.484345 0.481356 0.394928 0.296158
CHINA 50 0.25078 0.267338 0.27243 0.185092 0.208006
TOPIX 0.355731 0.38675 0.347864 0.31926 0.212698
SPI 100 0.387 0.423385 0.459289 0.358352 0.417533

NASDAQ HANGSENG CHINA50 TOPIX SPI100


DAX 30 0.537395 0.448886 0.25078 0.355731 0.387
CAC 40 0.565518 0.484345 0.267338 0.38675 0.423385
FTSE 100 0.534707 0.481356 0.27243 0.347864 0.459289
FTSE MIB 0.501671 0.394928 0.185092 0.31926 0.358352
S&P 500 0.918291 0.296158 0.208006 0.212698 0.417533
NASDAQ 1 0.305527 0.210748 0.171838 0.355453
HANG SENG 0.305527 1 0.596897 0.491856 0.470971
CHINA 50 0.210748 0.596897 1 0.290561 0.262706
TOPIX 0.171838 0.491856 0.290561 1 0.440931
SPI 100 0.355453 0.470971 0.262706 0.440931 1

As we can see, the correlation increases with the level of economic development and within geographic areas.

We can then look at the correlation between the indices and the VIX to verify that the volatility index is negatively correlated with the indices and to compare the differences among them.

Table 4: Correlation with the VIX

VIX

DAX 30 -0.4246994

CAC 40 -0.4430257

FTSE 100 -0.3792228

FTSE MIB -0.4219787

S&P 500 -0.6659271

NASDAQ -0.6443102

HANG SENG -0.2205619

CHINA 50 -0.1266067

TOPIX -0.1659127

SPI 100 -0.1865044

In table 4 every time series shows a negative correlation with the VIX. The correlation is strongest for the Standard & Poor's 500 and the NASDAQ, still relevant for the European indices and low for the others.
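Tables 3 and 4 correspond to standard sample correlations; a minimal sketch (hypothetical object names, assuming an aligned VIX series `vix_series`) is:

```r
# Hedged sketch: correlation matrix of the index returns (Table 3) and the
# correlation of each index with the VIX ('vix_series' is an assumed, aligned series).
round(cor(log_returns), 3)
sapply(log_returns, function(r) cor(r, vix_series))
```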

3.5 Squared Residuals

After the general descriptive statistics, I focus on heteroskedasticity; a useful graphical analysis is to inspect the squared residuals. First, I plot the squared time series to check whether they exhibit volatility clustering; then I evaluate the squared residuals with the autocorrelation function (ACF). The results for the 10 indices under examination are shown below.
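One row of figure 4 could be produced with a sketch along these lines (hypothetical object names):

```r
# Hedged sketch: squared daily returns and their ACF up to lag 30 for one index.
r <- log_returns$DAX30
par(mfrow = c(1, 2))
plot(r^2, type = "l", xlab = "Time", ylab = "Squared returns", main = "DAX30")
acf(r^2, lag.max = 30, main = "ACF")
par(mfrow = c(1, 1))
```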

Figure 4: Squared Residuals (squared daily returns over time, left, and their autocorrelation functions up to lag 30, right, for the DAX 30, CAC 40, FTSE 100, FTSE MIB, S&P 500, NASDAQ, HANG SENG, CHINA 50, TOPIX and SPI 100)

As the figure shows, the last period is affected by Covid-19: the series reacted with a sharp increase in volatility, creating a volatility cluster. The Covid period provides the clearest evidence for most indices, with the exception of the HANG SENG and the TOPIX. We also note that volatility increases around economic and political events such as Brexit and the USA elections. The HANG SENG behaves in the opposite way to the other indices, showing almost permanently high volatility with only brief spells of low volatility, which is again graphical evidence of volatility clustering; the TOPIX shows a similar pattern. The CHINA A50, instead, shows high volatility during the first part of the sample, while during the Covid crisis its volatility is lower than before. The FTSE MIB reached its highest volatility in the last period because of Covid-19. On the right-hand side, the correlograms show that in all cases the autocorrelations of the squared residuals decay towards zero, falling below the significance bounds at higher lags. This supports the modelling approach, because all the series display volatility clustering while the autocorrelation of the squared residuals dies out.

3.6 The Heteroskedasticity

Another way to assess heteroskedasticity empirically is Engle's ARCH test. The test assesses the null hypothesis that a series of residuals exhibits no conditional heteroskedasticity (no ARCH effects) against the alternative that an ARCH(p) model describes the series. The output h is the hypothesis decision: a value of 1 indicates rejection of the no-ARCH-effects null hypothesis in favour of the alternative, while h = 0 indicates failure to reject it. We used one lag. The remaining outputs are the p-value, used to judge the significance level, the test statistic, computed as a Lagrange multiplier statistic, and the critical value, determined by the number of lags.
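The thesis runs this test in MATLAB; a comparable Lagrange multiplier test is available in R through the FinTS package, as in the following hedged sketch (object names hypothetical):

```r
# Hedged sketch: Engle's ARCH LM test with one lag on one return series.
library(FinTS)
ArchTest(log_returns$DAX30, lags = 1)   # H0: no ARCH effects
```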

Table 5: Engle’s ARCH test

h (hypothesis) pValue stats cValue

DAX 30 1 0 13.1552 3.8415

CAC 40 1 0 22.6292 3.8415

FTSE 100 1 0 138.2454 3.8415

FTSE MIB 1 0 19.9905 3.8415

SP 500 1 0 295.8729 3.8415

NASDAQ 1 0 393.9357 3.8415

HANG SENG 1 0 122.7761 3.8415

CHINA 50 1 0 167.7209 3.8415

TOPIX 1 0 61.1238 3.8415

SPI 100 1 0 295.705 3.8415

We immediately note that every time series rejects the null hypothesis in favour of the alternative. This is additional evidence in favour of a model for the conditional variance. The test statistics far exceed the critical values and the p-values are essentially zero, further confirming the rejection of the null hypothesis. The analysis confirms that an ARCH(1) specification is preferable to ignoring ARCH effects altogether.

3.7 The empirical EWMA

We already explained the structure of the EWMA model. It is simply a restricted integrated GARCH (iGARCH) model, with the intercept ω constrained to zero and the smoothing parameter λ playing the role of the autoregressive parameter β in the GARCH equation. We can backtest every time series and evaluate, using the VaR, how many times the realized returns exceed the confidence intervals. The backtest measures the accuracy of the EWMA model, which can then be compared with the GARCH(1,1) and the GARCHVIX(1,1,1). The parameter λ takes a value between 0 and 1; we set λ = 0.94, as in the RiskMetrics paper. We evaluate the EWMA model at the 95% confidence level, so the model fits the series well when returns fall outside the confidence intervals about 5% of the time. The graphs were computed in R with the stats package. The black line is the daily log return series, the dashed lines are the confidence intervals, calculated with the VaR; the standard deviation is computed with the EWMA formula, using λ equal to 0.94 and a zero constant. The red points are the observations beyond the confidence intervals.
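A minimal sketch of the RiskMetrics-style recursion described here, and of the count of points beyond the ±2σt limits, could look as follows (hypothetical object names; it does not reproduce the exact plotting routine used for figure 5):

```r
# Hedged sketch: EWMA conditional variance with lambda = 0.94 and zero intercept,
# plus the number of returns beyond the +/- 2*sigma_t band.
ewma_var <- function(r, lambda = 0.94) {
  n  <- length(r)
  s2 <- numeric(n)
  s2[1] <- var(r)                       # initialise with the sample variance
  for (t in 2:n) s2[t] <- lambda * s2[t - 1] + (1 - lambda) * r[t - 1]^2
  s2
}
r  <- log_returns$DAX30
s2 <- ewma_var(r)
sum(abs(r) > 2 * sqrt(s2))              # points beyond the limits
```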

For each index the EWMA chart covers 1418 observations, with smoothing parameter 0.94 and control limits at 2*sigma; the centre (mean return), standard deviation and number of points beyond the limits are:

DAX 30: center 0.0002214613, StdDev 0.01148198, points beyond limits 107
CAC 40: center 0.0001301072, StdDev 0.01082107, points beyond limits 113
FTSE 100: center -4.685586e-05, StdDev 0.009267366, points beyond limits 108
FTSE MIB: center 5.543939e-05, StdDev 0.01317496, points beyond limits 97
S&P 500: center 0.0002784529, StdDev 0.009163892, points beyond limits 102
NASDAQ: center 0.0006105464, StdDev 0.01120885, points beyond limits 95
HANG SENG: center 0.0001429119, StdDev 0.01104393, points beyond limits 92
CHINA 50: center 0.000421779, StdDev 0.01547745, points beyond limits 104
TOPIX: center 0.0001687, StdDev 0.01047247, points beyond limits 108
SPI 100: center -2.176164e-05, StdDev 0.00935759, points beyond limits 90

Figure 5: EWMA model with the confidence intervals

As already explained, the EWMA model is a simple model because it uses only one parameter, λ, which J.P. Morgan suggests setting at 0.94 for daily returns. As the graphs show, many points fall beyond the confidence intervals: the DAX 30 alone shows 18 such points over the last year. Every chart also reports the unconditional standard deviation and the number of points beyond the limits. For the DAX 30, 107 points out of 1418 observations lie beyond the limits, meaning that the series exceeds the intervals about 7.5% of the time, so the model under-evaluates the risk. The other indices show the same pattern, so in this empirical analysis the EWMA model systematically underestimates volatility. The indices differ in the level and range of volatility and in the number of points beyond the limits, but overall they convey the same message about the inadequacy of the model; the China 50, for example, has the highest volatility with a standard deviation of about 1.5%. The main disadvantage of the EWMA model is that λ is constant and identical for all assets. This implies that it is not optimal for any individual asset in the sense in which the GARCH models discussed above are optimal: by construction the EWMA model gives inferior forecasts compared with GARCH models, even though the difference can be very small in many cases.

3.8 The structure of the GARCH VIX model

The majority of volatility forecast models in regular use belong to the GARCH family. Our model is an implementation of the GARCH model using the VIX as an exogenous variable: a conditional volatility model combined with the implied volatility derived from variance swaps, as explained above. The approach relies on the fact that the VIX approximates the 30-day variance swap rate on the S&P 500; the VIX can therefore be interpreted as the risk-neutral expectation of the integrated variance over the coming month, providing forward-looking information under the risk-neutral measure.22

GARCH model parameters are usually estimated by Maximum Likelihood Estimation (MLE) using the return series. The goal here is to use the information contained in the VIX index when estimating GARCH models and thereby improve their performance: the parameter estimates are found by maximizing the likelihood of the returns given the VIX. I therefore estimate the parameters ω, α, β and δ, which are needed to evaluate how well the model fits. First, I describe the MLE and estimate the parameters for both the GARCH and the GARCHVIX model; then I compare the two to evaluate which one fits better.

22
See, Kanniainen, Binghuan & Yang (2014).[21]

3.9 The Maximum Likelihood

This section introduces maximum likelihood estimation, which is required for non-linear models in place of regression methods such as ordinary least squares (OLS) that apply to linear models. Bollerslev and Wooldridge (1992)23 show that using the normal distribution in maximum likelihood estimation gives consistent parameter estimates when the sole aim is estimation of the conditional variance, even if the true density is non-normal. This estimator is known as the quasi-maximum likelihood (QML) estimator. However, QML is not efficient unless the true density actually is normal.

I estimate the parameters with maximum likelihood in the following way. Let $Y_n = (Y_1, \ldots, Y_n)'$ be a random sample and let $\theta = (\theta_1, \ldots, \theta_p)'$ be a vector of parameters. Let $f(Y_n|\theta)$ be the joint density of $Y_n$, which depends on the parameters. If $Y_n$ is an iid sample, then

$$f(Y_n|\theta) = \prod_{i=1}^{n} f(Y_i|\theta),$$

which, viewed as a function of $\theta$, is called the likelihood function. The Maximum Likelihood Estimator (MLE) is the value of $\theta$ that maximizes the likelihood function.

The parameter vector θ of the GARCHVIX model collects omega, alpha, beta and delta, each subject to its own constraint:

$$\theta = [\omega;\ \alpha;\ \beta;\ \delta]$$

23 See Bollerslev and Wooldridge (1992).[18]

Assume the errors $\varepsilon_t$ in a GARCH(p,q) VIX(s) model are standard normally distributed:

$$Y_t = \sigma_t \varepsilon_t, \qquad \varepsilon_t \sim N(0,1),$$

$$\sigma_t^2 = \omega + \sum_{i=1}^{p} \alpha_i Y_{t-i}^2 + \sum_{j=1}^{q} \beta_j \sigma_{t-j}^2 + \sum_{k=1}^{s} \delta_k VIX_{t-k}^2$$

where $\alpha_i$, $\beta_j$ and $\delta_k$ are the parameters of the GARCHVIX(p,q,s) model. The presence of lagged returns means that the density function for t = 1 is unknown, since we do not know $y_0$. Conditioning on the first observation, the Gaussian log-likelihood is

$$\log L(\theta) = -\frac{T-1}{2}\log(2\pi) - \frac{1}{2}\sum_{t=2}^{T}\log(\sigma_t^2) - \frac{1}{2}\sum_{t=2}^{T}\frac{Y_t^2}{\sigma_t^2}.$$

Substituting the expression for $\sigma_t^2$ into the log-likelihood gives

$$\log L(\theta) = -\frac{T-1}{2}\log(2\pi) - \frac{1}{2}\sum_{t=2}^{T}\log\Big(\omega + \sum_{i=1}^{p}\alpha_i Y_{t-i}^2 + \sum_{j=1}^{q}\beta_j \sigma_{t-j}^2 + \sum_{k=1}^{s}\delta_k VIX_{t-k}^2\Big) - \frac{1}{2}\sum_{t=2}^{T}\frac{Y_t^2}{\omega + \sum_{i=1}^{p}\alpha_i Y_{t-i}^2 + \sum_{j=1}^{q}\beta_j \sigma_{t-j}^2 + \sum_{k=1}^{s}\delta_k VIX_{t-k}^2}.$$
P P P

The output is the GARCHVIX model derived from a normal distribu-

tion using the MLE. The parameter estimates are obtained by numer-

ically maximizing the likelihood function with an algorithm called an

optimizer. This can lead to numerical problems that adversely affect

maximization or numerical instability. Anyway, I created the model in

MATLAB and I estimate the parameters with the function fmincon.
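The thesis performs this step in MATLAB; purely as an illustration, a minimal R sketch of the same quasi-maximum likelihood estimation (hypothetical object names, a simple box-constrained optimizer, and the VIX rescaled to daily variance units as an assumption of the sketch) could look as follows:

```r
# Hedged sketch of the GARCHVIX(1,1,1) quasi-maximum likelihood (the thesis
# uses MATLAB's fmincon). 'y' are demeaned daily log returns, 'vix' the VIX
# rescaled to daily variance units; both names are hypothetical.
garchvix_negloglik <- function(theta, y, vix) {
  omega <- theta[1]; alpha <- theta[2]; beta <- theta[3]; delta <- theta[4]
  n      <- length(y)
  sigma2 <- numeric(n)
  sigma2[1] <- var(y)                      # initialise with the sample variance
  for (t in 2:n) {
    sigma2[t] <- omega + alpha * y[t - 1]^2 +
                 beta  * sigma2[t - 1]     +
                 delta * vix[t - 1]^2
  }
  # Gaussian (quasi-)log-likelihood, conditioning on the first observation
  ll <- -0.5 * sum(log(2 * pi) + log(sigma2[2:n]) + y[2:n]^2 / sigma2[2:n])
  -ll                                      # optim() minimises
}

fit <- optim(par     = c(1e-5, 0.1, 0.8, 0.01),
             fn      = garchvix_negloglik, y = y, vix = vix,
             method  = "L-BFGS-B",
             lower   = c(1e-8, 0, 0, 0),   # positivity constraints only;
             upper   = c(1, 1, 1, 1),      # alpha + beta < 1 is checked ex post
             hessian = TRUE)

se     <- sqrt(diag(solve(fit$hessian)))   # inverse Hessian -> standard errors
tstats <- fit$par / se                     # t-statistics, as reported under Table 6
```

In this sketch the inverse of the numerical Hessian gives the parameter covariance, from which the standard errors and t-statistics are derived, mirroring the procedure described in the next section.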

3.10 Estimation

We can estimate the parameters by maximum likelihood in MATLAB. I coded the likelihood function of a normal distribution with the VIX added as an exogenous variable, and then imposed the constraints that guarantee a positive variance. The function fmincon returns the value of the maximized likelihood, the parameter estimates and the Hessian. Inverting the Hessian gives the variance of the estimates; taking the square root yields the standard error of every parameter, from which I compute the test statistic of every parameter. The model is built as a GARCH(1,1) VIX(1) and then compared with the GARCH(1,1). Therefore,

$$\log L(\theta) = -\frac{T-1}{2}\log(2\pi) - \frac{1}{2}\sum_{t}\log\left(\omega + \alpha_1 Y_{t-1}^2 + \beta_1 \sigma_{t-1}^2 + \delta_1 VIX_{t-1}^2\right) - \frac{1}{2}\sum_{t}\frac{Y_t^2}{\omega + \alpha_1 Y_{t-1}^2 + \beta_1 \sigma_{t-1}^2 + \delta_1 VIX_{t-1}^2},$$

where p, q and s are all equal to 1.

After coding the maximum likelihood function of the model in MATLAB, preparing the data and imposing the constraints, I estimated the parameters. The resulting model is a conditional variance at time t, estimated by MLE from the historical returns and the historical conditional variance, that is, a GARCH augmented with the historical values of the VIX. The parameters are estimated for every index and the t-statistic is reported below each estimate. When the t-statistic is below 1.96 in absolute value, the parameter is not statistically different from zero under a normal distribution at the 5% level.

The output of the MATLAB fmincon estimation is:

Table 6: Estimated parameters with the maximum likelihood

omega alpha beta delta


DAX 30 0.0001 0.2065 0.7915 0.0138
(19.43) (5.38) (8.63) (4.12)
CAC 40 0.0001 0.3108 0.6891 0.0157
(21.82) (82.42) (60.40) (11.77)
FTSE 100 0.0001 0.5343 0.4653 0.0033
(19.53) (10.46) (0.44) (3.74)
FTSE MIB 0.0001 0.1927 0.7988 0.0188
(20.93) (4.53) (1.62) (7.08)
S&P 500 0.0001 0.6086 0.3914 0.0007
(21.54) (7.42) (3.86) (0.92)
NASDAQ 0.0001 0.4799 0.519 0.0005
(20.77) (9.64) (2.06) (0.47)
HANG SENG 0.0001 0.1216 0.871 0.0117
(20.28) (7.75) (6.65) (6.24)
CHINA 50 0.0002 0.277 0.7201 0.0059
(22.11) (6.74) (0.25) (2.40)
TOPIX 0.0001 0.1915 0.8041 0.0176
(17.74) (5.00) (1.31) (7.81)
SPI 100 0.0001 0.424 0.576 0.0074
(17.37) (9.66) (2.03) (6.78)

Table 6 reports the estimated parameters for every index, with the test statistic below each parameter. A test statistic contains the information in the data that is relevant for deciding whether to reject the null hypothesis that the parameter is zero. For the S&P 500 and the NASDAQ the t-statistics on δ are well below 1.96, so at the 5% level the parameter δ is not statistically different from zero and is irrelevant in the model for these two indices. The correlation between these two indices is very high, so it is consistent that analysing one or the other gives the same result. For the other indices δ is statistically significant, indicating that the model fits better when the δ parameter is included. The same cannot be said of the β parameter, which for some indices is not statistically significant.

3.11 Diagnosing volatility models

There are several statistical methods to compare models. We can use standard tools such as the t-test to check whether the parameters are statistically different from zero. We start with the Likelihood Ratio (LR) test, which applies when one model nests inside another: here the unrestricted model is the GARCHVIX(1,1,1) and the restricted model is the GARCH(1,1). Twice the difference between the unrestricted and the restricted log-likelihood follows a chi-squared distribution, with degrees of freedom equal to the number of restrictions. In this case there is only one restriction, so the critical value is 3.84 with an α of 5%. I also computed the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), selecting the model with the lowest value of each criterion.
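As an illustration of how the entries of table 7 relate to the log-likelihoods (hedged R sketch; k denotes the number of estimated parameters and T = 1418 the sample size):

```r
# Hedged sketch: LR test and information criteria from the log-likelihoods
# (DAX 30 values from Table 7 shown as an example).
ll_garch    <- 4179.2                      # GARCH(1,1)
ll_garchvix <- 4208.9                      # GARCHVIX(1,1,1)
lr  <- 2 * (ll_garchvix - ll_garch)        # 59.4, compared with qchisq(0.95, 1) = 3.84
aic <- function(ll, k)    2 * k - 2 * ll   # lower is better
bic <- function(ll, k, T) k * log(T) - 2 * ll
```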

Table 7: Diagnostics table

MLE GARCH(1,1)   MLE GARCHVIX(1,1,1)   LR TEST   AIC GARCH   AIC GARCHVIX   BIC GARCH   BIC GARCHVIX

DAX 30 4179.2 4208.9 59.4 -8354.4 -8411.8 -8343.5 -8395.45

CAC 40 4214.7 4265 100.6 -8425.4 -8524 -8414.5 -8507.65

FTSE 100 4548.3 4556.6 16.6 -9092.6 -9107.2 -9081.7 -9090.85

FTSE MIB 3974.7 4016.2 83 -7945.4 -8026.4 -7934.5 -8010.05

SP 500 4571.9 4572.2 0.6 -9139.8 -9138.4 -9128.9 -9122.05

NASDAQ 4303.8 4303.9 0.2 -8603.6 -8601.8 -8592.7 -8585.45

HANG SENG 4259 4289 60 -8514 -8572 -8503.1 -8555.65

CHINA 50 3709.3 3713.3 8 -7414.6 -7420.6 -7403.7 -7404.25

TOPIX 4308.4 4366.4 116 -8612.8 -8726.8 -8601.9 -8710.45

SPI 100 4626.5 4674.4 95.8 -9249 -9342.8 -9238.1 -9326.45

Table 7 shows the LR test and the information criteria. The LR test confirms that the GARCHVIX model improves on the GARCH model for every index with two exceptions, the S&P 500 and the NASDAQ. This matches what was found above with the t-statistics from the maximum likelihood estimation, and the AIC and BIC lead to the same conclusion.
3.12 Backtesting and VaR

We can diagnose individual models by testing for parameter significance or analyzing residuals, but these methods often do not properly address the risk-forecasting properties of the models under consideration. We therefore consider an important tool used in risk management, the Value-at-Risk (VaR), a measure of the risk exposure associated with a particular portfolio of assets. The VaR of a portfolio is defined as the maximum loss occurring within a specified horizon and with a given probability. The validity of the VaR measure is then backtested by counting the number of exceptions. Backtesting is a procedure that can be used to compare the various risk models: it takes ex ante Value-at-Risk forecasts from a particular model and compares them with the ex post realized returns (the historical observations). Whenever losses exceed the VaR, a VaR violation is said to have occurred. Models that do not perform well during backtesting should have their assumptions and parameter estimates questioned, and backtesting also reveals whether a model overestimates or underestimates the conditional variance. We evaluate the violation ratios, i.e. the share of observations that fall beyond the 2σt limits, for the GARCH and the GARCHVIX model. Table 8 reports the percentages, which should be close to 5% for a model that is accurate on the series. The graphs then show the GARCH model on the left and the GARCHVIX model on the right.
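The violation ratios in table 8 correspond to a computation of this kind (hedged sketch, reusing the hypothetical return series `r` and fitted conditional variance `s2` from the recursions above):

```r
# Hedged sketch: violation ratio at the 2*sigma_t limits for one fitted model.
n          <- length(r)
violations <- abs(r[2:n]) > 2 * sqrt(s2[2:n])   # returns beyond the limits
mean(violations)                                # violation ratio, compared with 5%
```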

Table 8: Backtesting

EWMA GARCH(1,1) GARCHVIX(1,1,1)

DAX30 7.546% 5.924% 6.417%

CAC40 7.969% 5.501% 5.853%

FTSE100 7.616% 5.501% 5.853%

FTSEMIB 6.841% 4.654% 5.148%

SP500 7.193% 5.642% 5.501%

NASDAQ 6.700% 5.571% 5.642%

HANGSENG 6.488% 5.783% 5.571%

CHINA50 7.334% 5.501% 5.289%

TOPIX 7.616% 5.078% 5.571%

SPI100 6.347% 6.065% 5.994%

Figure 6: Conditional variance (GARCH model on the left, GARCHVIX model on the right, for each index)
Unfortunately, the GARCHVIX model is better only in some cases. Marking the best result for each series shows that there is no single winner: which model performs best depends on the time series. The results therefore suggest that including the VIX in the GARCH variance equation does not unequivocally increase the forecasting power of the GARCH model. As another way to evaluate market risk and the violation ratio, we can consider the Expected Shortfall, which is more sensitive to the shape of the tail of the loss distribution. Expected Shortfall (ES) is considered a more useful risk measure than VaR because it is a coherent, and moreover a spectral, measure of financial portfolio risk. It is calculated for a given quantile level q, and in our analysis we consider 5%.
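For reference, the historical Expected Shortfall at the 5% level can be sketched as follows (a generic illustration with hypothetical object names, not necessarily the exact computation behind table 9):

```r
# Hedged sketch: historical expected shortfall at the q = 5% level,
# i.e. the average return below the empirical 5% quantile.
q     <- 0.05
var_q <- quantile(r, probs = q)     # empirical 5% quantile of the returns
es_q  <- mean(r[r <= var_q])        # average loss in the 5% left tail
```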

Table 9: Expected Shortfall

EX SHORTFALL GARCH (1,1) GARCHVIX (1,1,1)


DAX30 5.078% 5.289%
CAC40 4.654% 5.078%
FTSE100 5.148% 4.937%
FTSEMIB 4.020% 4.372%
SP500 4.866% 4.795%
NASDAQ 4.795% 4.795%
HANGSENG 4.584% 4.937%
CHINA50 4.372% 4.372%
TOPIX 4.513% 5.007%
SPI100 5.007% 5.148%

The table shows that a careful evaluation is needed to determine the best model. In most cases the GARCHVIX model performs better than the GARCH model, keeping in mind that both sometimes under- or over-estimate the conditional variance; in some cases the two models give an identical evaluation, with no difference in the violation ratio.

4 Concluding remarks

In this thesis we developed a univariate modelling approach, suited to a rigorous analysis of different volatility methodologies and focused mainly on a GARCH model augmented with the VIX index. With an empirical analysis we examined the volatility clustering of the major global indices, looking for an appropriate model to explain the time series. Throughout this research the author deals with the dangers of volatility in financial markets by estimating the parameters needed to determine the conditional variance; the aim is to forecast potential increases in volatility and to measure portfolio risk, anticipating large swings in volatility and the associated uncertainty. In the first chapter we studied the VIX, how it is computed and how it has changed since the creation of the index. In particular, we dispelled the common assumption that the VIX is calculated with the Black-Scholes-Merton (BSM) model and explained the structure of the VIX in terms of variance swaps, confirming that the new methodology, based on forward-looking plain vanilla options on the Standard & Poor's 500 index, is more accurate than the previous one. The model we built, a GARCH with the VIX index added as an exogenous variable, helps to determine whether the volatility index improves the accuracy of the model. We compared our model not only with the GARCH model but also with the EWMA model, considering a 5% confidence level on the variance. In the first part of the empirical research
the author examined the major indices using descriptive statistics and a set of tests to characterize and prepare the time series. Having verified the presence of volatility clustering, and therefore that the constructed model is useful in practice, the author estimated the parameters with the maximum likelihood estimator, taking the model constraints into account. Once the parameters were estimated, the model was built and the conditional variance computed; the model was then compared with the GARCH and the EWMA, investigating the accuracy of the conditional variance in each case. Backtesting with the Value at Risk and the Expected Shortfall was used to determine which model is the most accurate. However, not all the assets considered share the same behaviour, so the models perform differently for every time series; applying the models, we observe these differences and cannot confirm that one model is always better than the other. Summarizing, both the GARCH(1,1) and the GARCHVIX(1,1,1) are useful and reasonably accurate for these time series, with accuracy that varies across series but stays close to the 5% confidence level, and the backtest based on the Expected Shortfall gives a consistent assessment of the estimated conditional variance and a good evaluation of the models. As George Box put it, "All models are wrong but some are useful": the model must be fitted to the time series with its volatility clustering in mind. Future research should analyse forecasts of the conditional variance using ARIMA models, consider different distributions such as the Student-t, or apply the approach to different assets and sample periods. Another extension could use a state-space model, in which state variables describe the system through a set of first-order differential or difference equations; the state-space form provides a flexible approach to time series analysis, especially for simplifying maximum likelihood estimation and handling missing values. This provides a good starting point for discussion and further research.

5 Bibliography

References

[1] Ruey S. Tsay (2010). Analysis of Financial Time Series (Third Edi-
tion). The University of Chicago Booth School of Business Chicago,
IL. 109-140

[2] Vega Ezpeleta. (2015). Modeling volatility for the Swedish stock
market. Uppsala University, Department of Statistics, Advisor: Lars
Forsberg. 4-20

[3] Andrea Pastore (2018). Statistical methods for Risk Analysis. Ca’
Foscari University of Venice. 20-25

[4] Ruppert D., Matteson D.(2015). Statistics and Data Analysis for Fi-
nancial Engineering, Second Edition 2015. 45-70. 307-451

[5] Tim Bollerslev (1987). A Conditionally Heteroskedastic Time Series Model for Speculative Prices and Rates of Return. The Review of Economics and Statistics, Aug. 1987, Vol. 69, No. 3, pp. 542-547

[6] Benoit Mandelbrot (1963). The Variation of Certain Speculative Prices. The Journal of Business, Vol. 36, No. 4 (Oct., 1963), pp. 394-419

[7] Hinich and Patterson(1985). Evidence of Nonlinearity in Daily


Stock Returns, Journal of Business and Economic Statistics 3 (1)
(1985), 69-77

[8] Rama Cont (2005). Volatility Clustering in Financial Markets: Em-


pirical Facts and Agent-Based Models. Centre de Mathematiques
appliquees, Ecole Polytechnique F-91128, France. 1-21

[9] Tim Bollerslev (1986). Generalized Autoregressive Conditional Het-


eroskedasticity. University of California at San Diego, La Jolla, CA
92093, USA. Institute of Economics, University of Aarhus, Denmark

[10] Kevin Sheppard (2020). Univariate Volatility Modeling. Oxford MFE 413-414

[11] Jón Daníelsson (2011). Financial Risk Forecasting. 18-32

[12] Jim Gatheral (2006). The Volatility Surface: A Practitioner’s Guide.
Wiley

[13] Robert F. Engle (1982). Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation. Econometrica, Vol. 50, No. 4, pp. 987-1007

[14] Vedat Akgiray (1989). Conditional Heteroscedasticity in Time Series


of Stock Returns: Evidence and Forecasts. The Journal of Business,
1989, vol. 62, issue 1, 55-80

[15] CBOE VIX (2019). White Paper. Cboe Volatility Index. Cboe Ex-
change Inc. 1-20

[16] John C. Hull. (2018). Options, Futures and other Derivatives. cap.15
cap. 26 ; Technical Note No. 22* Valuation of a Variance Swap

[17] Hao Zhou and Matthew Chesnes. (2003). Vix Index Becomes Model
Free and Based on S&P 500, Board of Governors of the Federal Re-
serve System, division of research and statistics

[18] Bollerslev, Wooldridge (1992). Quasi Maximum Likelihood Estima-


tion and Inference in Dynamics Models with Time Varying Covari-
ances. Econometric Reviews 11(2) 143-172

[19] Hao Zhou and Matthew Chesnes. (2003). Vix Index Becomes Model
Free and Based on S&P 500, Board of Governors of the Federal Re-
serve System, division of research and statistics

[20] Dimos S. Kambouroudis and David G. McMillan. (2016). Does VIX


or volume improve GARCH volatility forecasts?. Stirling Manage-
ment School, University of Stirling, UK

[21] Kanniainen, Binghuan & Yang (2014). Estimating and Using


GARCH Models with VIX Data for option valuation

[22] Reto R. Gallati, (2003). 15.433 Investments Class 2: Securities, Ran-


dom Walk on Wall Street. MIT Sloan School of Management

[23] J. M. Bland, D. G. Altman (1996). Measurement error, Department


of Public Health Sciences, St George’s Hospital Medical School,
London.

[24] McAleer (2007). Volatility Index (VIX) and S&P 100 Volatility Index
(VXO) School of Economics and Commerce University of Western
Australia and Faculty of Economics Chiang Mai University

List of Figures

1 Time series Indices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44


2 Logarithmic daily returns . . . . . . . . . . . . . . . . . . . . . . . . 46
3 Distribution analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4 Squared Residuals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5 EWMA model with the confidence intervals . . . . . . . . . . . 69
6 Conditional variance . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82

List of Tables

1 Jarque-Bera test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
2 Descriptive statistics and Ljung-box test . . . . . . . . . . . . . 53
3 Correlation Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4 Correlation with the VIX . . . . . . . . . . . . . . . . . . . . . . . . . 56
5 Engle’s ARCH test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
6 Estimated parameters with the maximum likelihood . . . . 76
7 Diagnostics table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
8 Backtesting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
9 Expected Shortfall . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83

A Appendix

A.1 Appendix: tests
Jarque-Bera test

In statistics the Jarque-Bera test is a test of whether the sample data have skewness and kurtosis matching a normal distribution. The test statistic is always non-negative. The skewness of a normal distribution is 0 and its kurtosis is 3; when the statistic is far from zero, it means that the data do not come from a normal distribution.
The test statistic JB is defined as:

$$JB = \frac{n}{6}\left(S^2 + \frac{1}{4}(K-3)^2\right) \qquad (50)$$

where n is the number of observations (or degrees of freedom in general), S is the sample skewness and K is the sample kurtosis. If the data come from a normal distribution, the Jarque-Bera statistic has asymptotically a chi-squared distribution with 2 degrees of freedom, so the statistic can be used to test the hypothesis that the data are from a normal distribution. The null hypothesis is a joint hypothesis of the skewness being zero and the excess kurtosis being zero. Samples from a normal distribution have an expected skewness of 0 and an expected excess kurtosis of 0 (which is the same as a kurtosis of 3). As the definition of JB shows, any deviation from this increases the JB statistic.
Ljung-Box test
The Ljung-Box test is a type of statistical test of whether any of a group of autocorrelations of a time series are different from zero. Instead of testing randomness at each distinct lag, it tests the "overall" randomness based on a number of lags, and is therefore a portmanteau test. The test statistic is:

$$Q = n(n+2)\sum_{k=1}^{h}\frac{\hat{\rho}_k^2}{n-k} \qquad (51)$$

where n is the sample size, $\hat{\rho}_k$ is the sample autocorrelation at lag k, and h is the number of lags being tested. Under H0 the statistic Q asymptotically follows a $\chi^2_h$ distribution. For significance level α, the critical region for rejection of the hypothesis of randomness is:

$$Q > \chi^2_{1-\alpha,h} \qquad (52)$$

where $\chi^2_{1-\alpha,h}$ is the (1-α)-quantile of the chi-squared distribution with h degrees of freedom.
