
ARCH MODELS AND CONDITIONAL VOLATILITY

A drawback of linear stationary models is their failure to account for changing volatility: the width of the forecast intervals remains constant even as new data become available, unless the parameters of the model are changed. To see this for h = 1, write the series as

    x_t = Σ_{k=0}^∞ a_k e_{t−k} ,   (a_0 = 1) ,

where {e_t} is independent white noise. The width of the forecast interval is proportional to the square root of the one-step forecast error variance, var[x_{n+1} − f_{n,1}] = var[e_{n+1}] = σ_e², a constant.

On the other hand, actual financial time series often show sudden bursts of high volatility. For example, if a recent innovation was strongly negative (indicating a crash, etc.), a period of high volatility will often follow. A clear example of this is provided by the daily returns for General Motors from Sept 1st to Nov 30th, 1987. The volatility increases markedly after the crash, and stays high for quite some time. Nevertheless, forecast intervals based on a given Gaussian ARMA model would have exactly the same width after the crash as before. This would destroy the validity of post-crash forecast intervals. It would have equally devastating consequences for any methodology which requires a good assessment of current volatility, such as the Black-Scholes method of options pricing. In the ARCH model, which we discuss here, the forecast intervals are able to widen immediately to account for sudden changes in volatility, without changing the parameters of the model. Because of this feature, ARCH (and other related) models have become a very important element in the analysis of economic time series. The acronym ARCH stands for AutoRegressive Conditional Heteroscedasticity. The term

"heteroscedasticity" refers to changing volatility (i.e., variance). But it is not the variance itself which changes with time according to an ARCH model; rather, it is the conditional variance which changes, in a specific way, depending on the available data. The conditional variance quantifies our uncertainty about the future observation, given everything we have seen so far. This is of more practical interest to the forecaster than the volatility of the series considered as a whole.

To provide a context for ARCH models, let's first consider the conditional aspects of the linear AR(1) model x_t = a x_{t−1} + e_t, where the {e_t} are independent with zero mean and equal variances. If we want to predict x_t from x_{t−1}, the best predictor is the conditional mean, E[x_t | x_{t−1}] = a x_{t−1}. The success of the AR(1) model for forecasting purposes arises from the fact that this conditional mean is allowed to depend on the available data, and to evolve with time. The conditional variance, however, is simply var[x_t | x_{t−1}] = var[e_t] = σ_e², which remains constant regardless of the given data. The novelty of the ARCH model is that it allows the conditional variance to depend on the data.

The concept of conditional probability (and therefore conditional mean and variance) plays a key role in the construction of forecast intervals. It could be argued that a reasonable definition of a 95% forecast interval is one for which the future observation has a 95% probability of falling in the interval, conditionally on the observed data. That is, of all possible realizations (paths) which coincide with the data available up to now, 95% of them should have a future value x_{n+h} which lies within the forecast interval. For the purposes of forecasting, we do not care too much about realizations which are not consistent with the available information. So we do not demand that x_{n+h} must lie within this particular prediction interval in 95% of all possible realizations. The ARMA forecast intervals we derived in an earlier handout have conditional coverage rate 1 − α, if the model is correct and assuming that the innovations are normally distributed. However, these intervals are not valid if the data are actually generated by an ARCH model.

Definition of ARCH Model

The ARCH(q) model for the series {ε_t} is defined by specifying the conditional distribution of ε_t, given the information available up to time t − 1. Let Ψ_{t−1} denote this information. It consists of the knowledge of all available values of the series, and anything which can be computed from these values, e.g., innovations, squared observations, etc. In principle, it may even include the knowledge of the values of other related time series, and anything else which might be useful for forecasting and is available by time t − 1.

We say that the process {ε_t} is ARCH(q) if the conditional distribution of ε_t given the available information Ψ_{t−1} is

    ε_t | Ψ_{t−1} ~ N(0, h_t) ,   (1)

    h_t = ω + Σ_{i=1}^q α_i ε²_{t−i} ,   (2)

with ω > 0, α_i ≥ 0 for all i, and Σ_{i=1}^q α_i < 1.

Equation (1) says that the conditional distribution of ε_t given Ψ_{t−1} is normal, N(0, h_t). In other words, given the available information Ψ_{t−1}, the next observation ε_t has a normal distribution with a (conditional) mean of E[ε_t | Ψ_{t−1}] = 0 and a (conditional) variance of var[ε_t | Ψ_{t−1}] = h_t. We can think of these as the mean and variance of ε_t computed over all paths which agree with Ψ_{t−1}. Equation (2) specifies the way in which the conditional variance h_t is determined by the available information. Note that h_t is defined in terms of squares of past innovations. This, together with the assumptions that ω > 0 and α_i ≥ 0, guarantees that h_t is positive, as it must be since it is a conditional variance.
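Equation (2) translates directly into code. The following is an illustrative Python sketch (the function name and argument layout are mine, not from the handout): given the constant ω, the coefficients α_1, …, α_q, and the recent innovations, it returns the conditional variance h_t.

```python
def arch_h(omega, alpha, eps_past):
    """Conditional variance of Equation (2): h_t = omega + sum_i alpha_i * eps_{t-i}^2.

    omega    -- the constant term (omega > 0)
    alpha    -- [alpha_1, ..., alpha_q], each alpha_i >= 0
    eps_past -- recent innovations, with eps_past[-1] = eps_{t-1}
    """
    q = len(alpha)
    return omega + sum(alpha[i] * eps_past[-(i + 1)] ** 2 for i in range(q))
```

For example, in an ARCH(1) model with ω = .25 and α_1 = .5, an innovation ε_{t−1} = 2 gives h_t = .25 + .5 · 4 = 2.25, so a single large shock immediately inflates the conditional variance.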

Properties of ARCH Models

Perhaps it is surprising that if, instead of restricting to paths which agree with the available information Ψ_{t−1}, we consider all possible paths, the {ε_t} are a zero-mean white noise process. That is, unconditionally, considering all possible paths, we have E[ε_t] = 0, var[ε_t] = ω/(1 − Σ_{i=1}^q α_i), a finite constant, and cov(ε_i, ε_j) = 0 if i ≠ j. So unconditionally, the process is stationary as long as Σ_{i=1}^q α_i < 1, which we assumed in the definition of the model. It is only the conditional volatility which changes with time, not the overall volatility.

In spite of its name, the ARCH model is not autoregressive. However, if we add ν_t = ε_t² − h_t to both sides of Equation (2), we get

    ε_t² = ω + Σ_{i=1}^q α_i ε²_{t−i} + ν_t .

It can be shown that {ν_t} is zero-mean white noise. Therefore, the squared process {ε_t²} is an autoregression [AR(q)] with nonzero mean, and AR parameters α_1, …, α_q.

The ARCH(q) model is nonlinear, since if the ε_t could be expressed as ε_t = Σ_{k=0}^∞ a_k e_{t−k}, then we would have var[ε_t | Ψ_{t−1}] = var[ε_t | e_{t−1}, e_{t−2}, …] = var[e_t], a constant. This contradicts Equations (1) and (2), so {ε_t} must not be a linear process.

The observations {ε_t} of an ARCH(q) model are non-Gaussian, since the model is nonlinear. The distribution of the {ε_t} tends to be more long-tailed than normal. Thus, outliers may occur relatively often. This is a useful feature of the model, since it reflects the leptokurtosis which is often observed in practice. Moreover, once an outlier does occur, it will increase the conditional volatility for some time to come (see Equation (2)). Once again, this reflects a pattern often found in real data.

It may seem odd that, while the conditional distribution of ε_t given Ψ_{t−1} is Gaussian, the unconditional distribution is not. Roughly speaking, the reason for this is that the unconditional distribution is an average of the conditional distributions for each possible path up to time t − 1. Although each of these conditional distributions is Gaussian, the variances h_t are unequal. So we get a mixture of normal distributions with unequal variances, which is not Gaussian.

Although they are uncorrelated, the {ε_t} are not independent. This is easy to see, since if {ε_t} were independent, then they would be a linear process; but we have already shown that the ARCH(q) process is in fact nonlinear. One might hope, then, that as in the case of threshold AR and bilinear models, it might be possible to find some nonlinear predictor of ε_t based on Ψ_{t−1}. But since E[ε_t | Ψ_{t−1}] = 0, we see that the {ε_t} are a martingale difference. Thus, the very best (linear or nonlinear) predictor of ε_t based on the available information is simply the trivial predictor, namely the series mean, 0. In terms of point forecasting of the series itself, then, the ARCH models offer no advantages over the linear ARMA models.

k =0

ak et k , then we

would have var [t t 1] = var [t et 1 , et 2 , . . . ] = var [et ], a constant. This contradicts Equations (1) and (2), so {t } must not be a linear process. The observations { t} of an ARCH(q) model are non Gaussian, since the model is nonlinear. The distribution of the {t } tends to be more long-tailed than normal. Thus, outliers may occur relatively often. This is a useful feature of the model, since it reects the leptokurtosis which is often observed in practice. Moreover, once an outlier does occur, it will increase the conditional volatility for some time to come. (See Equation (2).) Once again, this reects a pattern often found in real data. It may seem odd that, while the conditional distribution of t given t 1 is Gaussian, the unconditional distribution is not. Roughly speaking, the reason for this is that the unconditional distribution is an average of the conditional distributions for each possible path up to time t 1. Although each of these conditional distributions is Gaussian, the variances ht are unequal. So we get a mixture of normal distributions with unequal variances, which is not Gaussian. Although they are uncorrelated, the { t} are not independent. This is easy to see, since if {t } were independent, then they would be a linear process; but we have already shown that the ARCH (q ) process is in fact nonlinear. One might hope, then, that as in the case of threshold AR and bilinear models, it might be possible to nd some nonlinear predictor of t based on t 1. But since E [t t 1] = 0, we see that the { t } are a Martingale difference. Thus, the very best (linear or nonlinear) predictor of t based on the available information is simply the trivial predictor, namely the series mean, 0. In terms of point forecasting of the series itself, then, the ARCH models offer no advantages over the linear ARMA models. 
The advantage of the ARCH models lies in their ability to describe the time-varying stochastic conditional volatility, which can then be used to improve the reliability of interval forecasts and to help us in understanding the process. Although the series {ε_t} itself is not forecastable, the squared series {ε_t²} is forecastable: the best forecast of ε_t² is

    E[ε_t² | Ψ_{t−1}] = var[ε_t | Ψ_{t−1}] = h_t = ω + Σ_{i=1}^q α_i ε²_{t−i} .

Note that we would get the same forecast by using the AR(q) representation for the squared process {ε_t²} discussed earlier.

ARMA Models With ARCH Errors

Because it is a martingale difference and therefore unpredictable, the ARCH(q) model is not usually used by itself to describe a time series data set. Instead, we can model our data {x_t} as an ARMA(k, l) process whose innovations {ε_t} are ARCH(q). That is,

    x_t = Σ_{j=1}^k a_j x_{t−j} + Σ_{j=1}^l b_j ε_{t−j} + ε_t ,   (3)

where {ε_t} is ARCH(q). Because {ε_t} is white noise, the model (3) is in many respects just an ordinary ARMA model. In fact, it is just an ARMA model, since our original definition of ARMA models simply required that the innovations be white noise. So the ACF and PACF of the {x_t} given by model (3) are the same as usual; i.e., the ACF will cut off beyond lag l if k = 0, the PACF will cut off beyond lag k if l = 0, etc. In addition, the best linear forecast of x_{n+h} based on x_n, x_{n−1}, … will be no different from what we have studied earlier, once again because {ε_t} is a white noise process. In fact, because the {ε_t} are a martingale difference, it is not hard to show that the best possible predictor of x_{n+h} is simply the best linear predictor. So much of the theory and practice of Box-Jenkins modeling still applies to Model (3). Perhaps the most notable exception to this is forecast intervals. These were derived (e.g., in our earlier handout) under the additional assumption that the innovations are Gaussian. Thus, those intervals will not be valid for model (3), in which the innovations are non-Gaussian.
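Model (3) can be simulated by generating ARCH innovations and feeding them through the ARMA recursion. The following is a minimal sketch for the simplest case, an AR(1) driven by ARCH(1) errors (the function name, defaults, and seeding are mine, not from the handout):

```python
import math
import random

def simulate_ar1_arch1(n, a, omega, alpha1, seed=0):
    """Simulate x_t = a * x_{t-1} + eps_t, where {eps_t} is ARCH(1):
    h_t = omega + alpha1 * eps_{t-1}^2 and eps_t = e_t * sqrt(h_t),
    with {e_t} iid standard normal."""
    rng = random.Random(seed)
    eps_prev = rng.gauss(0.0, 1.0)   # start-up innovation
    x, x_prev = [], 0.0
    for _ in range(n):
        h = omega + alpha1 * eps_prev ** 2        # Equation (2) with q = 1
        eps = rng.gauss(0.0, 1.0) * math.sqrt(h)  # Equation (1), given h
        x_prev = a * x_prev + eps                 # ARMA recursion of model (3)
        x.append(x_prev)
        eps_prev = eps
    return x
```

A series generated this way has the usual AR(1) autocorrelation structure, but with volatility clustering in the innovations.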

One-Step Forecast Intervals for ARMA Models with ARCH Errors

Suppose {x_t} is ARMA(k, l) with ARCH(q) errors {ε_t}. Then

    x_{n+1} = Σ_{j=1}^k a_j x_{n+1−j} + Σ_{j=1}^l b_j ε_{n+1−j} + ε_{n+1} ,

where {ε_t} is ARCH(q). Assume that all parameter values and model orders are known. The best forecast of x_{n+1} based on Ψ_n is

    f_{n,1} = E[x_{n+1} | Ψ_n] = Σ_{j=1}^k a_j x_{n+1−j} + Σ_{j=1}^l b_j ε_{n+1−j} ,

since E[ε_{n+1} | Ψ_n] = 0. The corresponding one-step forecast error is x_{n+1} − f_{n,1} = ε_{n+1}. We would like to construct a forecast interval for x_{n+1} which has a conditional coverage rate of 1 − α, given the observed information Ψ_n. To do this, we use our knowledge about the properties of the forecast error, given Ψ_n. Since {ε_t} is an ARCH process, we know from Equation (1) that, conditionally on Ψ_n, the forecast error ε_{n+1} is distributed as N(0, h_{n+1}). Thus,

    1 − α = prob { −z_{α/2} √h_{n+1} < ε_{n+1} < z_{α/2} √h_{n+1} | Ψ_n } .

Adding f_{n,1} to all sides of the inequality above, we conclude that

    1 − α = prob { f_{n,1} − z_{α/2} √h_{n+1} < x_{n+1} < f_{n,1} + z_{α/2} √h_{n+1} | Ψ_n } .

Thus, we have just shown that a one-step forecast interval with a conditional coverage rate of 1 − α is given by f_{n,1} ± z_{α/2} √h_{n+1}. The interval can actually be constructed, since h_{n+1} depends only on the information available at time n. The width of the one-step forecast intervals is 2 z_{α/2} √h_{n+1}. This width will increase immediately in the event of a sudden large fluctuation in the series (see Equation (2)).
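Computing the interval f_{n,1} ± z_{α/2} √h_{n+1} is a one-liner once f_{n,1} and h_{n+1} are in hand. A minimal Python sketch (the function name is mine; the default z = 1.96 corresponds to a 95% interval):

```python
import math

def one_step_interval(f_n1, h_next, z=1.96):
    """One-step forecast interval f_{n,1} +/- z_{alpha/2} * sqrt(h_{n+1}).

    f_n1   -- best forecast of x_{n+1} given the information at time n
    h_next -- conditional variance h_{n+1}, computed from Equation (2)
    z      -- normal quantile z_{alpha/2}; 1.96 gives conditional coverage 0.95
    """
    half_width = z * math.sqrt(h_next)
    return (f_n1 - half_width, f_n1 + half_width)
```

Because h_{n+1} contains the most recent squared innovations, the half-width z √h_{n+1} grows as soon as a large shock is observed, which is exactly the behavior the Gaussian ARMA intervals lack.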

Simulation of ARCH Models

The definition of an ARCH model (Equations (1) and (2)) is stated in terms of the conditional distribution of ε_t given Ψ_{t−1}. This does not tell us explicitly how to generate an ARCH process. Here is one way to actually construct an ARCH process in terms of independent standard normals, {e_t}: First, set ε_t = e_t for t = −q+1, …, 0. These are just "start-up" values which allow us to compute h_1, …, h_q. Next, for t = 1, …, n, simply set ε_t = e_t √h_t, where h_t is given by Equation (2). Then for t = 1 to n, if we condition on Ψ_{t−1}, h_t can be treated as a constant, so that ε_t has a normal distribution with mean 0 and variance h_t. Thus, {ε_t}_{t=1}^n is indeed ARCH(q).

Using this method, I simulated an ARCH(1) data set with n = 100, ω = .25, α_1 = .5, and an ARCH(2) data set with n = 100, ω = .25, α_1 = .6, α_2 = .35. For the ARCH(1) data, the volatility increases strongly in the second half of the series. Integrating this series (i.e., taking partial sums) gives a "random walk" with ARCH errors. The integrated series has a "crash" around t = 50. For the ARCH(2) data, there are two positive outliers at t = 61 and t = 63, and two more at t = 80 and t = 81. These cause sudden strong upswings in the integrated series.
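The construction above can be sketched in a few lines of Python, using only the standard library (the function name and the seed argument are mine; the standard-normal start-up values follow the recipe in the text):

```python
import math
import random

def simulate_arch(n, omega, alpha, seed=0):
    """Simulate n values of an ARCH(q) process by the construction in the text.

    Start-up: eps_t = e_t for t = -q+1, ..., 0, with {e_t} iid N(0, 1).
    Then for t = 1, ..., n: h_t = omega + sum_i alpha_i * eps_{t-i}^2
    and eps_t = e_t * sqrt(h_t).
    """
    rng = random.Random(seed)
    q = len(alpha)
    eps = [rng.gauss(0.0, 1.0) for _ in range(q)]   # start-up values
    for _ in range(n):
        h = omega + sum(alpha[i] * eps[-(i + 1)] ** 2 for i in range(q))
        eps.append(rng.gauss(0.0, 1.0) * math.sqrt(h))
    return eps[q:]   # discard the start-up values
```

For instance, simulate_arch(100, 0.25, [0.5]) mimics the ARCH(1) experiment described above; taking partial sums of the output gives the "random walk with ARCH errors", and for long series the sample variance should settle near the unconditional variance ω/(1 − α_1) = 0.5.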

Graphical Methods For Selecting q in ARCH(q) Models

Because the ARCH(q) process {ε_t} is white noise, we would typically expect to find that the ACF and PACF of the raw data are not significant at most lags. Thus, they are not useful for selecting a value of the unknown order q of an ARCH process. Examination of the ACF and PACF of the squared series {ε_t²} will often be more fruitful. Since, as we have seen, the squared series obeys an AR(q) model, we would hope to find that the PACF of the squared series cuts off beyond lag q, while the ACF of the squared series "dies down".

For the simulated ARCH(1) series, the ACF and PACF of the raw data are nonsignificant at all lags, the ACF of the squares dies down, and the PACF of the squares is highly significant at lag 1 and just barely significant at lag 2. This suggests an ARCH(1), or perhaps an ARCH(2). For the simulated ARCH(2) series, the ACF of the raw data is almost significant at lag 1, barely significant at lag 4, but nonsignificant at all other low lags. As we will explain, it is sensible to ignore the (marginally) significant lags of the ACF of the raw data and declare the series a white noise, though perhaps not a strict white noise. The PACF of the squares exhibits a clear cutoff beyond lag 1. Thus, using the graphical method we would (incorrectly) identify the process as ARCH(1).

Since any ARCH process is also a white noise process, its true ACF and PACF must be zero at all lags. Of course, the sample ACF and PACF will differ from zero, due to sampling variation. We have been assuming throughout the course that for any white noise process, the sample ACF and PACF have a standard error of 1/√n. Actually, this is not necessarily true unless we have strict (independent) white noise. For ARCH(1) processes, it can be shown that the variance of the sample ACF at lag k is approximately

    var[r_k] ≈ (1/n) [1 + 2 α_1^k / (1 − 3 α_1²)] .

Thus, the variance of r_k will be greater than 1/n in this case. For example, if α_1 = .4, then the variance of r_1 is approximately 2.5/n, compared to 1/n if no ARCH effects are present. So if our data were generated by this ARCH(1) model, we would find that r_k is significantly different from zero much more than 5% of the time, even though the series is white noise. It is clear, then, that using the ACF and PACF of the raw data to decide whether the series is white noise remains more of an art than a science. On the other hand, using the ACF and PACF of the squares to decide whether the raw series is independent is a justifiable procedure. This is so because if {ε_t} is strict white noise and n is large, the sample ACF and PACF values for the squared series are (essentially) normally distributed with zero mean and variance 1/n. Thus any significant lags of the ACF or PACF of the squares can be taken as evidence that the raw series is not independent. Instead of using graphical methods, which may be problematic, it is possible to use AICC to select q automatically for the ARCH(q) model, but this requires that we estimate the ARCH parameters themselves.
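Both diagnostics above are easy to compute by hand. The sketch below (pure Python, illustrative names) gives a sample ACF, intended to be applied to the squared series {ε_t²}, together with the variance-inflation factor implied by the ARCH(1) approximation var[r_k] ≈ (1/n)[1 + 2 α_1^k/(1 − 3 α_1²)], so the "approximately 2.5/n" figure can be checked.

```python
def sample_acf(x, max_lag):
    """Sample autocorrelations r_1, ..., r_{max_lag}; apply to the squared
    series to look for the AR(q) signature of an ARCH(q) process."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x) / n
    return [sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / (n * c0)
            for k in range(1, max_lag + 1)]

def acf_var_factor(alpha1, k=1):
    """Factor by which var[r_k] exceeds 1/n for a raw ARCH(1) series,
    using var[r_k] ~ (1/n) * (1 + 2*alpha1**k / (1 - 3*alpha1**2)).
    Equals 1 when no ARCH effects are present (alpha1 = 0)."""
    return 1.0 + 2.0 * alpha1 ** k / (1.0 - 3.0 * alpha1 ** 2)
```

Here acf_var_factor(0.4) is about 2.54, matching the 2.5/n figure quoted in the text for α_1 = .4.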
