## Tuesday, June 30, 2015

### Tsay Ch3 - Conditional Heteroscedastic Models

#### Ch3 : Conditional Heteroscedastic Models

This chapter is concerned with modeling the volatility of an asset return. Volatility modeling, apart from being useful in option pricing, provides a simple approach to calculating VaR in risk management. It also plays an important role in asset allocation under the mean-variance framework, and it can improve the efficiency of parameter estimation and the accuracy of interval forecasts. Implied volatility tends to be larger than that obtained from a GARCH-type volatility model. Some characteristics of volatility are:
1) there exist volatility clusters
2) volatility jumps are rare
3) volatility does not diverge to infinity; statistically, this implies that volatility is generally stationary
4) leverage effect - volatility reacts differently to big price increases vs big price drops

Returns are generally serially uncorrelated (or show only minor low-order serial correlation), but they are not independent: their squares are serially correlated. The general mean equation is $r_t=\mu_t+a_t$, where
$$\mu_t=\phi_0+\sum_{i=1}^k \beta_i x_{it}+\sum_{i=1}^p \phi_i r_{t-i}-\sum_{i=1}^q \theta_i a_{t-i}$$
where $k$, $p$ and $q$ are non-negative integers, and $x_{it}$ are explanatory variables (e.g. exogenous variables or dummy variables). The conditional variance (or volatility) is given by
$$\sigma_t^2=Var(r_t|F_{t-1})=Var(a_t|F_{t-1})$$
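The model for the volatility (called the volatility equation) can be of two types: one uses an exact function to govern the evolution of $\sigma_t^2$ (e.g. ARCH and GARCH), and the other uses a stochastic equation (e.g. stochastic volatility).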

The first step is always to specify the mean equation (e.g. an ARMA model) and to check its residuals for ARCH effects. If ARCH effects are present, we specify a volatility model and then jointly estimate the mean and volatility equations. Let $a_t=r_t-\mu_t$ be the residuals of the mean equation; in the simplest case $\mu_t$ is just the sample mean of the series. The squared series $a_t^2$ is then used to check for conditional heteroscedasticity, i.e. ARCH effects. Two tests are available:
1) Ljung-Box test on the $a_t^2$ series: $H_0:$ the first $m$ lags of the ACF of the $a_t^2$ series are zero.
2) Lagrange multiplier test of Engle: equivalent to the usual F-statistic for testing $H_0:\alpha_1=\dots=\alpha_m=0$ in the linear regression $a_t^2=\alpha_0+\alpha_1 a^2_{t-1}+\dots+\alpha_m a^2_{t-m}+e_t$, for $t=m+1,\dots,T$.
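
The first test can be sketched in a few lines of plain Python. This is only an illustration of the Ljung-Box statistic $Q(m)=T(T+2)\sum_{k=1}^m \hat{\rho}_k^2/(T-k)$ applied to $a_t^2$; the function names (`sample_acf`, `ljung_box_q`, `arch_effect_q`) are made up for these notes and do not come from the book or from any library.

```python
def sample_acf(x, max_lag):
    """Sample autocorrelations rho_1..rho_max_lag of the sequence x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(1, max_lag + 1)]


def ljung_box_q(x, m):
    """Ljung-Box statistic Q(m) = T(T+2) * sum_{k=1}^m rho_k^2 / (T-k)."""
    n = len(x)
    rho = sample_acf(x, m)
    return n * (n + 2) * sum(r * r / (n - k) for k, r in enumerate(rho, start=1))


def arch_effect_q(residuals, m):
    """Apply Q(m) to the squared residuals a_t^2 of the mean equation."""
    return ljung_box_q([a * a for a in residuals], m)
```

Under $H_0$ the statistic is asymptotically chi-squared with $m$ degrees of freedom, so for $m=5$ a value of `arch_effect_q(residuals, 5)` above roughly 11.07 would reject at the 5% level.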

ARCH model: The shock $a_t$ is serially uncorrelated but dependent, and the dependence is described by a quadratic function of its lagged values: $a_t=\sigma_t\epsilon_t$, where
$$\sigma_t^2=\alpha_0+\alpha_1 a^2_{t-1}+...+\alpha_m a^2_{t-m}$$
where $\{\epsilon_t\}$ is a sequence of iid random variables with mean zero and variance 1, $\alpha_0>0$ and $\alpha_i\ge 0$ for $i>0$. $\epsilon_t$ can be modeled using a standard normal, standardized Student-t or generalized error distribution. Using a heavy-tailed distribution for $\epsilon_t$ will reduce the ARCH effect. Because large past squared shocks imply a large conditional variance for $a_t$, the model captures volatility clustering.

Properties of ARCH(1): $E[a_t]=E[E[a_t|F_{t-1}]]=E[\sigma_t E[\epsilon_t]]=0$. Also, $Var(a_t)=\alpha_0/(1-\alpha_1)$, which requires $0\le\alpha_1<1$. Requiring higher-order moments to be finite imposes further conditions on $\alpha_1$. For example, with normal $\epsilon_t$ the unconditional kurtosis is
$$\frac{E[a^4_t]}{[Var(a_t)]^2}=3\frac{1-\alpha_1^2}{1-3\alpha_1^2}>3$$
Thus, the ARCH(1) model admits heavier tails than a normal distribution. For the kurtosis to be finite and positive we need $0\le\alpha^2_1<1/3$. The model assumes that positive and negative shocks have the same effect on volatility, because it depends only on the squares of past shocks. ARCH models are also likely to over-predict volatility because they respond slowly to large isolated shocks.
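
A quick simulation can be used to sanity-check the two moment formulas above. The sketch below assumes Gaussian $\epsilon_t$ and uses only the standard library; the function names, the burn-in length and the parameter values in the usage note are illustrative choices, not anything prescribed by the text.

```python
import random


def arch1_unconditional_variance(alpha0, alpha1):
    """Var(a_t) = alpha0 / (1 - alpha1), valid for 0 <= alpha1 < 1."""
    if not 0 <= alpha1 < 1:
        raise ValueError("need 0 <= alpha1 < 1 for a finite variance")
    return alpha0 / (1 - alpha1)


def arch1_kurtosis(alpha1):
    """Unconditional kurtosis 3(1 - alpha1^2)/(1 - 3 alpha1^2), Gaussian eps."""
    if not 0 <= alpha1 ** 2 < 1 / 3:
        raise ValueError("need alpha1^2 < 1/3 for a finite fourth moment")
    return 3 * (1 - alpha1 ** 2) / (1 - 3 * alpha1 ** 2)


def simulate_arch1(n, alpha0, alpha1, seed=0, burn_in=500):
    """Draw n shocks a_t = sigma_t * eps_t, sigma_t^2 = alpha0 + alpha1 * a_{t-1}^2."""
    rng = random.Random(seed)
    a_prev = 0.0  # arbitrary start; its effect is discarded with the burn-in
    shocks = []
    for t in range(n + burn_in):
        sigma2 = alpha0 + alpha1 * a_prev ** 2
        a_prev = sigma2 ** 0.5 * rng.gauss(0.0, 1.0)
        if t >= burn_in:
            shocks.append(a_prev)
    return shocks


def sample_variance_and_kurtosis(x):
    """Return the sample variance and the sample kurtosis m4 / m2^2."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m2, m4 / m2 ** 2
```

For example, with $\alpha_0=0.5$ and $\alpha_1=0.2$ the formulas give a variance of 0.625 and a kurtosis of $36/11\approx 3.27$; the sample values from `sample_variance_and_kurtosis(simulate_arch1(100000, 0.5, 0.2))` should land near those numbers, with the kurtosis estimate being the noisier of the two.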

Order determination: the PACF of $a^2_t$ can be used to select the ARCH order, unless the sample size is small. Estimation is done by (conditional) maximum likelihood. After fitting, the standardized residuals $a_t/\sigma_t$ should be checked for being iid using Ljung-Box statistics, skewness, kurtosis and a QQ-plot. Forecasts are obtained recursively, as for an AR model.
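
To make the AR analogy concrete, here is a minimal sketch of the recursive forecast for ARCH(1): the 1-step forecast is $\sigma^2_h(1)=\alpha_0+\alpha_1 a_h^2$, and for longer horizons the unknown future $a^2$ is replaced by its own forecast. The function name `arch1_forecasts` is hypothetical and the parameters are assumed to be already estimated.

```python
def arch1_forecasts(alpha0, alpha1, a_h, horizon):
    """Return [sigma2_h(1), ..., sigma2_h(horizon)] for an ARCH(1) model."""
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    forecasts = [alpha0 + alpha1 * a_h ** 2]  # 1-step: a_h is known at time h
    for _ in range(horizon - 1):
        # For l > 1, E[a_{h+l-1}^2 | F_h] equals the previous forecast.
        forecasts.append(alpha0 + alpha1 * forecasts[-1])
    return forecasts
```

With `arch1_forecasts(1.0, 0.5, 2.0, 3)` the forecasts are 3.0, 2.5 and 2.25, decaying toward the unconditional variance $\alpha_0/(1-\alpha_1)=2$ in the same way AR forecasts revert to the mean.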

GARCH model: An ARCH model often needs many parameters to describe the volatility process adequately; the GARCH model reduces the number of parameters. A GARCH(m,s) model is given by $a_t=\sigma_t \epsilon_t$ and
$$\sigma^2_t = \alpha_0 + \sum_{i=1}^m \alpha_i a^2_{t-i} + \sum_{j=1}^s \beta_j \sigma^2_{t-j}$$
where again $\{\epsilon_t\}$ is an iid sequence with mean 0 and variance 1, $\alpha_0>0$, $\alpha_i \ge 0$, $\beta_j \ge 0$, and $\sum_{i=1}^{\max(m,s)}(\alpha_i+\beta_i)<1$, with the understanding that $\alpha_i=0$ for $i>m$ and $\beta_j=0$ for $j>s$. $\alpha_i$ and $\beta_j$ are referred to as ARCH and GARCH parameters, respectively. The model shows properties similar to those of ARCH, and $a_t^2$ has an ARMA-like representation.
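
For the common GARCH(1,1) case, the volatility equation can be run as a simple filter over a series of shocks. The sketch below is an illustration only: the function names are invented for these notes, and starting the recursion at the unconditional variance $\alpha_0/(1-\alpha_1-\beta_1)$ is an assumption (other initializations are possible).

```python
def garch11_is_stationary(alpha0, alpha1, beta1):
    """Check alpha0 > 0, alpha1 >= 0, beta1 >= 0 and alpha1 + beta1 < 1."""
    return alpha0 > 0 and alpha1 >= 0 and beta1 >= 0 and alpha1 + beta1 < 1


def garch11_unconditional_variance(alpha0, alpha1, beta1):
    """Var(a_t) = alpha0 / (1 - alpha1 - beta1) under the constraints above."""
    if not garch11_is_stationary(alpha0, alpha1, beta1):
        raise ValueError("parameters violate the GARCH(1,1) constraints")
    return alpha0 / (1 - alpha1 - beta1)


def garch11_filter(shocks, alpha0, alpha1, beta1, sigma2_init=None):
    """Return the conditional variances sigma_t^2, one per shock.

    The first value is sigma2_init (assumed: the unconditional variance
    when not given); afterwards
    sigma_t^2 = alpha0 + alpha1 * a_{t-1}^2 + beta1 * sigma_{t-1}^2.
    """
    if sigma2_init is None:
        sigma2_init = garch11_unconditional_variance(alpha0, alpha1, beta1)
    sigma2 = [sigma2_init]
    for a_prev in shocks[:-1]:
        sigma2.append(alpha0 + alpha1 * a_prev ** 2 + beta1 * sigma2[-1])
    return sigma2
```

Dividing each shock by the square root of the matching filtered variance gives the standardized residuals $a_t/\sigma_t$ that are used for model checking.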

The order of a GARCH model is difficult to specify, so only lower-order models such as GARCH(1,1) are used in practice. Estimation is by maximum likelihood, and the fitted model is checked using the standardized residuals $a_t/\sigma_t$ and their squares. When evaluating forecasts, $a^2_{h+1}$ is a consistent estimate of $\sigma^2_{h+1}$ but not an accurate one, because a single observation of a random variable with a known mean value cannot provide an accurate estimate of its variance.

IGARCH model: a unit-root GARCH model, i.e. the coefficients satisfy $\sum_i(\alpha_i+\beta_i)=1$ (for IGARCH(1,1), $\alpha_1+\beta_1=1$). The unconditional variance of $a_t$, and hence that of $r_t$, is not defined under the IGARCH model. Theoretically, this might be caused by occasional level shifts in volatility. For the case $\alpha_0=0$, the IGARCH(1,1) model reduces to exponential smoothing of the $a_t^2$ series.
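
To see the exponential smoothing explicitly, take IGARCH(1,1) with $\alpha_0=0$ and write $\alpha_1=1-\beta_1$ with $0<\beta_1<1$. Repeated substitution of the volatility equation into itself gives
$$\sigma_t^2=(1-\beta_1)a_{t-1}^2+\beta_1\sigma_{t-1}^2=(1-\beta_1)\left[a_{t-1}^2+\beta_1 a_{t-2}^2+\beta_1^2 a_{t-3}^2+\dots\right]=(1-\beta_1)\sum_{j=1}^{\infty}\beta_1^{j-1}a_{t-j}^2$$
so $\sigma_t^2$ is a weighted average of past squared shocks with geometrically decaying weights that sum to one, and $\beta_1$ plays the role of the discounting factor.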

Stochastic volatility model: SV models add an innovation to the conditional variance equation of $a_t$. To ensure positiveness of the conditional variance, they model $\ln(\sigma^2_t)$ instead of $\sigma^2_t$. A first-order SV model is defined as $a_t=\sigma_t \epsilon_t$ with
$$\ln(\sigma^2_t)=\alpha_0+\alpha_1 \ln(\sigma^2_{t-1})+\nu_t$$
where the $\epsilon_t$ are iid $N(0,1)$, the $\nu_t$ are iid $N(0,\sigma^2_{\nu})$, $\{\epsilon_t\}$ and $\{\nu_t\}$ are independent, $\alpha_0$ is a constant and all zeros of the characteristic polynomial $1-\alpha_1 B$ are greater than 1 in modulus, i.e. $|\alpha_1|<1$.
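
The difference from GARCH is easiest to see by simulating the model: the log-variance gets its own random innovation $\nu_t$ at every step. The following is a minimal sketch of the first-order case only; the function name `simulate_sv` and the choice to start the recursion at the stationary mean $\alpha_0/(1-\alpha_1)$ are assumptions made for illustration.

```python
import math
import random


def simulate_sv(n, alpha0, alpha1, sigma_nu, seed=0):
    """Simulate n steps of a_t = sigma_t * eps_t with
    ln(sigma_t^2) = alpha0 + alpha1 * ln(sigma_{t-1}^2) + nu_t.

    Returns (shocks, log_variances).
    """
    if abs(alpha1) >= 1:
        raise ValueError("need |alpha1| < 1 for a stationary log-variance")
    rng = random.Random(seed)
    log_var = alpha0 / (1 - alpha1)  # assumed start: stationary mean of ln(sigma^2)
    shocks, log_vars = [], []
    for _ in range(n):
        # nu_t and eps_t are separate, independent Gaussian draws.
        log_var = alpha0 + alpha1 * log_var + rng.gauss(0.0, sigma_nu)
        sigma = math.exp(0.5 * log_var)  # always positive by construction
        shocks.append(sigma * rng.gauss(0.0, 1.0))
        log_vars.append(log_var)
    return shocks, log_vars
```

Setting `sigma_nu=0` switches off the volatility innovation, in which case the log-variance simply stays at its stationary mean.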

Quasi-likelihood methods via Kalman filtering or MCMC methods are used for estimation. Limited experience shows that SV models often provide improvements in model fitting, but their contributions to out-of-sample volatility forecasts have received mixed results.

Alternative approaches: Two approaches
1) High-Frequency Data - Use high-frequency data to calculate the volatility of low-frequency returns, e.g. daily returns for monthly volatility. The model for daily returns is unknown, which complicates the monthly estimate. Let $r_i$, $i=1,\dots,n$, be the daily log returns within a month whose monthly log return is $r^m$, so that $r^m=\sum_{i=1}^n r_i$. If $\{r_i\}$ is white noise, $Var(r^m)=n\,Var(r_i)$. If it is an MA(1) process, $Var(r^m)=n\,Var(r_i)+2(n-1)Cov(r_i,r_{i+1})$. Further, the sample size is only about 21 trading days per month, making the accuracy of the estimates questionable. If the kurtosis and serial correlations of the daily returns are high, the sample estimates may not even be consistent.
The same idea can be used to calculate daily volatility from intraday log returns. The realized volatility is the sum of squared intraday returns ($\sum r^2$), and its logarithm approximately follows a Gaussian ARIMA(0,1,q) model, which can be used to produce forecasts. Intuitively one would use the smallest possible interval, but anything shorter than about 15 minutes brings in market microstructure noise. Overnight returns also need to be taken into account for a correct daily volatility estimate.
2) Daily OHLC Prices - Shown to improve the volatility estimate. Let $C_t$, $O_t$, $H_t$, and $L_t$ be the close, open, high and low prices for day $t$. Also let $f$ be the fraction of the day that trading is closed. The conventional volatility estimate is $\sigma^2_t=E[(C_t-C_{t-1})^2|F_{t-1}]$. Assuming the price follows a simple diffusion model without drift, various estimates of volatility have been proposed, one being $0.5(H_t-L_t)^2-[2\ln(2)-1](C_t-O_t)^2$. Yang and Zhang (2000) also proposed an estimate along the lines of
$$\hat{\sigma}^2_{yz}=\hat{\sigma}^2_o+k\hat{\sigma}^2_c+(1-k)\hat{\sigma}^2_{rs},$$
where
$$\hat{\sigma}^2_o = \frac{1}{n-1}\sum^n_{t=1}(o_t-\bar{o})^2 \qquad o_t=ln(O_t)-ln(C_{t-1})$$
$$\hat{\sigma}^2_c = \frac{1}{n-1}\sum^n_{t=1}(c_t-\bar{c})^2 \qquad c_t=ln(C_t)-ln(O_t)$$
$$\hat{\sigma}^2_{rs} = \frac{1}{n}\sum^n_{t=1}[u_t(u_t-c_t)+d_t(d_t-c_t)] \qquad u_t=\ln(H_t/O_t) \qquad d_t=\ln(L_t/O_t)$$
$$k=\frac{0.34}{1.34+(n+1)/(n-1)}$$
The quantity $k$ is chosen to minimize the variance of the estimator $\hat{\sigma}^2_{yz}$.
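
Both alternative approaches reduce to short computations. The sketch below covers the aggregation formula for monthly variance from daily returns and the Yang-Zhang estimate from OHLC prices. It is an illustration under stated assumptions: the function names are invented here, and the price lists are assumed to hold days $0,\dots,n$, with day 0 used only to supply the previous close $C_{t-1}$.

```python
import math


def monthly_variance(daily_var, n, lag1_cov=0.0):
    """Var(r^m) = n Var(r_i) + 2(n-1) Cov(r_i, r_{i+1}).

    lag1_cov=0 gives the white-noise case; a nonzero value the MA(1) case.
    """
    return n * daily_var + 2 * (n - 1) * lag1_cov


def _sample_var(x):
    """Unbiased sample variance with divisor len(x) - 1."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** 2 for v in x) / (len(x) - 1)


def yang_zhang(opens, highs, lows, closes):
    """Yang-Zhang variance estimate from n+1 days of OHLC prices."""
    days = range(1, len(opens))
    o = [math.log(opens[t] / closes[t - 1]) for t in days]  # overnight return
    c = [math.log(closes[t] / opens[t]) for t in days]      # open-to-close
    u = [math.log(highs[t] / opens[t]) for t in days]       # normalized high
    d = [math.log(lows[t] / opens[t]) for t in days]        # normalized low
    n = len(o)
    if n < 2:
        raise ValueError("need at least two days after the initial close")
    var_o = _sample_var(o)
    var_c = _sample_var(c)
    var_rs = sum(ut * (ut - ct) + dt * (dt - ct)
                 for ut, dt, ct in zip(u, d, c)) / n
    k = 0.34 / (1.34 + (n + 1) / (n - 1))
    return var_o + k * var_c + (1 - k) * var_rs
```

For instance, `monthly_variance(1.0, 21, 0.1)` evaluates $21 + 2\cdot 20\cdot 0.1 = 25$, showing how even a modest positive lag-1 covariance inflates the monthly variance relative to the white-noise value of 21.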

Left out sections: 3.7-3.11, 3.13-3.14, 3.16