## Thursday, September 3, 2015

### Multivariate Normal distribution

This topic was bound to come up, especially if you want to understand the Kalman filter.

A $k$-dimensional random vector $\pmb{x}=(x_1,...,x_k)^T$ follows a multivariate normal distribution with mean $\pmb{\mu}=(\mu_1,...,\mu_k)^T$ and positive-definite covariance matrix $\pmb{\Sigma}=[\sigma_{ij}]$ if its probability density function is $$f(x|\mu, \Sigma)=\frac{1}{(2\pi)^{k/2}|\Sigma|^{1/2}}e^{-\frac{1}{2}(x-\mu)^T\Sigma^{-1}(x-\mu)}.$$ This is denoted by $x\sim N_k(\mu,\Sigma).$

A square matrix $A$ ($m\times m$) is positive definite if $A$ is symmetric and all eigenvalues of $A$ are positive. Equivalently, a symmetric matrix $A$ is positive definite if $b^TAb>0$ for every nonzero $m$-dimensional vector $b$. A positive-definite matrix can be decomposed as $A=P\Lambda P^T,$ where $\Lambda$ is a diagonal matrix consisting of all eigenvalues of $A$ and $P$ is an $m\times m$ matrix whose columns are the $m$ right eigenvectors of $A$. Because $A$ is symmetric, the eigenvectors can be chosen orthonormal, which makes $P$ an orthogonal matrix; when the eigenvalues are distinct, normalized eigenvectors are orthogonal automatically.
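
As a minimal sketch of the definitions above, the snippet below evaluates the density directly from the formula and checks positive definiteness through the eigenvalues. The helper names `mvn_pdf` and `is_positive_definite`, the tolerance, and the example values of `mu` and `sigma` are illustrative assumptions, not part of the original text.

```python
import numpy as np

def mvn_pdf(x, mu, sigma):
    """Density of N_k(mu, sigma) at x, evaluated directly from the formula."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    k = mu.shape[0]
    diff = x - mu
    # Quadratic form (x-mu)^T Sigma^{-1} (x-mu); solve instead of inverting.
    quad = diff @ np.linalg.solve(sigma, diff)
    norm_const = (2.0 * np.pi) ** (k / 2.0) * np.sqrt(np.linalg.det(sigma))
    return np.exp(-0.5 * quad) / norm_const

def is_positive_definite(a, tol=1e-12):
    """True if a is symmetric and all of its eigenvalues exceed tol."""
    a = np.asarray(a, dtype=float)
    if not np.allclose(a, a.T):
        return False
    return bool(np.all(np.linalg.eigvalsh(a) > tol))

# Example values (assumed for illustration only).
mu = np.array([1.0, -1.0])
sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])

# Eigendecomposition sigma = P Lambda P^T; eigh returns orthonormal eigenvectors.
eigvals, P = np.linalg.eigh(sigma)
Lambda = np.diag(eigvals)

density_at_mean = mvn_pdf(mu, mu, sigma)
```

At $x=\mu$ the exponent vanishes, so `density_at_mean` reduces to the normalizing constant $1/\big((2\pi)^{k/2}|\Sigma|^{1/2}\big)$, which is a convenient sanity check.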

For a symmetric matrix $A$ (with nonsingular leading principal submatrices), there exists a lower triangular matrix $L$ with diagonal elements equal to 1 and a diagonal matrix $G$ such that $A=LGL^T$. If $A$ is positive definite, then the diagonal elements of $G$ are positive. In this case we can write $A=(L\sqrt{G})(L\sqrt{G})^T$, where $L\sqrt{G}$ is again a lower triangular matrix. Such a decomposition is called the Cholesky decomposition of $A$. It also shows that a positive-definite matrix $A$ can be diagonalized as $L^{-1}A(L^T)^{-1}=L^{-1}A(L^{-1})^T=G.$
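
The relation between the two factorizations can be checked numerically. The sketch below, with an example matrix `A` chosen only for illustration, takes the Cholesky factor from `np.linalg.cholesky` (which plays the role of $L\sqrt{G}$) and recovers the unit lower triangular $L$ and the diagonal $G$ from it.

```python
import numpy as np

# Example symmetric positive-definite matrix (assumed for illustration).
A = np.array([[4.0, 2.0, 0.6],
              [2.0, 2.0, 0.5],
              [0.6, 0.5, 3.0]])

# Cholesky factor: lower triangular C with A = C C^T, i.e. C = L sqrt(G).
C = np.linalg.cholesky(A)

d = np.diag(C)        # square roots of the diagonal entries of G
L = C / d             # divide column j by d[j], giving a unit diagonal
G = np.diag(d ** 2)   # diagonal matrix with positive entries

# Diagonalization by congruence: L^{-1} A (L^{-1})^T should equal G.
L_inv = np.linalg.inv(L)
diagonalized = L_inv @ A @ L_inv.T
```

Here `L @ G @ L.T` reproduces `A` up to rounding, and `diagonalized` agrees with `G`.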

Let $c=[c_1,...,c_k]^T$ be a nonzero vector, and let $x$ be partitioned as $x=[x_1^T,x_2^T]^T$, with the first block of size $p$ and the second of size $k-p$, such that $$\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \sim N\left( \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22}\end{bmatrix} \right).$$ Some properties of $x$ are:

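1. $c^Tx \sim N\left( c^T\mu, c^T\Sigma c \right)$, that is, any nonzero linear combination of the components of $x$ is univariate normal. Conversely, if every nonzero linear combination of the components of $x$ is univariate normal, then $x$ is multivariate normal.
2. The marginal distribution of $x_i$ is normal, $x_i \sim N \left( \mu_i, \Sigma_{ii}\right)$ for $i=1,2$, with dimension $p$ for $x_1$ and $k-p$ for $x_2$.
3. $\Sigma_{12}=0$ if and only if $x_1$ and $x_2$ are independent.
4. The variable $(x-\mu)^T\Sigma^{-1}(x-\mu)$ follows a chi-squared distribution with $k$ degrees of freedom.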
5. The conditional distribution of $x_1$ given $x_2=b$ is also normally distributed as $$(x_1|x_2=b)\sim N \left( \mu_1+\Sigma_{12}\Sigma_{22}^{-1}(b-\mu_2), \Sigma_{11}-\Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21} \right).$$
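
The conditional distribution in property 5 translates almost line by line into code. The following is a sketch under stated assumptions: the function name `conditional_mvn`, its argument order, and the bivariate example values are hypothetical choices made for illustration. For the example, the formula gives a conditional mean of $0 + \frac{0.8}{4}(3-1)$ and a conditional variance of $1 - \frac{0.8^2}{4}$, which can be compared by hand with what the code returns.

```python
import numpy as np

def conditional_mvn(mu, sigma, p, b):
    """Mean and covariance of x_1 | x_2 = b, where x_1 is the first p components."""
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    b = np.asarray(b, dtype=float)
    mu1, mu2 = mu[:p], mu[p:]
    s11 = sigma[:p, :p]
    s21, s22 = sigma[p:, :p], sigma[p:, p:]
    # gain = Sigma_12 Sigma_22^{-1}; uses Sigma_21^T = Sigma_12 and symmetric Sigma_22.
    gain = np.linalg.solve(s22, s21).T
    cond_mean = mu1 + gain @ (b - mu2)
    cond_cov = s11 - gain @ s21
    return cond_mean, cond_cov

# Bivariate example (values assumed for illustration): p = 1, observe x_2 = 3.
mu = np.array([0.0, 1.0])
sigma = np.array([[1.0, 0.8],
                  [0.8, 4.0]])
cond_mean, cond_cov = conditional_mvn(mu, sigma, p=1, b=[3.0])
```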

Suppose that $x$, $y$, and $z$ are three random vectors such that their joint distribution is multivariate normal. In addition, assume that the diagonal block covariance matrix $\Sigma_{ww}$ is nonsingular for $w=x,y,z$, and $\Sigma_{yz}=0$. Then,

1. $(x|y) \sim N \left( \mu_x+\Sigma_{xy}\Sigma_{yy}^{-1}(y-\mu_y), \Sigma_{xx}-\Sigma_{xy}\Sigma_{yy}^{-1}\Sigma_{yx}\right)$
2. $(x|y,z) \sim N\left( E(x|y)+\Sigma_{xz}\Sigma_{zz}^{-1}(z-\mu_z), Var(x|y)-\Sigma_{xz}\Sigma_{zz}^{-1}\Sigma_{zx}\right)$
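
A quick numerical check of the second result: when $\Sigma_{yz}=0$, conditioning on $y$ first and then adding the $z$ correction should give the same answer as conditioning on the stacked vector $(y,z)$ in one step. The sketch below assumes scalar $x$, $y$, $z$ and made-up values for the mean, covariance, and observations; the helper `condition` is a hypothetical name introduced only for this example.

```python
import numpy as np

def condition(mu_a, mu_b, s_aa, s_ab, s_bb, b):
    """Mean and covariance of a | b for jointly normal (a, b)."""
    gain = np.linalg.solve(s_bb, s_ab.T).T   # Sigma_ab Sigma_bb^{-1}
    return mu_a + gain @ (b - mu_b), s_aa - gain @ s_ab.T

# Joint mean and covariance of (x, y, z), each scalar, with Sigma_yz = 0.
# All numbers are assumed for illustration.
mu = np.array([1.0, 0.0, -1.0])
sigma = np.array([[2.0, 0.7, 0.5],
                  [0.7, 1.0, 0.0],
                  [0.5, 0.0, 1.5]])
y_obs = np.array([0.4])
z_obs = np.array([-0.2])

# Step 1: distribution of x | y.
m_xy, v_xy = condition(mu[:1], mu[1:2], sigma[:1, :1],
                       sigma[:1, 1:2], sigma[1:2, 1:2], y_obs)

# Step 2: add the z term, which is valid because Sigma_yz = 0.
s_xz, s_zz = sigma[:1, 2:3], sigma[2:3, 2:3]
gain_z = np.linalg.solve(s_zz, s_xz.T).T
m_seq = m_xy + gain_z @ (z_obs - mu[2:3])
v_seq = v_xy - gain_z @ s_xz.T

# Direct: condition on the stacked vector (y, z) in one step.
m_dir, v_dir = condition(mu[:1], mu[1:], sigma[:1, :1],
                         sigma[:1, 1:], sigma[1:, 1:],
                         np.concatenate([y_obs, z_obs]))
```

The pairs `m_seq`, `m_dir` and `v_seq`, `v_dir` agree up to rounding. This sequential form is what makes the result useful for the Kalman filter, where uncorrelated pieces of information are incorporated one at a time.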