---
title: "The QR decomposition"
author: "Rafa"
date: "January 31, 2015"
output: html_document
layout: page
---
```{r options, echo=FALSE}
library(knitr)
opts_chunk$set(fig.path=paste0("figure/", sub("(.*).Rmd","\\1",basename(knitr:::knit_concord$get('infile'))), "-"))
```
## The QR Factorization (Advanced)
We have seen that in order to calculate the LSE, we need to invert a matrix. In previous sections we used the function `solve`, but `solve` is not numerically stable for this purpose. When coding the LSE computation, we use the QR decomposition instead.
#### Inverting $\mathbf{X^\top X}$
Remember that to minimize the RSS:
$$
(\mathbf{Y}-\mathbf{X}\boldsymbol{\beta})^\top
(\mathbf{Y}-\mathbf{X}\boldsymbol{\beta})
$$
We need to solve:
$$
\mathbf{X}^\top \mathbf{X} \boldsymbol{\hat{\beta}} = \mathbf{X}^\top \mathbf{Y}
$$
The solution is:
$$
\boldsymbol{\hat{\beta}} = (\mathbf{X}^\top \mathbf{X})^{-1} \mathbf{X}^\top \mathbf{Y}
$$
Thus, we need to compute $(\mathbf{X}^\top \mathbf{X})^{-1}$.
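For a well-conditioned design matrix, this computation works fine. As a quick sanity check (this small simulated example is not from the original text), we can solve the normal equations directly:

```r
# Simulate a small, well-conditioned regression problem
set.seed(1)
n <- 100
x <- rnorm(n)
X <- cbind(1, x)                  # design matrix with an intercept column
beta <- c(2, 3)                   # true coefficients
y <- X %*% beta + rnorm(n, sd = 0.5)

# Solve (X'X) betahat = X'y; passing both arguments to solve
# avoids forming the inverse explicitly
betahat <- solve(crossprod(X), crossprod(X, y))
betahat                           # close to the true values 2 and 3
```

Note that even here, `solve(A, b)` is preferable to `solve(A) %*% b`, since it solves the system without explicitly computing an inverse.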
#### `solve` is numerically unstable
To demonstrate what we mean by _numerically unstable_, we construct an extreme case:
```{r}
n <- 50;M <- 500
x <- seq(1,M,len=n)
X <- cbind(1,x,x^2,x^3)
colnames(X) <- c("Intercept","x","x2","x3")
beta <- matrix(c(1,1,1,1),4,1)
set.seed(1)
y <- X%*%beta+rnorm(n,sd=1)
```
The standard R function for computing the inverse, `solve`, gives an error:
```{r,eval=FALSE}
solve(crossprod(X)) %*% crossprod(X,y)
```
To see why this happens, look at the magnitudes of the entries of $\mathbf{X}^\top \mathbf{X}$:
```{r}
options(digits=4)
log10(crossprod(X))
```
Note the differences of several orders of magnitude between the entries. On a digital computer, we can only represent a limited range of numbers, so when very small and very large values must be handled together, the small ones are effectively rounded to 0. This in turn leads to operations that behave like division by 0 and produce errors.
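One way to quantify this instability (not shown in the original text) is the condition number, which base R estimates with `kappa`; very large values mean that inversion will amplify rounding error:

```r
# Rebuild the polynomial design matrix from the example above
n <- 50; M <- 500
x <- seq(1, M, len = n)
X <- cbind(1, x, x^2, x^3)

# kappa estimates the condition number; values near or above
# 1/.Machine$double.eps (about 4.5e15) mean the matrix is
# effectively singular in double precision
kappa(crossprod(X))
```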
#### The factorization
The QR factorization is based on a mathematical result that tells us that we can decompose any full rank $N\times p$ matrix $\mathbf{X}$ as:
$$
\mathbf{X = QR}
$$
with:
* $\mathbf{Q}$ a $N \times p$ matrix with $\mathbf{Q^\top Q=I}$
* $\mathbf{R}$ a $p \times p$ upper triangular matrix.
Upper triangular matrices are very convenient for solving systems of equations.
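These two properties are easy to verify numerically with R's built-in `qr` function, applied here to the polynomial design matrix constructed earlier:

```r
# The same ill-conditioned design matrix as before
n <- 50; M <- 500
x <- seq(1, M, len = n)
X <- cbind(1, x, x^2, x^3)

QR <- qr(X)
Q <- qr.Q(QR)   # N x p matrix with orthonormal columns
R <- qr.R(QR)   # p x p upper triangular factor

max(abs(crossprod(Q) - diag(ncol(X))))  # essentially 0: Q'Q = I
all(R[lower.tri(R)] == 0)               # TRUE: R is upper triangular
max(abs(Q %*% R - X))                   # tiny relative to the entries of X
```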
#### Example of upper triangular matrix
In the example below, the matrix on the left is upper triangular: it only has 0s below the diagonal. This greatly simplifies solving the system of equations:
$$
\,
\begin{pmatrix}
1&2&-1\\
0&1&2\\
0&0&1\\
\end{pmatrix}
\begin{pmatrix}
a\\
b\\
c\\
\end{pmatrix}
=
\begin{pmatrix}
6\\
4\\
1\\
\end{pmatrix}
$$
We immediately know that $c=1$, which implies that $b+2=4$ and thus $b=2$. This in turn implies $a+4-1=6$, so $a = 3$. Writing an algorithm to do this is straightforward for any upper triangular matrix.
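R's `backsolve` function performs exactly this back-substitution. Applied to the system above, it recovers $a=3$, $b=2$, $c=1$:

```r
# The upper triangular system from the example above
R <- matrix(c(1, 2, -1,
              0, 1,  2,
              0, 0,  1), nrow = 3, byrow = TRUE)
b <- c(6, 4, 1)
backsolve(R, b)   # 3 2 1
```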
#### Finding LSE with QR
If we rewrite the equations of the LSE using $\mathbf{QR}$ instead of $\mathbf{X}$ we have:
$$\mathbf{X}^\top \mathbf{X} \boldsymbol{\beta} = \mathbf{X}^\top \mathbf{Y}$$
$$(\mathbf{Q}\mathbf{R})^\top (\mathbf{Q}\mathbf{R}) \boldsymbol{\beta} = (\mathbf{Q}\mathbf{R})^\top \mathbf{Y}$$
$$\mathbf{R}^\top (\mathbf{Q}^\top \mathbf{Q}) \mathbf{R} \boldsymbol{\beta} = \mathbf{R}^\top \mathbf{Q}^\top \mathbf{Y}$$
$$\mathbf{R}^\top \mathbf{R} \boldsymbol{\beta} = \mathbf{R}^\top \mathbf{Q}^\top \mathbf{Y}$$
$$(\mathbf{R}^\top)^{-1} \mathbf{R}^\top \mathbf{R} \boldsymbol{\beta} = (\mathbf{R}^\top)^{-1} \mathbf{R}^\top \mathbf{Q}^\top \mathbf{Y}$$
$$\mathbf{R} \boldsymbol{\beta} = \mathbf{Q}^\top \mathbf{Y}$$
$\mathbf{R}$ being upper triangular makes solving this system more stable. Also, because $\mathbf{Q}^\top\mathbf{Q}=\mathbf{I}$, we know that the columns of $\mathbf{Q}$ are on the same scale, which stabilizes the right-hand side.
Now we are ready to find LSE using the QR decomposition. To solve:
$$\mathbf{R} \boldsymbol{\beta} = \mathbf{Q}^\top \mathbf{Y}$$
We use `backsolve`, which takes advantage of the upper triangular nature of $\mathbf{R}$.
```{r}
QR <- qr(X)
Q <- qr.Q( QR )
R <- qr.R( QR )
(betahat <- backsolve(R, crossprod(Q,y) ) )
```
In practice, we do not need to do any of this due to the built-in `solve.qr` function:
```{r}
QR <- qr(X)
(betahat <- solve.qr(QR, y))
```
#### Fitted values
This factorization also simplifies the calculation for fitted values:
$$\mathbf{X}\boldsymbol{\hat{\beta}} =
(\mathbf{QR})\mathbf{R}^{-1}\mathbf{Q}^\top \mathbf{Y} = \mathbf{Q}\mathbf{Q}^\top\mathbf{Y}$$
In R, we simply do the following:
```{r,fig.width=5,fig.height=5,fig.align="center"}
library(rafalib)
mypar(1,1)
plot(x,y)
fitted <- tcrossprod(Q)%*%y
lines(x,fitted,col=2)
```
#### Standard errors
To obtain the standard errors of the LSE, we note that:
$$(\mathbf{X^\top X})^{-1} = (\mathbf{R^\top Q^\top QR})^{-1} = (\mathbf{R^\top R})^{-1}$$
The function `chol2inv` is specifically designed to find this inverse. So all we do is the following:
```{r}
df <- length(y) - QR$rank
sigma2 <- sum((y-fitted)^2)/df
varbeta <- sigma2*chol2inv(qr.R(QR))
SE <- sqrt(diag(varbeta))
cbind(betahat,SE)
```
This gives us identical results to the `lm` function.
```{r}
summary(lm(y~0+X))$coef
```