CLS361 Programming assignment 4:


  1. Develop a Matlab program for the $ {\bf QR}$ algorihtm based on Householder reflections for real $ m \times n$ matrices. Carry out the $ {\bf QR}$ decomposiiton for random matrices (order of $ 30 \times 30$) and compare you results with the native $ qr$ available in Matlab.
  2. Develop a function for reducing a real square symmetric matrix to a tridiagonal form using Householder based similarity transformations.
  3. Suppose that the

    $\displaystyle function:\;\;{\bf v} = house({\bf x})
$

    discussed in the class returns the approximation $ {\bf\hat{v}}$ for the Householder vector $ {\bf v}$ after round-offs. If

    $\displaystyle {\bf\hat{P}} = {\bf I} - 2 \frac{\bf\hat{v}\hat{v}^T}{\bf\hat{v}^T\hat{v}}
$

    then show/verify the following:
    1. $ \Vert {\bf P} - {\bf\hat{P}} \Vert _2 = O(\mu)$
    2. $ fl({\bf\hat{P}}{\bf A}) = {\bf P}({\bf A} + {\bf E})$, where $ \Vert{\bf E}\Vert _2 = O(\mu \Vert{\bf A}\Vert _2)$
    3. $ fl({\bf A}{\bf\hat{P}}) = ({\bf A} + {\bf E}){\bf P}$, where $ \Vert{\bf E}\Vert _2 = O(\mu \Vert{\bf A}\Vert _2)$
  4. Program the following $ {\bf QR}$ iteration (read the note at the end of this assignment) for computing the eigenvalues and eigenvectors of a general real matrix of size $ n \times n$. For symmetric matrices, first carry out a tridiagonalization. Show/verify that the $ {\bf QR}$ iteration preserves the tridiagonal structure for a symmetric matrix. Test your results on randomly generated matrices by comparing with standard Matlab functions.
  5. Use your eigen-computation routine to compute the SVD of an $ m \times n$ matrix $ {\bf A}$. Note that if the SVD of $ {\bf A}$ is

    $\displaystyle {\bf A} = {\bf U}{\bf\Sigma}{\bf V}^T
$

    where

    $\displaystyle {\bf\Sigma} = diag(\sigma_1,\sigma_2,\ldots,\sigma_n)
$

    then

    $\displaystyle {\bf V}^T ({\bf A}^T{\bf A}) {\bf V} = diag(\sigma_1^2,\sigma_2^2,\ldots \sigma_n^2) \in \mathbb{R}^{n \times n}
$

    and

    $\displaystyle {\bf U}^T ({\bf A}{\bf A}^T) {\bf U} = diag(\sigma_1^2,\sigma_2^2,\ldots \sigma_n^2,0,\ldots,0) \in \mathbb{R}^{m \times m}
$

    Moreover, if

    $\displaystyle {\bf U} = \left[ \begin{array}{cc}{\bf U_1} & {\bf U_2} \end{array} \right]
$

    where $ {\bf U_1}$ is the matrix of the first $ n$ columns of $ {\bf U}$ and $ {\bf U_2}$ is the matrix of the last $ m-n$ columns of $ {\bf U}$, and we define

    $\displaystyle {\bf Q} = \frac{1}{\sqrt{2}}
\left[
\begin{array}{ccc}
{\bf V} & {\bf V} & {\bf0} \\
{\bf U_1}& -{\bf U_1} & \sqrt{2}{\bf U_2}
\end{array}\right]
$

    then

    $\displaystyle {\bf Q}^T \left[
\begin{array}{cc}
{\bf0} & {\bf A}^T \\
{\bf A}...
...s,\sigma_n,-\sigma_1,-\sigma_2,\ldots,-\sigma_n,\underbrace{0,\ldots,0}_{m-n})
$

    Compare your SVD computation with that of the standard Matlab function for moderate size matrices.

  6. Consider a straight line $ ax + by -c = 0$ for any choice of parameters $ a,b,c$. Randomly select 100 points on the straight line and shift these by adding 2-dimensional isotropic Gaussian random noise (you can add zero mean Gaussian noise of variance $ \sigma$ independently to $ x$ and $ y$). Vary $ \sigma$ and obtain least square line fits using your SVD routine. Add comments/analysis in a separate file.

Note: The $ {\bf QR}$ iteration consists of a sequence of orthogonal transformations

\begin{displaymath}
\begin{array}{rrrr}
{\bf A_k} & = & {\bf Q_k}{\bf R_k} \\
{...
...f R_k}{\bf Q_k} & (= {\bf Q_k}^T{\bf R_k}{\bf Q_k})
\end{array}\end{displaymath}

The following (non-obvious) theorem is the basis of the algorithm for a real matrix $ {\bf A}$:

  1. If $ {\bf A}$ has eigenvalues of distinct absolute value $ \mid \lambda_i \mid$, then $ {\bf A_k} \rightarrow$ [upper-triangular form] as $ k \rightarrow
\infty$. The eigenvalues appear on the diagonal in increasing order of absolute maginitude.
  2. If $ {\bf A}$ has eigenvalue $ \mid \lambda_i \mid$ of multiplicity $ p$, then $ {\bf A_k} \rightarrow$ [upper-triangular form] as $ k \rightarrow
\infty$, except for a diagonal block matrix of order $ p$, whose eigenvalues $ \rightarrow \lambda_i$.

If $ {\bf A}$ is real $ n \times n$ then there exists an orthogonal $ {\bf Q} \in \mathbb{R}^{n \times n}$ such that

$\displaystyle {\bf Q}^T{\bf A}{\bf Q} =
\left[
\begin{array}{cccc}
{\bf R_{11}}...
...\ddots & \ldots \\
{\bf0} & {\bf0} & \ldots & {\bf R_{mm}}
\end{array}\right]
$

where each $ {\bf R_{ii}}$ is either a $ 1 \times 1$ matrix or a $ 2 \times 2$ matrix having complex conjugate eigenvalues.

The above result is called the Real Schur decomposition theorem.

The $ {\bf QR}$ iteration for general real matrices converges to a a Real Schur form. It is customary to start with $ {\bf H_0} = {\bf U_0}^T{\bf A}{\bf U_0}$ such that $ {\bf H_0}$ is in upper Hessenberg form ( $ h_{ij} = 0, i > j+1$). A real square matrix can be reduced to the Hessenberg form using Householder operations. The Hessenberg form is preserved by the $ {\bf QR}$ iteration.

For a real symmetric matrix all eigenvalues are real. If $ \lambda_i$ has a multiplicity of $ p$ then the matrix can be split in to sub-matrices which can be diagonalized separately.

In the proof of the theorem quoted above, one finds that in general a sub-diagonal element converges to 0 like

$\displaystyle a_{ij}^{(k)} \sim \left( \frac{\lambda_i}{\lambda_j} \right)^k
$

Although $ \lambda_i < \lambda_j$, convergence can be slow if they are close. Convergence can be accelerated by a technique called shifting: if $ s$ is any constant, then $ {\bf A} - s{\bf I}$ has eigenvalues $ \lambda_i - s$. If we decompose

\begin{displaymath}
\begin{array}{lll}
{\bf A_k} - s_k{\bf I} & = & {\bf Q_k}{\b...
...s_k{\bf I} \\
& = & {\bf Q_k}^T{\bf R_k}{\bf Q_k}
\end{array}\end{displaymath}

then, the convergence is determined by the ratio

$\displaystyle \frac{\lambda_i - s_k}{\lambda_j - s_k}
$

The idea is to choose the shift $ s_k$ at each stage to maximize the rate of convergence. A good choice for the shift initially would be $ s_k$ close to $ \lambda_1$, the smallest eigenvalue. Then the first row of off-diagonal elements would tend rapidly to zero. However $ \lambda_1$ is not known apriori. A very effective strategy is to compute the eigenvalues of the leading $ 2 \times 2$ sub-matrix of $ {\bf A_k}$ at each stage and set $ s_k$ to the smaller. One can show that the convergence of the algorithm with this strategy is generally cubic (and at worst quadratic) for degenerate eigenvalues.

Good luck.


Subhashis Banerjee 2009-11-09