Householder transformation

In linear algebra, a Householder transformation (also known as a Householder reflection or elementary reflector) is a linear transformation that describes a reflection about a plane or hyperplane containing the origin. The Householder transformation was used in a 1958 paper by Alston Scott Householder.

Definition

Operator and transformation

The Householder operator may be defined over any finite-dimensional inner product space <math> V</math> with inner product <math> \langle \cdot, \cdot \rangle </math> and unit vector <math> u\in V</math> as

:<math> H_u(x) := x - 2\,\langle x,u \rangle\,u\,.</math>

It is also common to choose a non-unit vector <math>q \in V</math>, and normalize it directly in the Householder operator's expression:

:<math>H_q \left ( x \right ) = x - 2\, \frac{\langle x, q \rangle}{\langle q, q \rangle}\, q \,.</math>

Such an operator is linear and self-adjoint.

If <math>V=\mathbb{C}^n</math>, note that the reflection hyperplane can be defined by its normal vector, a unit vector <math display="inline">\vec v\in V</math> (a vector with length <math display="inline">1</math>) that is orthogonal to the hyperplane. The reflection of a point <math display="inline">x</math> about this hyperplane is the Householder transformation:

: <math>\vec x - 2\langle \vec x, \vec v\rangle \vec v = \vec x - 2\vec v\left(\vec v^* \vec x\right), </math>

where <math>\vec x</math> is the vector from the origin to the point <math>x</math>, and <math display="inline">\vec v^*</math> is the conjugate transpose of <math display="inline">\vec v</math>.

thumb|400px|The Householder transformation acting as a reflection of <math>x</math> about the hyperplane defined by <math>v</math>.

Householder matrix

The matrix constructed from this transformation can be expressed in terms of an outer product as:

: <math>P = I - 2\vec v\vec v^*</math>

is known as the Householder matrix, where <math display="inline">I</math> is the identity matrix.

Properties

The Householder matrix has the following properties:

it is Hermitian: <math display="inline">P = P^*</math>,
it is unitary: <math display="inline">P^{-1} = P^*</math> (via the Sherman-Morrison formula),
hence it is involutory: <math display="inline">P = P^{-1}</math>.
A Householder matrix has eigenvalues <math display="inline">\pm 1</math>. To see this, notice that if <math display="inline">\vec x</math> is orthogonal to the vector <math display="inline">\vec v</math> which was used to create the reflector, then <math display="inline">P_v\vec x = (I-2\vec v\vec v^*)\vec x = \vec x-2\langle\vec v,\vec x\rangle\vec v = \vec x</math>, i.e., <math display="inline">1</math> is an eigenvalue of multiplicity <math display="inline">n - 1</math>, since there are <math display="inline">n - 1</math> independent vectors orthogonal to <math display="inline">\vec v</math>. Also, notice <math display="inline">P_v\vec v = (I-2\vec v\vec v^*)\vec v = \vec v - 2\langle\vec v,\vec v\rangle\vec v = -\vec v</math> (since <math>\vec v</math> is by definition a unit vector), and so <math display="inline">-1</math> is an eigenvalue with multiplicity <math display="inline">1</math>.
The determinant of a Householder reflector is <math display="inline">-1</math>, since the determinant of a matrix is the product of its eigenvalues, in this case one of which is <math display="inline">-1</math> with the remainder being <math display="inline">1</math> (as in the previous point), or via the Matrix determinant lemma.

Example

Consider the normalization of a vector <math>\vec v</math> containing <math>1</math> in each entry,

:<math>\vec v=\frac{1}{\sqrt{2\begin{bmatrix} 1\\1 \end{bmatrix}.</math>

Then the Householder matrix corresponding to the vector <math>v</math> is

:<math>P_v=\begin{bmatrix}1&0\\0&1\end{bmatrix}-2\left(\frac{1}{\sqrt{2\begin{bmatrix} 1\\1 \end{bmatrix}\right)\left(\frac{1}{\sqrt{2\begin{bmatrix} 1&1 \end{bmatrix}\right)</math>

:<math>\quad=\begin{bmatrix}1&0\\0&1\end{bmatrix}-\begin{bmatrix} 1\\1 \end{bmatrix}\begin{bmatrix} 1&1 \end{bmatrix}</math>

:<math>\quad=\begin{bmatrix}1&0\\0&1\end{bmatrix}-\begin{bmatrix}1&1\\1&1\end{bmatrix}</math>

:<math>\quad=\begin{bmatrix}0&-1\\-1&0\end{bmatrix}.</math>

Note that if we have another vector <math>\vec q</math> representing a coordinate in the 2D plane

:<math>\vec q = \begin{bmatrix}x\\y\end{bmatrix},</math>

then in this case <math>P_v</math> flips and negates the <math>x</math> and <math>y</math> coordinates, in other words we have

:<math>P_v\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}-y\\-x\end{bmatrix},</math>

which corresponds to reflecting the vector across the line <math>y=-x</math>, which our original vector <math>\vec v</math> is normal to.

Applications

Geometric optics

In geometric optics, specular reflection can be expressed in terms of the Householder matrix (see ').

Numerical linear algebra

Note that representing a Householder matrix requires only the entries of a single vector, not of an entire matrix (which in most algorithms is never explicitly formed), thereby minimizing the required storage and memory references needed to use them.

Further, multiplying a Householder matrix by a vector does not involve a full matrix-vector multiplication, but rather only one vector dot product, and then one axpy operation. This means its arithmetic complexity is of the same order of two low-level BLAS-1 operations. Therefore, Householder matrices are extremely arithmetically efficient.

Finally, using <math>\hat{\cdot}</math> to denote the computed value and <math>\cdot</math> to denote the mathematically exact value, then for a given Householder matrix <math>P</math>,

<math>\widehat{P b}=(P+\Delta P)b</math>

Where <math>\vert\vert\Delta P\vert\vert_F\leq\tilde{\gamma_n}:=\frac{cnu}{1-cnu}</math> (where <math>u</math> is unit roundoff, <math>n</math> the size of the matrix <math>P</math>, and <math>c</math> some small constant). In other words, multiplications by Householder matrices are also extremely backwards stable.

Since Householder transformations minimize storage, memory references, arithmetic complexity, and optimize numerical stability, they are widely used in numerical linear algebra, for example, to annihilate the entries below the main diagonal of a matrix, to perform QR decompositions and in the first step of the QR algorithm. They are also widely used for transforming to a Hessenberg form. For symmetric or Hermitian matrices, the symmetry can be preserved, resulting in tridiagonalization.

QR decomposition

Householder transformations can be used to calculate a QR decomposition. Consider a matrix triangularized up to column <math>i</math>, then our goal is to construct such Householder matrices that act upon the principal submatrices of a given matrix

<math>

\begin{bmatrix}

a_{11} & a_{12} & \cdots & & & a_{1n} \\

0 & a_{22} & \cdots & & & a_{1n} \\

\vdots & & \ddots & & & \vdots \\

0 & \cdots & 0 & x_{1}=a_{ii} & \cdots & a_{in} \\

0 & \cdots & 0 & \vdots & & \vdots \\

0 & \cdots & 0 & x_{n+1-i}=a_{ni} & \cdots & a_{nn}

\end{bmatrix}

</math>

via the matrix

<math>

\begin{bmatrix}

I_{i-1}&0\\

0&P_v

\end{bmatrix}

</math>.

(note that we already established before that Householder transformations are unitary matrices, and since the multiplication of unitary matrices is itself a unitary matrix, this gives us the unitary matrix of the QR decomposition)

If we can find a <math>\vec v</math> so that

P_v \vec{x} = \alpha\vec{e_1}

</math>

we could accomplish this. Thinking geometrically, we are looking for a plane so that the reflection about this plane happens to land directly on the basis vector. In other words,

for some constant <math>\alpha</math>. However, for this to happen, we must have

\vec v\propto\vec x-\alpha\vec e_1 \text{.}

</math>

And since <math>\vec v</math> is a unit vector, this means that we must have

Now if we apply equation () back into equation (), we get

\vec x-\alpha\vec e_1 =

2 \left\langle \vec{x}, \frac{ \vec{x}-\alpha\vec{e}_1 }{ \|\vec{x}-\alpha\vec{e}_1\|_2 } \right\rangle

\frac{ \vec{x}-\alpha\vec{e}_1 }{ \|\vec x-\alpha\vec e_1\|_2 }

</math>

Or, in other words, by comparing the scalars in front of the vector <math>\vec x - \alpha\vec e_1</math> we must have

\|\vec x-\alpha\vec e_1\|_2^2 = 2\langle\vec x,\vec x-\alpha e_1\rangle \text{.}

</math>

\|\vec x\|_2^2-2\alpha x_1+\alpha^2 = 2(\| \vec x\|_2^2-\alpha x_1)

</math>

Which means that we can solve for <math>\alpha</math> as

\alpha = \pm\|\vec x\|_2

</math>

This completes the construction; however, in practice we want to avoid catastrophic cancellation in equation (). To do so, we choose

In the first step, to form the Householder matrix in each step we need to determine <math display="inline">\alpha</math> and <math display="inline">r</math>, which are:

:<math>\begin{align}

\alpha &= -\operatorname{sgn}\left(a_{21}\right)\sqrt{\sum_{j=2}^n a_{j1}^2}; \\

r &= \sqrt{\frac{1}{2}\left(\alpha^2 - a_{21}\alpha\right)};

\end{align}</math>

From <math display="inline">\alpha</math> and <math display="inline">r</math>, construct vector <math display="inline">v</math>:

:<math>\vec v^{(1)} = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix},</math>

where <math display="inline">v_1 = 0</math>, <math display="inline">v_2 = \frac{a_{21} - \alpha}{2r}</math>, and

:<math>v_k = \frac{a_{k1{2r}</math> for each <math>k = 3, 4 \ldots n</math>

Then compute:

:<math>\begin{align}

P^1 &= I - 2\vec v^{(1)} \left(\vec v^{(1)}\right)^\textsf{T} \\

A^{(2)} &= P^1 AP^1

\end{align}</math>

Having found <math display="inline">P^1</math> and computed <math display="inline">A^{(2)}</math> the process is repeated for <math display="inline">k = 2, 3, \ldots, n - 2</math> as follows:

:<math>\begin{align}

\alpha &= -\operatorname{sgn}\left(a^k_{k+1,k}\right)\sqrt{\sum_{j=k+1}^n \left(a^k_{jk}\right)^2} \\[2pt]

r &= \sqrt{\frac{1}{2}\left(\alpha^2 - a^k_{k+1,k}\alpha\right)} \\[2pt]

v^k_1 &= v^k_2 = \cdots = v^k_k = 0 \\[2pt]

v^k_{k+1} &= \frac{a^k_{k+1,k} - \alpha}{2r} \\

v^k_j &= \frac{a^k_{jk{2r} \text{ for } j = k + 2,\ k + 3,\ \ldots,\ n \\

P^k &= I - 2\vec v^{(k)} \left(\vec v^{(k)}\right)^\textsf{T} \\

A^{(k+1)} &= P^k A^{(k)}P^k

\end{align}</math>

Continuing in this manner, the tridiagonal and symmetric matrix is formed.

Examples

In this example, also from Burden and Faires,

Finally we note that a single Householder transform, unlike a solitary Givens transform, can act on all columns of a matrix, and as such exhibits the lowest computational cost for QR decomposition and tridiagonalization. The penalty for this "computational optimality" is, of course, that Householder operations cannot be as deeply or efficiently parallelized. As such Householder is preferred for dense matrices on sequential machines, whilst Givens is preferred on sparse matrices, and/or parallel machines.

Householder transformation

Definition

Operator and transformation

Householder matrix

Properties

Example

Applications

Geometric optics

Numerical linear algebra

QR decomposition

Examples

See also

Notes