The Schur complement is a key tool in linear algebra, matrix theory, numerical analysis, and statistics.
It is defined for a block matrix. Suppose p, q are nonnegative integers such that p + q > 0, and suppose A, B, C, D are respectively p × p, p × q, q × p, and q × q matrices of complex numbers. Let
<math display="block">M = \begin{bmatrix} A & B \\ C & D \end{bmatrix}</math>
so that M is a (p + q) × (p + q) matrix.
If D is invertible, then the Schur complement of the block D of the matrix M is the p × p matrix defined by
<math display="block">M/D := A - BD^{-1}C.</math>
If A is invertible, the Schur complement of the block A of the matrix M is the q × q matrix defined by
<math display="block">M/A := D - CA^{-1}B.</math>
If A or D is singular, replacing the inverses in the expressions for M/A and M/D with a generalized inverse yields the generalized Schur complement.
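The definitions above can be tried on a small matrix. The following is a minimal sketch in Python with NumPy; the helper names (`split_blocks`, `schur_complement_of_D`, and so on) and the example matrix are illustrative assumptions, and the Moore-Penrose pseudoinverse is used as just one possible choice of generalized inverse.

```python
import numpy as np

def split_blocks(M, p):
    """Split a square matrix into blocks A (p x p), B, C and D."""
    return M[:p, :p], M[:p, p:], M[p:, :p], M[p:, p:]

def schur_complement_of_D(M, p):
    """M/D = A - B D^{-1} C; assumes D is invertible."""
    A, B, C, D = split_blocks(M, p)
    # Solve D Z = C instead of forming D^{-1} explicitly.
    return A - B @ np.linalg.solve(D, C)

def schur_complement_of_A(M, p):
    """M/A = D - C A^{-1} B; assumes A is invertible."""
    A, B, C, D = split_blocks(M, p)
    return D - C @ np.linalg.solve(A, B)

def generalized_schur_complement_of_D(M, p):
    """Generalized M/D, with the Moore-Penrose pseudoinverse standing in for D^{-1}."""
    A, B, C, D = split_blocks(M, p)
    return A - B @ np.linalg.pinv(D) @ C

# Hypothetical example with p = 2 and q = 1.
M = np.array([[4.0, 1.0, 2.0],
              [1.0, 3.0, 0.0],
              [2.0, 0.0, 5.0]])
M_over_D = schur_complement_of_D(M, p=2)   # 2 x 2 matrix
M_over_A = schur_complement_of_A(M, p=2)   # 1 x 1 matrix

# Same matrix with a singular (zero) block D: the inverse no longer exists,
# but the generalized complement is still defined.
N = M.copy()
N[2, 2] = 0.0
N_over_D = generalized_schur_complement_of_D(N, p=2)
```

Here `M_over_D` has size p × p and `M_over_A` has size q × q, matching the definitions. In the singular case the pseudoinverse of the zero block is zero, so `N_over_D` reduces to the block A.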
The Schur complement is named after Issai Schur, who used it to prove Schur's lemma, although it had been used previously. Emilie Virginia Haynsworth was the first to call it the Schur complement. The Schur complement is sometimes referred to as the Feshbach map, after the physicist Herman Feshbach.
Background
The Schur complement arises when performing a block Gaussian elimination on the matrix M. In order to eliminate the elements below the block diagonal, one multiplies the matrix M by a block lower triangular matrix on the right as follows:
<math display="block">\begin{align}
&M = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \quad \to \quad \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} I_p & 0 \\ -D^{-1}C & I_q \end{bmatrix} = \begin{bmatrix} A - BD^{-1}C & B \\ 0 & D \end{bmatrix},
\end{align}</math>
where I<sub>p</sub> denotes the p×p identity matrix (and I<sub>q</sub> the q×q one). As a result, the Schur complement <math>M/D = A - BD^{-1}C</math> appears in the upper-left p×p block.
Continuing the elimination process beyond this point (i.e., performing a block Gauss–Jordan elimination),
<math display="block">\begin{align}
&\begin{bmatrix} A - BD^{-1}C & B \\ 0 & D \end{bmatrix} \quad \to \quad \begin{bmatrix} I_p & -BD^{-1} \\ 0 & I_q \end{bmatrix} \begin{bmatrix} A - BD^{-1}C & B \\ 0 & D \end{bmatrix}
= \begin{bmatrix} A - BD^{-1}C & 0 \\ 0 & D \end{bmatrix},
\end{align}</math>
leads to an LDU decomposition of M, which reads
<math display="block">\begin{align}
M &= \begin{bmatrix} A & B \\ C & D \end{bmatrix}
= \begin{bmatrix} I_p & BD^{-1} \\ 0 & I_q \end{bmatrix}\begin{bmatrix} A - BD^{-1}C & 0 \\ 0 & D \end{bmatrix}\begin{bmatrix} I_p & 0 \\ D^{-1}C & I_q \end{bmatrix}.
\end{align}</math>
Thus, the inverse of M may be expressed in terms of D<sup>−1</sup> and the inverse of the Schur complement M/D, assuming it exists, as
<math display="block">\begin{align}
M^{-1} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1}
={} &\left(\begin{bmatrix} I_p & BD^{-1} \\ 0 & I_q \end{bmatrix}
\begin{bmatrix} A - BD^{-1}C & 0 \\ 0 & D \end{bmatrix}
\begin{bmatrix} I_p & 0 \\ D^{-1}C & I_q \end{bmatrix}
\right)^{-1} \\
={} &\begin{bmatrix} I_p & 0 \\ -D^{-1}C & I_q \end{bmatrix}
\begin{bmatrix} \left(A - BD^{-1}C\right)^{-1} & 0 \\ 0 & D^{-1} \end{bmatrix}
\begin{bmatrix} I_p & -BD^{-1} \\ 0 & I_q \end{bmatrix} \\[4pt]
={} &\begin{bmatrix}
\left(A - BD^{-1}C\right)^{-1} & -\left(A - BD^{-1}C\right)^{-1} BD^{-1} \\
-D^{-1}C\left(A - BD^{-1}C\right)^{-1} & D^{-1} + D^{-1}C\left(A - BD^{-1}C\right)^{-1}BD^{-1}
\end{bmatrix} \\[4pt]
={} &\begin{bmatrix}
\left(M/D\right)^{-1} & -\left(M/D\right)^{-1} B D^{-1} \\
-D^{-1}C\left(M/D\right)^{-1} & D^{-1} + D^{-1}C\left(M/D\right)^{-1} B D^{-1}
\end{bmatrix}.
\end{align}</math>
The above relationship comes from the elimination operations that involve D<sup>−1</sup> and M/D. An equivalent derivation can be done with the roles of A and D interchanged. By equating the expressions for M<sup>−1</sup> obtained in these two different ways, one can establish the matrix inversion lemma, which relates the two Schur complements of M: M/D and M/A (see "Derivation from LDU decomposition").
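The factorization and the block inverse above can be checked numerically. The sketch below assumes a hypothetical 4 × 4 example with p = q = 2, chosen to be strictly diagonally dominant so that M and D are invertible; the variable names are illustrative. It also uses the same elimination to solve a linear system with right-hand side (u, v), which is an added illustration rather than part of the derivation.

```python
import numpy as np

# Hypothetical example matrix (strictly diagonally dominant, hence invertible).
M = np.array([[4.0, 1.0, 0.0, 1.0],
              [2.0, 5.0, 1.0, 0.0],
              [0.0, 1.0, 6.0, 2.0],
              [1.0, 0.0, 1.0, 7.0]])
p = 2
q = M.shape[0] - p
A, B = M[:p, :p], M[:p, p:]
C, D = M[p:, :p], M[p:, p:]

D_inv = np.linalg.inv(D)
S = A - B @ D_inv @ C            # Schur complement M/D
S_inv = np.linalg.inv(S)

I_p, I_q = np.eye(p), np.eye(q)
Z_pq, Z_qp = np.zeros((p, q)), np.zeros((q, p))

# The three factors of the decomposition, in the order they appear above.
upper = np.block([[I_p, B @ D_inv], [Z_qp, I_q]])
middle = np.block([[S, Z_pq], [Z_qp, D]])
lower = np.block([[I_p, Z_pq], [D_inv @ C, I_q]])
factorization_ok = np.allclose(upper @ middle @ lower, M)

# Block formula for the inverse in terms of D^{-1} and (M/D)^{-1}.
M_inv = np.block([
    [S_inv,              -S_inv @ B @ D_inv],
    [-D_inv @ C @ S_inv, D_inv + D_inv @ C @ S_inv @ B @ D_inv],
])
inverse_ok = np.allclose(M_inv, np.linalg.inv(M))

# Solving M [x; y] = [u; v] by the same elimination:
# first the reduced system (M/D) x = u - B D^{-1} v, then back-substitution for y.
u = np.array([1.0, 2.0])
v = np.array([3.0, 4.0])
x = np.linalg.solve(S, u - B @ D_inv @ v)
y = D_inv @ (v - C @ x)
solve_ok = np.allclose(M @ np.concatenate([x, y]), np.concatenate([u, v]))
```

Explicit inverses are formed here only to mirror the formulas; in numerical work one would normally prefer `np.linalg.solve` or a factorization of D and M/D.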
Properties
- If p and q are both 1 (i.e., A, B, C and D are all scalars), we get the familiar formula for the inverse of a 2-by-2 matrix:
: <math> M^{-1} = \frac{1}{AD-BC} \left[ \begin{matrix} D & -B \\ -C & A \end{matrix}\right] </math>
: provided that AD − BC is non-zero.
- In general, if A is invertible, then
: <math>\begin{align}
M &= \begin{bmatrix} A&B\\C&D \end{bmatrix} =
\begin{bmatrix} I_p & 0 \\ CA^{-1} & I_q \end{bmatrix}\begin{bmatrix} A & 0 \\ 0 & D - CA^{-1}B \end{bmatrix}\begin{bmatrix} I_p & A^{-1}B \\ 0 & I_q \end{bmatrix}, \\[4pt]
M^{-1} &= \begin{bmatrix} A^{-1} + A^{-1} B (M/A)^{-1} C A^{-1} & - A^{-1} B (M/A)^{-1} \\ - (M/A)^{-1} CA^{-1} & (M/A)^{-1} \end{bmatrix}
\end{align}</math>
: whenever this inverse exists.
- (Schur's formula) When A, respectively D, is invertible, the determinant of M is given by
: <math>\det(M) = \det(A) \det\left(D - CA^{-1} B\right)</math>, respectively
: <math>\det(M) = \det(D) \det\left(A - BD^{-1} C\right)</math>,
: which generalizes the determinant formula for 2 × 2 matrices.
- (Guttman rank additivity formula) If D is invertible, then the rank of M is given by
: <math> \operatorname{rank}(M) = \operatorname{rank}(D) + \operatorname{rank}\left(A - BD^{-1} C\right)</math>
- (Haynsworth inertia additivity formula) If A is invertible, then the inertia of the block matrix M is equal to the inertia of A plus the inertia of M/A.
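- (Quotient identity) <math>A/B = ((A/C)/(B/C))</math>, where in this identity A, B and C denote nested matrices rather than the blocks of M: B is an invertible principal submatrix of A, and C is an invertible principal submatrix of B.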
- The Schur complement of a Laplacian matrix is also a Laplacian matrix.
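Several of the properties listed above lend themselves to a quick numerical check. The following sketch uses small hypothetical matrices chosen for illustration; the helper names and the eigenvalue tolerance in `inertia` are assumptions. For the quotient identity, the nested matrices are taken to be leading principal blocks of a single matrix T.

```python
import numpy as np

def schur_of_trailing(M, p):
    """M/D for the trailing block D = M[p:, p:] (assumed invertible)."""
    return M[:p, :p] - M[:p, p:] @ np.linalg.solve(M[p:, p:], M[p:, :p])

def schur_of_leading(M, p):
    """M/A for the leading block A = M[:p, :p] (assumed invertible)."""
    return M[p:, p:] - M[p:, :p] @ np.linalg.solve(M[:p, :p], M[:p, p:])

def inertia(H, tol=1e-10):
    """Counts of (positive, negative, zero) eigenvalues of a symmetric matrix."""
    w = np.linalg.eigvalsh(H)
    return (int(np.sum(w > tol)), int(np.sum(w < -tol)), int(np.sum(np.abs(w) <= tol)))

# Schur's determinant formula, with p = 2.
M = np.array([[4.0, 1.0, 2.0], [1.0, 3.0, 0.0], [2.0, 0.0, 5.0]])
det_direct = np.linalg.det(M)
det_via_D = np.linalg.det(M[2:, 2:]) * np.linalg.det(schur_of_trailing(M, 2))
det_via_A = np.linalg.det(M[:2, :2]) * np.linalg.det(schur_of_leading(M, 2))

# Guttman rank additivity: R is singular (row 2 is twice row 1), D = [[3]] is invertible.
R = np.array([[1.0, 2.0, 1.0], [2.0, 4.0, 2.0], [1.0, 2.0, 3.0]])
rank_direct = np.linalg.matrix_rank(R)
rank_split = np.linalg.matrix_rank(R[2:, 2:]) + np.linalg.matrix_rank(schur_of_trailing(R, 2))

# Haynsworth inertia additivity for a symmetric indefinite H with A = [[2]].
H = np.array([[2.0, 1.0, 0.0], [1.0, -1.0, 1.0], [0.0, 1.0, 3.0]])
inertia_H = inertia(H)
inertia_sum = tuple(a + s for a, s in zip(inertia(H[:1, :1]), inertia(schur_of_leading(H, 1))))

# Quotient identity with B = T[:2, :2] and C = T[:1, :1]:
# the leading 1 x 1 block of T/C is B/C, so (T/C)/(B/C) should equal T/B.
T = np.array([[4.0, 1.0, 0.0, 1.0],
              [2.0, 5.0, 1.0, 0.0],
              [0.0, 1.0, 6.0, 2.0],
              [1.0, 0.0, 1.0, 7.0]])
T_over_B = schur_of_leading(T, 2)
T_over_C = schur_of_leading(T, 1)
quotient = schur_of_leading(T_over_C, 1)
```

With these inputs the three determinant values agree, `rank_direct` equals `rank_split`, `inertia_H` equals `inertia_sum`, and `quotient` agrees with `T_over_B` up to rounding.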
Application to solving linear equations
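The Schur complement arises naturally in solving a system of linear equations such as
:<math>\begin{bmatrix} A & B \\ C & D \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} u \\ v \end{bmatrix}.</math>
If D is invertible, eliminating y through <math>y = D^{-1}(v - Cx)</math> leaves the reduced system <math>(A - BD^{-1}C)x = u - BD^{-1}v</math>, whose coefficient matrix is the Schur complement M/D.
The Schur complement also appears in probability and statistics. Suppose the random column vectors X and Y are jointly Gaussian, and the covariance matrix of the stacked vector (X, Y) is the symmetric positive-definite matrix
:<math>\Sigma = \begin{bmatrix} A & B \\ B^\mathrm{T} & C \end{bmatrix}.</math>
Then the conditional covariance and conditional mean of X given Y are
:<math>\begin{align}
\operatorname{Cov}(X \mid Y) &= A - BC^{-1}B^\mathrm{T} \\
\operatorname{E}(X \mid Y) &= \operatorname{E}(X) + BC^{-1}(Y - \operatorname{E}(Y))
\end{align}</math>
so the conditional covariance is the Schur complement of C in <math>\Sigma</math>.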
If we take the matrix <math>\Sigma</math> above to be, not a covariance of a random vector, but a sample covariance, then it may have a Wishart distribution. In that case, the Schur complement of C in <math>\Sigma</math> also has a Wishart distribution.
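The conditional covariance and mean formulas can be illustrated numerically. The sketch below assumes that X and Y are jointly Gaussian with covariance matrix Σ = [[A, B], [Bᵀ, C]]; the numbers, the means, and the observed value `y_obs` are hypothetical. As a cross-check it compares the result with the top-left block of the inverse covariance, which the block inverse formula identifies with the inverse of the Schur complement.

```python
import numpy as np

# Hypothetical covariance of the stacked vector (X, Y), with X in R^2 and Y in R^1.
Sigma = np.array([[4.0, 1.0, 2.0],
                  [1.0, 3.0, 0.0],
                  [2.0, 0.0, 5.0]])
p = 2
A, B, C = Sigma[:p, :p], Sigma[:p, p:], Sigma[p:, p:]

mean_x = np.array([0.0, 0.0])   # assumed E(X)
mean_y = np.array([1.0])        # assumed E(Y)
y_obs = np.array([3.5])         # assumed observed value of Y

# B C^{-1}, computed through a solve; the transpose trick relies on C being symmetric.
gain = np.linalg.solve(C, B.T).T

cond_cov = A - gain @ B.T                      # Schur complement of C in Sigma
cond_mean = mean_x + gain @ (y_obs - mean_y)   # E(X | Y = y_obs)

# Cross-check: the X-block of the precision matrix is the inverse of cond_cov.
precision_block = np.linalg.inv(Sigma)[:p, :p]
consistent = np.allclose(precision_block, np.linalg.inv(cond_cov))
```

Note that `cond_cov` does not depend on the observed value, only on the blocks of Σ, while `cond_mean` shifts linearly with `y_obs`.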
Conditions for positive definiteness and semi-definiteness
Let X be a symmetric matrix of real numbers given by
<math display="block">X = \left[\begin{matrix} A & B \\ B^\mathrm{T} & C\end{matrix}\right].</math>
Then by the Haynsworth inertia additivity formula, we find
- If A is invertible, then X is positive definite if and only if A and its complement X/A are both positive definite:
:<math display="block">X \succ 0 \Leftrightarrow A \succ 0, X/A = C - B^\mathrm{T} A^{-1} B \succ 0.</math>
- If C is invertible, then X is positive definite if and only if C and its complement X/C are both positive definite:
:<math display="block">X \succ 0 \Leftrightarrow C \succ 0, X/C = A - B C^{-1} B^\mathrm{T} \succ 0.</math>
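- If A is positive definite, then X is positive semi-definite if and only if the complement X/A is positive semi-definite:
:<math display="block">\text{If } A \succ 0, \text{ then } X \succeq 0 \Leftrightarrow X/A = C - B^\mathrm{T} A^{-1} B \succeq 0.</math>
- If C is positive definite, then X is positive semi-definite if and only if the complement X/C is positive semi-definite:
:<math display="block">\text{If } C \succ 0, \text{ then } X \succeq 0 \Leftrightarrow X/C = A - B C^{-1} B^\mathrm{T} \succeq 0.</math>
The first and third statements can be derived by considering the minimizer of the quantity
<math display="block">u^\mathrm{T} A u + 2 v^\mathrm{T} B^\mathrm{T} u + v^\mathrm{T} C v, \,</math>
as a function of u (for fixed v). When A is positive definite, the minimum is attained at <math>u = -A^{-1}Bv</math> and equals <math>v^\mathrm{T}(X/A)v</math>.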
Furthermore, since
<math display="block">
\left[\begin{matrix} A & B \\ B^\mathrm{T} & C \end{matrix}\right] \succ 0
\Longleftrightarrow \left[\begin{matrix} C & B^\mathrm{T} \\ B & A \end{matrix}\right] \succ 0
</math>
and similarly for positive semi-definite matrices, the second (respectively fourth) statement is immediate from the first (resp. third) statement.
There is also a sufficient and necessary condition for the positive semi-definiteness of X in terms of a generalized Schur complement.
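The definiteness criteria in this section can be checked on small examples. The sketch below is illustrative only: the matrices, the helper names, and the eigenvalue tolerance are assumptions. It compares a direct eigenvalue test on X with the test on A and X/A, and verifies that minimizing the quadratic form over u for a fixed v gives vᵀ(X/A)v.

```python
import numpy as np

def min_eig(H):
    """Smallest eigenvalue of a symmetric matrix (eigvalsh returns ascending order)."""
    return np.linalg.eigvalsh(H)[0]

def is_pd(H, tol=1e-10):
    return bool(min_eig(H) > tol)

def is_psd(H, tol=1e-10):
    return bool(min_eig(H) >= -tol)

def complement_of_A(X, p):
    """X/A = C - B^T A^{-1} B for symmetric X with leading block A (assumed invertible)."""
    A, B, C = X[:p, :p], X[:p, p:], X[p:, p:]
    return C - B.T @ np.linalg.solve(A, B)

# Positive definite example, p = 2.
X_pd = np.array([[4.0, 1.0, 2.0], [1.0, 3.0, 0.0], [2.0, 0.0, 5.0]])
pd_direct = is_pd(X_pd)
pd_schur = is_pd(X_pd[:2, :2]) and is_pd(complement_of_A(X_pd, 2))

# Indefinite example, p = 1: A = [[2]] is positive definite but X/A is not.
X_indef = np.array([[2.0, 1.0, 0.0], [1.0, -1.0, 1.0], [0.0, 1.0, 3.0]])
indef_direct = is_pd(X_indef)
indef_schur = is_pd(X_indef[:1, :1]) and is_pd(complement_of_A(X_indef, 1))

# Semi-definite example, p = 1: A = [[1]] is positive definite and X/A = [[0]].
X_psd = np.array([[1.0, 1.0], [1.0, 1.0]])
psd_direct = is_psd(X_psd)
psd_schur = is_psd(complement_of_A(X_psd, 1))

# Minimizing u^T A u + 2 v^T B^T u + v^T C v over u for a fixed v.
A, B, C = X_pd[:2, :2], X_pd[:2, 2:], X_pd[2:, 2:]
v = np.array([1.0])
u_star = -np.linalg.solve(A, B @ v)           # stationary point of the quadratic in u
min_value = u_star @ A @ u_star + 2.0 * (v @ B.T @ u_star) + v @ C @ v
bound = v @ complement_of_A(X_pd, 2) @ v      # v^T (X/A) v
```

In each pair the direct test and the Schur complement test agree, and `min_value` matches `bound` up to rounding.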
