Min-max theorem

In linear algebra and functional analysis, the min-max theorem, or variational theorem, or Courant–Fischer–Weyl min-max principle, is a result that gives a variational characterization of eigenvalues of compact Hermitian operators on Hilbert spaces. It can be viewed as the starting point of many results of similar nature.

This article first discusses the finite-dimensional case and its applications before considering compact operators on infinite-dimensional Hilbert spaces.

We will see that for compact operators, the proof of the main theorem uses essentially the same idea from the finite-dimensional argument.

In the case that the operator is non-Hermitian, the theorem provides an equivalent characterization of the associated singular values.

The min-max theorem can be extended to self-adjoint operators that are bounded below.

Matrices

Let <math>A</math> be an <math>n \times n</math> Hermitian matrix. As with many other variational results on eigenvalues, one considers the Rayleigh–Ritz quotient, defined for nonzero <math>x \in \mathbb{C}^n</math> by

:<math>R_A(x) = \frac{(Ax, x)}{(x,x)}</math>

where <math>(\cdot, \cdot)</math> denotes the Euclidean inner product on <math>\mathbb{C}^n</math>.

The Rayleigh quotient of an eigenvector <math>v</math> is its associated eigenvalue <math>\lambda</math>, because <math>R_A(v) = (\lambda v, v)/(v, v) = \lambda</math>.

For a Hermitian matrix A, the range of the continuous function R<sub>A</sub>(x) is a compact interval [a, b] of the real line. The maximum b and the minimum a are the largest and smallest eigenvalues of A, respectively. The min-max theorem is a refinement of this fact.
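The two facts above (an eigenvector's Rayleigh quotient is its eigenvalue, and every Rayleigh quotient lies between the extreme eigenvalues) can be checked numerically. The following is a minimal sketch using NumPy; the helper name `rayleigh_quotient`, the random test matrix, and the sample count are illustrative assumptions rather than part of the theorem.

```python
import numpy as np

def rayleigh_quotient(A, x):
    """Return R_A(x) = (Ax, x) / (x, x) for a nonzero vector x."""
    # np.vdot conjugates its first argument, so vdot(x, A @ x) = x^* A x,
    # which is real when A is Hermitian; .real drops rounding noise.
    return (np.vdot(x, A @ x) / np.vdot(x, x)).real

rng = np.random.default_rng(0)
n = 5
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (B + B.conj().T) / 2  # Hermitian by construction (assumed test matrix)

# eigh returns the eigenvalues of a Hermitian matrix in ascending order.
eigvals, eigvecs = np.linalg.eigh(A)

# Rayleigh quotients of random complex vectors.
samples = [
    rayleigh_quotient(A, rng.standard_normal(n) + 1j * rng.standard_normal(n))
    for _ in range(1000)
]
lo, hi = min(samples), max(samples)

# Rayleigh quotient of the eigenvector belonging to the smallest eigenvalue.
r_eig = rayleigh_quotient(A, eigvecs[:, 0])

print("smallest eigenvalue:", eigvals[0], " sampled minimum:", lo)
print("largest eigenvalue: ", eigvals[-1], " sampled maximum:", hi)
```

Random sampling only explores the interior of the interval [a, b]; the endpoints are attained exactly at the corresponding eigenvectors.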

Min-max theorem

Let <math display="inline">A</math> be a Hermitian operator on an inner product space <math display="inline">V</math> of dimension <math display="inline">n</math>, with spectrum in descending order <math display="inline">\lambda_1 \geq \cdots \geq \lambda_n</math>.

Let <math display="inline">v_1, \dots, v_n</math> be corresponding orthonormal eigenvectors.

Reverse the spectrum ordering, so that <math display="inline">\xi_1 = \lambda_n, ..., \xi_n = \lambda_1</math>.

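For every <math display="inline">k \in \{1, \dots, n\}</math>,

<math display="block">\begin{aligned}
\lambda_k &=\max _{\begin{array}{c} \mathcal{M} \subset V \\ \operatorname{dim}(\mathcal{M})=k \end{array}} \min _{\begin{array}{c} x \in \mathcal{M} \\ \|x\|=1 \end{array}}\langle x, A x\rangle\\
&=\min _{\begin{array}{c} \mathcal{M} \subset V \\ \operatorname{dim}(\mathcal{M})=n-k+1 \end{array}} \max _{\begin{array}{c} x \in \mathcal{M} \\ \|x\|=1 \end{array}}\langle x, A x\rangle \text{. }
\end{aligned}</math>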
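A numerical illustration of the min-max characterization may help. The sketch below assumes a random real symmetric matrix and uses the fact that, for a subspace with orthonormal basis matrix <math>Q</math>, the extreme values of <math>\langle x, Ax\rangle</math> over unit vectors in the subspace are the extreme eigenvalues of <math>Q^* A Q</math>. The helper names (`min_on_subspace`, `max_on_subspace`, `random_subspace`) are hypothetical.

```python
import numpy as np

def min_on_subspace(A, Q):
    """Minimum of <x, Ax> over unit x in span(Q); Q has orthonormal columns."""
    return np.linalg.eigvalsh(Q.conj().T @ A @ Q)[0]

def max_on_subspace(A, Q):
    """Maximum of <x, Ax> over unit x in span(Q); Q has orthonormal columns."""
    return np.linalg.eigvalsh(Q.conj().T @ A @ Q)[-1]

def random_subspace(rng, n, dim):
    """Orthonormal basis (as columns) of a random dim-dimensional subspace of R^n."""
    Q, _ = np.linalg.qr(rng.standard_normal((n, dim)))
    return Q

rng = np.random.default_rng(1)
n, k = 6, 3
B = rng.standard_normal((n, n))
A = (B + B.T) / 2  # real symmetric test matrix (an assumption of this sketch)

w, V = np.linalg.eigh(A)        # ascending order
lam, V = w[::-1], V[:, ::-1]    # descending: lam[0] >= ... >= lam[n-1]
lam_k = lam[k - 1]

# The optimal subspaces are spanned by eigenvectors.
max_min_opt = min_on_subspace(A, V[:, :k])       # span(v_1, ..., v_k)
min_max_opt = max_on_subspace(A, V[:, k - 1:])   # span(v_k, ..., v_n), dim n-k+1

# Random subspaces can only do worse than the optimal ones.
max_min_trials = [min_on_subspace(A, random_subspace(rng, n, k))
                  for _ in range(200)]
min_max_trials = [max_on_subspace(A, random_subspace(rng, n, n - k + 1))
                  for _ in range(200)]

print("lambda_k:", lam_k)
print("best random max-min value:", max(max_min_trials))
print("best random min-max value:", min(min_max_trials))
```

Every random <math>k</math>-dimensional subspace yields a minimum no larger than <math>\lambda_k</math>, and every random <math>(n-k+1)</math>-dimensional subspace yields a maximum no smaller than <math>\lambda_k</math>, as the theorem predicts.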

Define the partial trace <math display="inline">tr_V(A)</math> to be the trace of the projection of <math display="inline">A</math> onto <math display="inline">V</math>. It is equal to <math display="inline">\sum_i v_i^*Av_i</math> for any orthonormal basis <math display="inline">(v_i)</math> of <math display="inline">V</math>.
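As a sanity check on this definition, the sketch below computes the partial trace from two different orthonormal bases of the same subspace and from the projector formula <math display="inline">\operatorname{tr}(PA)</math> with <math display="inline">P = QQ^*</math>. The function name `partial_trace` and the random test data are assumptions made for illustration.

```python
import numpy as np

def partial_trace(A, Q):
    """tr_V(A) = sum_i v_i^* A v_i, where the columns v_i of Q
    form an orthonormal basis of the subspace V."""
    return np.trace(Q.conj().T @ A @ Q).real

rng = np.random.default_rng(2)
n, d = 6, 3
B = rng.standard_normal((n, n))
A = (B + B.T) / 2  # real symmetric test matrix (assumed)

# One orthonormal basis of a random d-dimensional subspace V.
Q, _ = np.linalg.qr(rng.standard_normal((n, d)))

# A second orthonormal basis of the same V, obtained by an
# orthogonal change of coordinates inside V.
U, _ = np.linalg.qr(rng.standard_normal((d, d)))
Q2 = Q @ U

t1 = partial_trace(A, Q)
t2 = partial_trace(A, Q2)

# Equivalent formula via the orthogonal projector P = Q Q^* onto V.
P = Q @ Q.T
t3 = np.trace(P @ A)

print(t1, t2, t3)  # the three values agree up to rounding
```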

This still exists. Etc. Since <math display="inline">\dim(E) \leq n-1</math>, the induction hypothesis applies: there exists some <math display="inline">W \in X(W_1, \dots, W_k)</math> such that <math display="block">\lambda_{i_1 - (i_1-1)}(A|E)+\cdots+\lambda_{i_k- (i_1-1)}(A|E) \geq tr_W(A)</math>

Here <math display="inline">\lambda_{i_j - (i_1-1)}(A|E)</math> is the <math display="inline">(i_j-(i_1-1))</math>-th eigenvalue of <math display="inline">A</math> orthogonally projected down to <math display="inline">E</math>. By the Cauchy interlacing theorem, <math display="inline">\lambda_{i_j - (i_1-1)}(A|E) \leq \lambda_{i_j}(A)</math>. Since <math display="inline">X(W_1, \dots, W_k)\subset X(V_1, \dots, V_k)</math>, we are done.

If <math display="inline">i_1 = 1</math>, then we perform a similar construction. Let <math display="inline">E = \operatorname{span}(e_{2}, \dots, e_n)</math>. If <math display="inline">V_k \subset E</math>, then we can induct. Otherwise, we construct a partial flag sequence <math display="inline">W_2, \dots, W_k</math>. By induction, there exists some <math display="inline">W' \in X(W_2, \dots, W_k)\subset X(V_2, \dots, V_k)</math> such that <math display="block">\lambda_{i_2-1}(A|E)+\cdots+\lambda_{i_k-1}(A|E) \geq tr_{W'}(A)</math> and thus

<math display="block">\lambda_{i_2}(A)+\cdots+\lambda_{i_k}(A) \geq tr_{W'}(A)</math> It remains to find some <math display="inline">v</math> such that <math display="inline">W' \oplus v \in X(V_1, \dots, V_k)</math>.

If <math display="inline">V_1 \not\subset W'</math>, then any <math display="inline">v \in V_1 \setminus W'</math> works. Otherwise, if <math display="inline">V_2 \not\subset W'</math>, then any <math display="inline">v \in V_2 \setminus W'</math> works, and so on. If none of these work, then <math display="inline">V_k \subset E</math>, which is a contradiction.

This has some corollaries: Recall that the essential spectrum of a self-adjoint operator is its spectrum with the isolated eigenvalues of finite multiplicity removed.

Sometimes there are eigenvalues below the essential spectrum, and we would like to approximate these eigenvalues and the corresponding eigenfunctions.

:Theorem (Min-Max). Let A be self-adjoint, and let <math>E_1\le E_2\le E_3\le\cdots</math> be the eigenvalues of A below the essential spectrum. Then

<math>E_n=\min_{\psi_1,\ldots,\psi_{n}}\max\{\langle\psi,A\psi\rangle:\psi\in\operatorname{span}(\psi_1,\ldots,\psi_{n}), \, \| \psi \| = 1\}</math>.

If we only have N eigenvalues and hence run out of eigenvalues, then we let <math>E_n:=\inf\sigma_{ess}(A)</math> (the bottom of the essential spectrum) for n>N, and the above statement holds after replacing min-max with inf-sup.

:Theorem (Max-Min). Let A be self-adjoint, and let <math>E_1\le E_2\le E_3\le\cdots</math> be the eigenvalues of A below the essential spectrum. Then

<math>E_n=\max_{\psi_1,\ldots,\psi_{n-1}}\min\{\langle\psi,A\psi\rangle:\psi\perp\psi_1,\ldots,\psi_{n-1}, \, \| \psi \| = 1\}</math>.

If we only have N eigenvalues and hence run out of eigenvalues, then we let <math>E_n:=\inf\sigma_{ess}(A)</math> (the bottom of the essential spectrum) for n > N, and the above statement holds after replacing max-min with sup-inf.
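The two statements give computable bounds: any choice of trial vectors yields an upper bound on <math>E_n</math> through the min-max form and a lower bound through the max-min form. The sketch below illustrates this with a finite symmetric matrix as a stand-in for the operator; this is an assumption of the example, since a finite matrix has no essential spectrum and every eigenvalue then qualifies. The helper `orthonormal_columns` and the random trial vectors are hypothetical.

```python
import numpy as np

def orthonormal_columns(rng, rows, cols):
    """Random matrix with orthonormal columns (via reduced QR)."""
    Q, _ = np.linalg.qr(rng.standard_normal((rows, cols)))
    return Q

rng = np.random.default_rng(3)
m, n = 8, 3                       # matrix size and index of the eigenvalue E_n
B = rng.standard_normal((m, m))
A = (B + B.T) / 2                 # symmetric stand-in for the operator (assumed)
E = np.linalg.eigvalsh(A)         # ascending: E[0] <= E[1] <= ...
E_n = E[n - 1]

# Min-max form: for trial vectors psi_1..psi_n, the maximum of <psi, A psi>
# over unit vectors in their span is an upper bound for E_n.
upper = []
for _ in range(200):
    Q = orthonormal_columns(rng, m, n)
    upper.append(np.linalg.eigvalsh(Q.T @ A @ Q)[-1])

# Max-min form: for constraint vectors psi_1..psi_{n-1}, the minimum of
# <psi, A psi> over unit vectors orthogonal to them is a lower bound for E_n.
lower = []
for _ in range(200):
    C = orthonormal_columns(rng, m, n - 1)
    # Complete C to an orthonormal basis of R^m; the trailing columns
    # span the orthogonal complement of span(C).
    full, _ = np.linalg.qr(np.hstack([C, rng.standard_normal((m, m - n + 1))]))
    Q_perp = full[:, n - 1:]
    lower.append(np.linalg.eigvalsh(Q_perp.T @ A @ Q_perp)[0])

print("E_n:", E_n)
print("tightest upper bound found:", min(upper))
print("tightest lower bound found:", max(lower))
```

The upper bounds are the Rayleigh–Ritz values familiar from variational approximations. They become equalities when the trial vectors are the first <math>n</math> eigenvectors, and the lower bounds become equalities when the constraint vectors are the first <math>n-1</math> eigenvectors.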

The proofs