Riesz representation theorem

The Riesz representation theorem, sometimes called the Riesz–Fréchet representation theorem after Frigyes Riesz and Maurice René Fréchet, establishes an important connection between a Hilbert space and its continuous dual space. If the underlying field is the real numbers, the two are isometrically isomorphic; if the underlying field is the complex numbers, the two are isometrically anti-isomorphic. The (anti-) isomorphism is a particular natural isomorphism.

Preliminaries and notation

Let <math>H</math> be a Hilbert space over a field <math>\mathbb{F},</math> where <math>\mathbb{F}</math> is either the real numbers <math>\R</math> or the complex numbers <math>\Complex.</math> If <math>\mathbb{F} = \Complex</math> (resp. if <math>\mathbb{F} = \R</math>) then <math>H</math> is called a complex Hilbert space (resp. a real Hilbert space). Every real Hilbert space can be extended to be a dense subset of a unique (up to bijective isometry) complex Hilbert space, called its complexification, which is why Hilbert spaces are often automatically assumed to be complex. Real and complex Hilbert spaces have many, but by no means all, properties and results in common.

This article is intended for both mathematicians and physicists and will describe the theorem for both.

In both mathematics and physics, if a Hilbert space is assumed to be real (that is, if <math>\mathbb{F} = \R</math>) then this will usually be made clear. Often in mathematics, and especially in physics, unless indicated otherwise, "Hilbert space" is automatically assumed to mean "complex Hilbert space." Depending on the author, in mathematics, "Hilbert space" usually means either (1) a complex Hilbert space, or (2) a real or complex Hilbert space.

Linear and antilinear maps

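By definition, an antilinear map (also called a conjugate-linear map) <math>f : H \to Y</math> is a map between vector spaces that is additive:

<math display="block">f(x + y) = f(x) + f(y) \quad \text{ for all } x, y \in H,</math>

and antilinear (also called conjugate-linear or conjugate-homogeneous):

<math display="block">f(c x) = \overline{c} f(x) \quad \text{ for all } x \in H \text{ and all scalars } c \in \mathbb{F},</math>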

where <math>\overline{c}</math> is the conjugate of the complex number <math>c = a + b i</math>, given by <math>\overline{c} = a - b i</math>.

In contrast, a map <math>f : H \to Y</math> is linear if it is additive and homogeneous:

<math display=block>f(c x) = c f(x) \quad \text{ for all } x \in H \quad \text{ and all scalars } c \in \mathbb{F}.</math>

The constant <math>0</math> map is always both linear and antilinear. If <math>\mathbb{F} = \R</math> then the definitions of linear maps and antilinear maps are completely identical. A linear map from a Hilbert space into a Banach space (or more generally, from any Banach space into any topological vector space) is continuous if and only if it is bounded; the same is true of antilinear maps. The inverse of any antilinear (resp. linear) bijection is again an antilinear (resp. linear) bijection. The composition of two antilinear maps is a linear map.
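The scalar rules above can be checked numerically. The following is a minimal sketch on the one-dimensional complex Hilbert space <math>\Complex</math>; the maps <code>f</code>, <code>g</code> and the constants <code>a</code>, <code>b</code> are illustrative choices, not objects from this article.

```python
# Toy antilinear maps on the one-dimensional complex Hilbert space C.
# The names f, g, a, b are hypothetical and chosen only for illustration.

a = complex(1, 2)
b = complex(3, -1)


def f(x: complex) -> complex:
    """Antilinear map: f(c * x) == conj(c) * f(x)."""
    return a * x.conjugate()


def g(x: complex) -> complex:
    """A second antilinear map."""
    return b * x.conjugate()


def g_after_f(x: complex) -> complex:
    """Composition g(f(x)) = b * conj(a) * x, which is linear in x."""
    return g(f(x))


c = complex(0, 1)  # a non-real scalar
x = complex(2, 5)

# f pulls scalars out conjugated ...
assert f(c * x) == c.conjugate() * f(x)
# ... while the composition of the two antilinear maps pulls them out unchanged.
assert g_after_f(c * x) == c * g_after_f(x)
```

Small integer components are used so that the floating-point arithmetic is exact and the equalities can be tested with <code>==</code>.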

Continuous dual and anti-dual spaces

A functional on <math>H</math> is a function <math>H \to \mathbb{F}</math> whose codomain is the underlying scalar field <math>\mathbb{F}.</math>

Denote by <math>H^*</math> (resp. by <math>\overline{H}^*)</math> the set of all continuous linear (resp. continuous antilinear) functionals on <math>H,</math> which is called the (continuous) dual space (resp. the (continuous) anti-dual space) of <math>H.</math>

If <math>\mathbb{F} = \R</math> then linear functionals on <math>H</math> are the same as antilinear functionals and consequently, the same is true for such continuous maps: that is, <math>H^* = \overline{H}^*.</math>

One-to-one correspondence between linear and antilinear functionals

Given any functional <math>f ~:~ H \to \mathbb{F},</math> the conjugate of <math>f</math> is the functional

<math display=block>\begin{alignat}{4}
\overline{f} : \,& H && \to \,&& \mathbb{F} \\
& h && \mapsto\,&& \overline{f(h)}. \\
\end{alignat}</math>

This assignment is most useful when <math>\mathbb{F} = \Complex</math> because if <math>\mathbb{F} = \R</math> then <math>f = \overline{f}</math> and the assignment <math>f \mapsto \overline{f}</math> reduces down to the identity map.

The assignment <math>f \mapsto \overline{f}</math> defines an antilinear bijective correspondence from the set of

:all functionals (resp. all linear functionals, all continuous linear functionals <math>H^*</math>) on <math>H,</math>

onto the set of

:all functionals (resp. all antilinear functionals, all continuous antilinear functionals <math>\overline{H}^*</math>) on <math>H.</math>

Mathematics vs. physics notations and definitions of inner product

The Hilbert space <math>H</math> has an associated inner product <math>H \times H \to \mathbb{F}</math> valued in <math>H</math>'s underlying scalar field <math>\mathbb{F}</math> that is linear in one coordinate and antilinear in the other (as specified below).

If <math>H</math> is a complex Hilbert space (<math>\mathbb{F} = \Complex</math>), then there is a crucial difference between the notations prevailing in mathematics versus physics, regarding which of the two variables is linear.

However, for real Hilbert spaces (<math>\mathbb{F} = \R</math>), the inner product is a symmetric map that is linear in each coordinate (bilinear), so there can be no such confusion.

In mathematics, the inner product on a Hilbert space <math>H</math> is often denoted by <math>\left\langle \cdot\,, \cdot \right\rangle</math> or <math>\left\langle \cdot\,, \cdot \right\rangle_H</math> while in physics, the bra–ket notation <math>\left\langle \cdot \mid \cdot \right\rangle</math> or <math>\left\langle \cdot \mid \cdot \right\rangle_H</math> is typically used. In this article, these two notations will be related by the equality:

<math display="block">\left\langle x, y \right\rangle := \left\langle y \mid x \right\rangle \quad \text{ for all } x, y \in H.</math>

These have the following properties:<ol>

<li>The map <math>\left\langle \cdot\,, \cdot \right\rangle</math> is linear in its first coordinate; equivalently, the map <math>\left\langle \cdot \mid \cdot \right\rangle</math> is linear in its second coordinate. That is, for fixed <math>y \in H,</math> the map

<math>\left\langle \,y\mid \cdot\, \right\rangle = \left\langle \,\cdot\,, y\, \right\rangle : H \to \mathbb{F}</math> with <math display="inline">h \mapsto \left\langle \,y\mid h\, \right\rangle = \left\langle \,h, y\, \right\rangle </math>

is a linear functional on <math>H.</math> This linear functional is continuous, so <math>\left\langle \,y\mid\cdot\, \right\rangle = \left\langle \,\cdot, y\, \right\rangle \in H^*.</math>

</li>

<li>The map <math>\left\langle \cdot\,, \cdot \right\rangle</math> is antilinear in its second coordinate; equivalently, the map <math>\left\langle \cdot \mid \cdot \right\rangle</math> is antilinear in its first coordinate. That is, for fixed <math>y \in H,</math> the map

<math>\left\langle \,\cdot\mid y\, \right\rangle = \left\langle \,y, \cdot\, \right\rangle : H \to \mathbb{F}</math> with <math display="inline">h \mapsto \left\langle \,h\mid y\, \right\rangle = \left\langle \,y, h\, \right\rangle </math>

is an antilinear functional on <math>H.</math> This antilinear functional is continuous, so <math>\left\langle \,\cdot\mid y\, \right\rangle = \left\langle \,y, \cdot\, \right\rangle \in \overline{H}^*.</math>

</li>

</ol>

In computations, one must consistently use either the mathematics notation <math>\left\langle \cdot\,, \cdot \right\rangle</math>, which is (linear, antilinear); or the physics notation <math>\left\langle \cdot \mid \cdot \right\rangle</math>, which is (antilinear | linear).
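As a minimal sketch of the two conventions, assume <math>H = \Complex^n</math> with its standard inner product. The function names <code>inner_math</code> and <code>braket</code> are illustrative and do not come from this article.

```python
# C^n with its standard inner product, written in both conventions.
# inner_math and braket are hypothetical helper names.


def inner_math(x, y):
    """<x, y>: linear in x (first slot), antilinear in y (second slot)."""
    return sum(xi * yi.conjugate() for xi, yi in zip(x, y))


def braket(y, x):
    """<y | x>: antilinear in y (first slot), linear in x (second slot)."""
    return sum(yi.conjugate() * xi for yi, xi in zip(y, x))


x = [complex(1, 1), complex(0, 2)]
y = [complex(2, -1), complex(3, 0)]
c = complex(0, 1)

cx = [c * xi for xi in x]
cy = [c * yi for yi in y]

# The two notations are related by <x, y> = <y | x>.
assert inner_math(x, y) == braket(y, x)

# Mathematics notation: (linear, antilinear).
assert inner_math(cx, y) == c * inner_math(x, y)
assert inner_math(x, cy) == c.conjugate() * inner_math(x, y)

# Physics notation: (antilinear | linear).
assert braket(cy, x) == c.conjugate() * braket(y, x)
assert braket(y, cx) == c * braket(y, x)
```

Mixing the two helpers in one computation without swapping the arguments silently conjugates the result, which is the inconsistency the preceding paragraph warns about.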

Canonical norm and inner product on the dual space and anti-dual space

If <math>x = y</math> then <math>\langle \,x\mid x\, \rangle = \langle \,x, x\, \rangle</math> is a non-negative real number and the map

<math display=block>\|x\| := \sqrt{\langle x, x \rangle} = \sqrt{\langle x \mid x \rangle}</math>

defines a canonical norm on <math>H</math> that makes <math>H</math> into a normed space.

As with all normed spaces, the (continuous) dual space <math>H^*</math> carries a canonical norm, called the dual norm, that is defined by

<math display=block>\|f\|_{H^*} ~:=~ \sup_{\|x\| \leq 1, x \in H} |f(x)| \quad \text{ for every } f \in H^*.</math>

The canonical norm on the (continuous) anti-dual space <math>\overline{H}^*,</math> denoted by <math>\|f\|_{\overline{H}^*},</math> is defined by using this same equation:

<math display=block>\|f\|_{\overline{H}^*} ~:=~ \sup_{\|x\| \leq 1, x \in H} |f(x)| \quad \text{ for every } f \in \overline{H}^*.</math>

This canonical norm on <math>H^*</math> satisfies the parallelogram law, which means that the polarization identity can be used to define a canonical inner product on <math>H^*,</math> which this article will denote by the notations

<math display=block>\left\langle f, g \right\rangle_{H^*} := \left\langle g \mid f \right\rangle_{H^*},</math>

where this inner product turns <math>H^*</math> into a Hilbert space. There are now two ways of defining a norm on <math>H^*:</math> the norm induced by this inner product (that is, the norm defined by <math>f \mapsto \sqrt{\left\langle f, f \right\rangle_{H^*}}</math>) and the usual dual norm (defined as the supremum over the closed unit ball). These norms are the same; explicitly, this means that the following holds for every <math>f \in H^*:</math>

<math display="block">\sup_{\|x\| \leq 1, x \in H} |f(x)| = \|f\|_{H^*} ~=~ \sqrt{\langle f, f \rangle_{H^*}} ~=~ \sqrt{\langle f \mid f \rangle_{H^*}}.</math>

As will be described later, the Riesz representation theorem can be used to give an equivalent definition of the canonical norm and the canonical inner product on <math>H^*.</math>

The same equations that were used above can also be used to define a norm and inner product on <math>H</math>'s anti-dual space <math>\overline{H}^*.</math>
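The supremum defining the dual norm can be illustrated numerically. The sketch below assumes <math>H = \Complex^2</math> and a functional <math>f(x) = c_1 x_1 + c_2 x_2</math> with hypothetical coefficients; it compares a sampled estimate of the supremum over unit vectors with the value of <math>|f|</math> at the unit vector <math>\overline{c} / \|c\|,</math> where the supremum is attained in this finite-dimensional case.

```python
import math
import random

# Hypothetical coefficients of a linear functional on C^2: f(x) = sum_i c_i * x_i.
coeffs = [complex(3, 0), complex(0, 4)]


def f(x):
    return sum(ci * xi for ci, xi in zip(coeffs, x))


def norm(x):
    """Canonical norm on C^n."""
    return math.sqrt(sum(abs(xi) ** 2 for xi in x))


def random_unit_vector(rng, dim):
    v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dim)]
    n = norm(v)
    return [vi / n for vi in v]


# Estimate of the supremum of |f(x)| over sampled unit vectors (a lower bound).
rng = random.Random(0)
sampled_sup = max(abs(f(random_unit_vector(rng, 2))) for _ in range(2000))

# Unit vector at which |f| is maximal in this example: conj(coeffs) / ||coeffs||.
maximizer = [ci.conjugate() / norm(coeffs) for ci in coeffs]
dual_norm = abs(f(maximizer))

print(sampled_sup, dual_norm)
```

The sampled value never exceeds <code>dual_norm</code> (up to rounding), in line with the Cauchy–Schwarz inequality, and here <code>dual_norm</code> equals the Euclidean norm of the coefficient vector.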

Canonical isometry between the dual and antidual

The complex conjugate <math>\overline{f}</math> of a functional <math>f,</math> which was defined above, satisfies

<math display=block>\|f\|_{H^*} ~=~ \left\|\overline{f}\right\|_{\overline{H}^*} \quad \text{ and } \quad \left\|\overline{g}\right\|_{H^*} ~=~ \|g\|_{\overline{H}^*}</math>

for every <math>f \in H^*</math> and every <math>g \in \overline{H}^*.</math>

This says exactly that the canonical antilinear bijection defined by

<math display=block>\begin{alignat}{4}
\operatorname{Cong} :\;&& H^* &&\;\to \;& \overline{H}^* \\[0.3ex]
&& f &&\;\mapsto\;& \overline{f} \\
\end{alignat}</math>

as well as its inverse <math>\operatorname{Cong}^{-1} ~:~ \overline{H}^* \to H^*</math> are antilinear isometries and consequently also homeomorphisms.

The inner products on the dual space <math>H^*</math> and the anti-dual space <math>\overline{H}^*,</math> denoted respectively by <math>\langle \,\cdot\,, \,\cdot\, \rangle_{H^*}</math> and <math>\langle \,\cdot\,, \,\cdot\, \rangle_{\overline{H}^*},</math> are related by

<math display=block>\langle \,\overline{f}\, | \,\overline{g}\, \rangle_{\overline{H}^*} = \overline{\langle \,f\, | \,g\, \rangle_{H^*}} = \langle \,g\, | \,f\, \rangle_{H^*} \qquad \text{ for all } f, g \in H^*</math>

and

<math display=block>\langle \,\overline{f}\, | \,\overline{g}\, \rangle_{H^*} = \overline{\langle \,f\, | \,g\, \rangle_{\overline{H}^*}} = \langle \,g\, | \,f\, \rangle_{\overline{H}^*} \qquad \text{ for all } f, g \in \overline{H}^*.</math>

If <math>\mathbb{F} = \R</math> then <math>H^* = \overline{H}^*</math> and this canonical map <math>\operatorname{Cong} : H^* \to \overline{H}^*</math> reduces down to the identity map.

Riesz representation theorem

Two vectors <math>x</math> and <math>y</math> are orthogonal if <math>\langle x, y \rangle = 0,</math> which happens if and only if <math>\|y\| \leq \|y + s x\|</math> for all scalars <math>s.</math> The orthogonal complement of a subset <math>X \subseteq H</math> is

<math display=block>X^{\bot} := \{ \,y \in H : \langle y, x \rangle = 0 \text{ for all } x \in X\, \},</math>

which is always a closed vector subspace of <math>H.</math>

The Hilbert projection theorem guarantees that for any nonempty closed convex subset <math>C</math> of a Hilbert space there exists a unique vector <math>m \in C</math> such that <math>\|m\| = \inf_{c \in C} \|c\|;</math> that is, <math>m \in C</math> is the (unique) global minimum point of the function <math>C \to [0, \infty)</math> defined by <math>c \mapsto \|c\|.</math>

Statement

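For every continuous linear functional <math>\varphi \in H^*</math> there exists a unique vector <math>f_{\varphi} \in H,</math> called the Riesz representation of <math>\varphi,</math> such that <math>\varphi(x) = \left\langle f_{\varphi} \mid x \right\rangle = \left\langle x, f_{\varphi} \right\rangle</math> for all <math>x \in H;</math> moreover, <math>\left\|f_{\varphi}\right\|_H = \|\varphi\|_{H^*}.</math>

For the existence of such a vector when <math>\varphi \neq 0,</math> let <math>K := \ker \varphi</math> and let <math>p \in K^{\bot}</math> be a non-zero vector. For every <math>h \in H,</math> the vector <math>(\varphi h) p - (\varphi p) h</math> belongs to <math>K,</math> so <math>0 = \left\langle \,p\, | \,(\varphi h) p - (\varphi p) h\, \right\rangle = (\varphi h) \|p\|^2 - (\varphi p) \langle \,p\, | \,h\, \rangle.</math> Solving for <math>\varphi h</math> gives

<math display=block>\varphi h = \frac{\varphi p}{\|p\|^2} \langle \,p\, | \,h\, \rangle = \left\langle \,\frac{\overline{\varphi p}}{\|p\|^2} p\, \Bigg| \,h\, \right\rangle \quad \text{ for every } h \in H,</math>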

which proves that the vector <math>f_{\varphi} := \frac{\overline{\varphi p}}{\|p\|^2} p</math> satisfies

<math>\varphi h = \langle \,f_{\varphi}\, | \,h\, \rangle \text{ for every } h \in H.</math>

Applying the norm formula that was proved above with <math>y := f_{\varphi}</math> shows that <math>\|\varphi\|_{H^*} = \left\|\left\langle \,f_{\varphi}\, | \,\cdot\, \right\rangle\right\|_{H^*} = \left\|f_{\varphi}\right\|_H.</math>

Also, the vector <math>u := \frac{p}{\|p\|}</math> has norm <math>\|u\| = 1</math> and satisfies <math>f_{\varphi} = \overline{\varphi(u)} u.</math>

<math>\blacksquare</math>
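The formula <math>f_{\varphi} = \frac{\overline{\varphi p}}{\|p\|^2} p</math> from the proof can be tried out in finite dimensions. The following is a minimal sketch, assuming <math>H = \Complex^3</math> with the standard inner product and a functional <math>\varphi(h) = \sum_i a_i h_i</math> whose coefficients <code>a</code> are hypothetical; in this setting <math>(\ker \varphi)^{\bot}</math> is spanned by the conjugated coefficient vector.

```python
import math

# Hypothetical coefficients of a continuous linear functional on C^3:
# phi(h) = sum_i a_i * h_i
a = [complex(1, 2), complex(0, -1), complex(3, 0)]


def phi(h):
    return sum(ai * hi for ai, hi in zip(a, h))


def braket(y, x):
    """<y | x>: antilinear in y, linear in x (physics convention)."""
    return sum(yi.conjugate() * xi for yi, xi in zip(y, x))


def norm(x):
    return math.sqrt(braket(x, x).real)


def riesz_vector(p):
    """Return conj(phi(p)) / ||p||^2 * p for a non-zero p orthogonal to ker(phi)."""
    scale = phi(p).conjugate() / norm(p) ** 2
    return [scale * pi for pi in p]


# Any non-zero multiple of conj(a) is orthogonal to ker(phi) in this example.
p = [complex(2, 1) * ai.conjugate() for ai in a]
f_phi = riesz_vector(p)

# phi(h) and <f_phi | h> should agree (up to rounding) for every h.
h = [complex(1, -1), complex(2, 0), complex(0, 5)]
print(phi(h), braket(f_phi, h))
# The norm of the representing vector; the theorem says it equals the norm of phi.
print(norm(f_phi))
```

The result does not depend on which non-zero <code>p</code> in the orthogonal complement is chosen, which matches the uniqueness of the representing vector.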

It can now be deduced that <math>K^{\bot}</math> is <math>1</math>-dimensional when <math>\varphi \neq 0.</math>

Let <math>q \in K^{\bot}</math> be any non-zero vector. Replacing <math>p</math> with <math>q</math> in the proof above shows that the vector <math>g := \frac{\overline{\varphi q}}{\|q\|^2} q</math> satisfies <math>\varphi(h) = \langle \,g\, | \,h\, \rangle</math> for every <math>h \in H.</math> The uniqueness of the (non-zero) vector <math>f_{\varphi}</math> representing <math>\varphi</math> implies that <math>f_{\varphi} = g,</math> which in turn implies that <math>\overline{\varphi q} \neq 0</math> and <math>q = \frac{\|q\|^2}{\overline{\varphi q}} f_{\varphi}.</math> Thus every vector in <math>K^{\bot}</math> is a scalar multiple of <math>f_{\varphi}.</math> <math>\blacksquare</math>

The formulas for the inner products follow from the polarization identity.

Observations

If <math>\varphi \in H^*</math> then

<math display=block>\varphi \left(f_{\varphi}\right) = \left\langle f_{\varphi}, f_{\varphi} \right\rangle = \left\|f_{\varphi}\right\|^2 = \|\varphi\|^2.</math>

So in particular, <math>\varphi \left(f_{\varphi}\right) \geq 0</math> is always real and furthermore, <math>\varphi \left(f_{\varphi}\right) = 0</math> if and only if <math>f_{\varphi} = 0</math> if and only if <math>\varphi = 0.</math>

Linear functionals as affine hyperplanes

A non-trivial continuous linear functional <math>\varphi</math> is often interpreted geometrically by identifying it with the affine hyperplane <math>A := \varphi^{-1}(1)</math> (the kernel <math>\ker\varphi = \varphi^{-1}(0)</math> is also often visualized alongside <math>A,</math> although knowing <math>A</math> is enough to reconstruct <math>\ker \varphi</math> because if <math>A = \varnothing</math> then <math>\ker \varphi = H</math> and otherwise <math>\ker \varphi = A - A</math>). In particular, the norm of <math>\varphi</math> should somehow be interpretable as the "norm of the hyperplane <math>A</math>". When <math>\varphi \neq 0,</math> the Riesz representation theorem provides such an interpretation of <math>\|\varphi\|</math> in terms of the affine hyperplane <math>A.</math>
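One way to make this interpretation explicit is sketched here, assuming <math>\varphi \neq 0</math> and writing <math>f_{\varphi}</math> for the Riesz representation of <math>\varphi,</math> so that <math>\varphi(x) = \langle f_{\varphi} \mid x \rangle</math> and <math>\left\|f_{\varphi}\right\| = \|\varphi\|.</math> The point <math>a_0 := f_{\varphi} / \left\|f_{\varphi}\right\|^2</math> is an auxiliary choice made for this sketch. By the Cauchy–Schwarz inequality, every <math>a \in A = \varphi^{-1}(1)</math> satisfies <math>1 = |\varphi(a)| \leq \|\varphi\| \|a\|,</math> while <math>a_0</math> belongs to <math>A</math> and has norm <math>1 / \left\|f_{\varphi}\right\|.</math> Consequently,

<math display=block>\inf_{a \in A} \|a\| = \left\|a_0\right\| = \frac{1}{\left\|f_{\varphi}\right\|} \quad \text{ and hence } \quad \|\varphi\| = \frac{1}{\inf_{a \in A} \|a\|}.</math>

In words, the norm of <math>\varphi</math> is the reciprocal of the distance from the origin to the hyperplane <math>A.</math>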

Bra of a linear functional

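Given a continuous linear functional <math>\psi \in H^*,</math> let <math>\langle \psi\mid</math> denote the vector <math>\Phi^{-1} \psi \in H,</math> where <math>\Phi : H \to H^*</math> is the bijective antilinear isometry <math>g \mapsto \langle \,g \mid \cdot\, \rangle</math> provided by the Riesz representation theorem; that is,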

<math display=block>\langle \psi\mid ~:=~ \Phi^{-1} \psi.</math>

The assignment <math>\psi \mapsto \langle \psi\mid</math> is just the isometric antilinear isomorphism <math>\Phi^{-1} ~:~ H^* \to H,</math> which is why <math>~\langle c \psi + \phi\mid ~=~ \overline{c} \langle \psi\mid ~+~ \langle \phi\mid~</math> holds for all <math>\phi, \psi \in H^*</math> and all scalars <math>c.</math>

The defining condition of the vector <math>\langle \psi | \in H</math> is the technically correct but unsightly equality

<math display=block>\left\langle \, \langle \psi\mid \, \mid g \right\rangle_H ~=~ \psi g \quad \text{ for all } g \in H,</math>

which is why the notation <math>\left\langle \psi \mid g \right\rangle</math> is used in place of <math>\left\langle \, \langle \psi\mid \, \mid g \right\rangle_H = \left\langle g, \, \langle \psi\mid \right\rangle_H.</math> With this notation, the defining condition becomes

<math display=block>\left\langle \psi\mid g \right\rangle ~=~ \psi g \quad \text{ for all } g \in H.</math>

Kets

For any given vector <math>g \in H,</math> the notation <math>| \,g \rangle</math> is used to denote <math>g</math>; that is,

<math display=block>\mid g \rangle : = g.</math>

The assignment <math>g \mapsto | \,g \rangle</math> is just the identity map <math>\operatorname{Id}_H : H \to H,</math> which is why <math>~\mid c g + h \rangle ~=~ c \mid g \rangle ~+~ \mid h \rangle~</math> holds for all <math>g, h \in H</math> and all scalars <math>c.</math>

The notation <math>\langle h\mid g \rangle</math> and <math>\langle \psi\mid g \rangle</math> is used in place of <math>\left\langle h\mid \, \mid g \rangle \, \right\rangle_H ~=~ \left\langle \mid g \rangle, h \right\rangle_H</math> and <math>\left\langle \psi\mid \, \mid g \rangle \, \right\rangle_H ~=~ \left\langle g, \, \langle \psi\mid \right\rangle_H,</math> respectively. As expected, <math>~\langle \psi\mid g \rangle = \psi g~</math> and <math>~\langle h\mid g \rangle~</math> really is just the scalar <math>~\langle h\mid g \rangle_H ~=~ \langle g, h \rangle_H.</math>

Adjoints and transposes

Let <math>A : H \to Z</math> be a continuous linear operator between Hilbert spaces <math>\left(H, \langle \cdot, \cdot \rangle_H\right)</math> and <math>\left(Z, \langle \cdot, \cdot \rangle_Z \right).</math> As before, let <math>\langle y \mid x \rangle_H := \langle x, y \rangle_H</math> and <math>\langle y \mid x \rangle_Z := \langle x, y \rangle_Z.</math>

Denote by

<math display=block>\begin{alignat}{4}
\Phi_H :\;&& H &&\;\to \;& H^* \\[0.3ex]
&& g &&\;\mapsto\;& \langle \,g \mid \cdot\, \rangle_H \\
\end{alignat}
\quad \text{ and } \quad
\begin{alignat}{4}
\Phi_Z :\;&& Z &&\;\to \;& Z^* \\[0.3ex]
&& y &&\;\mapsto\;& \langle \,y \mid \cdot\, \rangle_Z \\
\end{alignat}</math>

the usual bijective antilinear isometries that satisfy:

<math display=block>\left(\Phi_H g\right) h = \langle g\mid h \rangle_H \quad \text{ for all } g, h \in H \qquad \text{ and } \qquad \left(\Phi_Z y\right) z = \langle y \mid z \rangle_Z \quad \text{ for all } y, z \in Z.</math>

Definition of the adjoint

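For every <math>z \in Z,</math> the scalar-valued map <math>\langle z\mid A (\cdot) \rangle_Z</math> on <math>H</math> defined by <math>h \mapsto \langle z\mid A h \rangle_Z</math> is a continuous linear functional on <math>H,</math> so by the Riesz representation theorem there exists a unique vector in <math>H,</math> denoted by <math>A^* z,</math> such that <math>\langle z \mid A h \rangle_Z = \left\langle A^* z \mid h \right\rangle_H</math> for all <math>h \in H.</math> The resulting map <math>A^* : Z \to H</math> is called the adjoint of <math>A.</math>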
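In finite dimensions the adjoint can be computed explicitly. The sketch below assumes <math>H = \Complex^2</math> and <math>Z = \Complex^3</math> with their standard inner products and a hypothetical matrix <code>A</code>; under these assumptions the conjugate transpose of the matrix satisfies the identity <math>\langle z \mid A h \rangle_Z = \langle A^* z \mid h \rangle_H.</math>

```python
# Hypothetical operator A : C^2 -> C^3, stored as a 3x2 complex matrix (list of rows).
A = [
    [complex(1, 1), complex(0, 2)],
    [complex(3, 0), complex(1, -1)],
    [complex(0, 0), complex(2, 1)],
]


def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in M]


def conjugate_transpose(M):
    """Entry (j, i) of the result is the complex conjugate of entry (i, j) of M."""
    return [[M[i][j].conjugate() for i in range(len(M))] for j in range(len(M[0]))]


def braket(y, x):
    """<y | x>: antilinear in y, linear in x (physics convention)."""
    return sum(yi.conjugate() * xi for yi, xi in zip(y, x))


A_star = conjugate_transpose(A)  # maps C^3 -> C^2

h = [complex(1, -2), complex(0, 1)]
z = [complex(2, 0), complex(1, 1), complex(0, -3)]

lhs = braket(z, apply(A, h))       # <z | A h>_Z
rhs = braket(apply(A_star, z), h)  # <A^* z | h>_H
assert lhs == rhs
```

The entries are small Gaussian integers, so the arithmetic is exact and the two sides can be compared with <code>==</code>; with general floating-point data a tolerance would be needed.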

Proofs

Bibliography

P. Halmos, Measure Theory, D. van Nostrand and Co., 1950.
P. Halmos, A Hilbert Space Problem Book, Springer, New York, 1982 (problem 3 contains a version for vector spaces with coordinate systems).

Walter Rudin, Real and Complex Analysis, McGraw-Hill, 1966.