Hahn–Banach theorem

In functional analysis, the Hahn–Banach theorem is a central result that allows the extension of bounded linear functionals defined on a vector subspace of some vector space to the whole space. The theorem also shows that every normed vector space has enough continuous linear functionals for the study of its dual space to be meaningful. Another version of the Hahn–Banach theorem is known as the Hahn–Banach separation theorem or the hyperplane separation theorem, and has numerous uses in convex geometry.

History

The theorem is named for the mathematicians Hans Hahn and Stefan Banach, who proved it independently in the late 1920s.

The special case of the theorem for the space <math>C[a, b]</math> of continuous functions on an interval was proved earlier (in 1912) by Eduard Helly, and a more general extension theorem, the M. Riesz extension theorem, from which the Hahn–Banach theorem can be derived, was proved in 1923 by Marcel Riesz.

The first Hahn–Banach theorem was proved by Eduard Helly in 1912, who showed that certain linear functionals defined on a subspace of a certain type of normed space (<math>\Complex^{\N}</math>) had an extension of the same norm. Helly did this by first proving that a one-dimensional extension exists (where the linear functional has its domain extended by one dimension) and then using induction. In 1927, Hahn defined general Banach spaces and used Helly's technique to prove a norm-preserving version of the Hahn–Banach theorem for Banach spaces (where a bounded linear functional on a subspace has a bounded linear extension of the same norm to the whole space). In 1929, Banach, who was unaware of Hahn's result, generalized it by replacing the norm-preserving version with the dominated extension version that uses sublinear functions. Whereas Helly's proof used mathematical induction, Hahn and Banach both used transfinite induction.

The Hahn–Banach theorem arose from attempts to solve infinite systems of linear equations. This is needed to solve problems such as the moment problem, whereby given all the potential moments of a function one must determine if a function having these moments exists, and, if so, find it in terms of those moments. Another such problem is the Fourier cosine series problem, whereby given all the potential Fourier cosine coefficients one must determine if a function having those coefficients exists, and, again, find it if so.

Riesz and Helly solved the problem for certain classes of spaces (such as [[Lp space|<math>L^p([0, 1])</math>]] and [[Continuous functions on a compact Hausdorff space|<math>C([a, b])</math>]]) where they discovered that the existence of a solution was equivalent to the existence and continuity of certain linear functionals. In effect, they needed to solve the following problem:

:(The vector problem) Given a collection <math>\left(f_i\right)_{i \in I}</math> of bounded linear functionals on a normed space <math>X</math> and a collection of scalars <math>\left(c_i\right)_{i \in I},</math> determine if there is an <math>x \in X</math> such that <math>f_i(x) = c_i</math> for all <math>i \in I.</math>

If <math>X</math> happens to be a reflexive space then to solve the vector problem, it suffices to solve the following dual problem:

:(The functional problem) Given a collection <math>\left(x_i\right)_{i \in I}</math> of vectors in a normed space <math>X</math> and a collection of scalars <math>\left(c_i\right)_{i \in I},</math> determine if there is a bounded linear functional <math>f</math> on <math>X</math> such that <math>f\left(x_i\right) = c_i</math> for all <math>i \in I.</math>

Riesz went on to define [[Lp space|<math>L^p([0, 1])</math> space]] (<math>1 < p < \infty</math>) in 1910 and the <math>\ell^p</math> spaces in 1913. While investigating these spaces he proved a special case of the Hahn–Banach theorem. Helly also proved a special case of the Hahn–Banach theorem in 1912. In 1910, Riesz solved the functional problem for some specific spaces and in 1912, Helly solved it for a more general class of spaces. It wasn't until 1932 that Banach, in one of the first important applications of the Hahn–Banach theorem, solved the general functional problem. The following theorem states the general functional problem and characterizes its solution.

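:Theorem (The functional problem): Let <math>\left(x_i\right)_{i \in I}</math> be vectors in a real or complex normed space <math>X</math> and let <math>\left(c_i\right)_{i \in I}</math> be scalars indexed by the same set <math>I \neq \varnothing.</math> There exists a continuous linear functional <math>f</math> on <math>X</math> such that <math>f\left(x_i\right) = c_i</math> for all <math>i \in I</math> if and only if there exists a <math>K > 0</math> such that for every choice of scalars <math>\left(s_i\right)_{i \in I}</math> with all but finitely many <math>s_i</math> equal to <math>0,</math> <math display=block>\left|\sum_{i \in I} s_i c_i\right| \leq K \left\|\sum_{i \in I} s_i x_i\right\|.</math>

The Hahn–Banach theorem can be deduced from the above theorem. If <math>X</math> is reflexive then this theorem solves the vector problem.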
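As a small illustration of the functional problem (a hypothetical example, not taken from the historical sources), let <math>X = \R^2</math> with <math>x_1 = (1, 0),</math> <math>x_2 = (2, 0)</math> and scalars <math>c_1 = 1,</math> <math>c_2 = 3.</math> Any linear functional <math>f</math> with <math>f\left(x_1\right) = c_1</math> is forced to satisfy

<math display=block>f\left(x_2\right) = f\left(2 x_1\right) = 2 f\left(x_1\right) = 2 \neq 3 = c_2,</math>

so this instance has no solution. Replacing <math>c_2</math> by <math>2</math> makes it solvable, for example by <math>f(a, b) = a.</math>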

Hahn–Banach theorem

A real-valued function <math>f : M \to \R</math> defined on a subset <math>M</math> of <math>X</math> is said to be dominated (above) by a function <math>p : X \to \R</math> if <math>f(m) \leq p(m)</math> for every <math>m \in M.</math>

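For this reason, the following version of the Hahn–Banach theorem is called the dominated extension theorem.

:Hahn–Banach dominated extension theorem (for real linear functionals): If <math>p : X \to \R</math> is a sublinear function (such as a norm or seminorm) defined on a real vector space <math>X,</math> then every linear functional defined on a vector subspace of <math>X</math> that is dominated above by <math>p</math> has at least one linear extension to all of <math>X</math> that is also dominated above by <math>p.</math> Explicitly, <math>p</math> is sublinear if <math>p(x + y) \leq p(x) + p(y)</math> and <math>p(t x) = t p(x)</math> for all <math>x, y \in X</math> and all real <math>t \geq 0;</math> and if <math>f : M \to \R</math> is a linear functional on a vector subspace <math>M</math> with <math>f(m) \leq p(m)</math> for all <math>m \in M,</math> then there exists a linear functional <math>F : X \to \R</math> such that <math>F(m) = f(m)</math> for all <math>m \in M</math> and <math>F(x) \leq p(x)</math> for all <math>x \in X.</math>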

The theorem remains true if the requirements on <math>p</math> are relaxed to require only that <math>p</math> be a convex function:

<math display=block>p(t x + (1 - t) y) \leq t p(x) + (1 - t) p(y) \qquad \text{ for all } 0 < t < 1 \text{ and } x, y \in X.</math>

A function <math>p : X \to \R</math> is convex and satisfies <math>p(0) \leq 0</math> if and only if <math>p(a x + b y) \leq a p(x) + b p(y)</math> for all vectors <math>x, y \in X</math> and all non-negative real <math>a, b \geq 0</math> such that <math>a + b \leq 1.</math> Every sublinear function is a convex function.

On the other hand, if <math>p : X \to \R</math> is convex with <math>p(0) \geq 0,</math> then the function defined by <math>p_0(x) \;\stackrel{\scriptscriptstyle\text{def}}{=}\; \inf_{t > 0} \frac{p(tx)}{t}</math> is positively homogeneous (because for all <math>x</math> and <math>r>0</math> one has <math>p_0(rx)=\inf_{t > 0} \frac{p(trx)}{t} =r\inf_{t > 0} \frac{p(trx)}{tr} = r\inf_{\tau > 0} \frac{p(\tau x)}{\tau}=rp_0(x)</math>), hence, being convex, it is sublinear. It is also bounded above by <math>p</math> (that is, <math>p_0 \leq p</math>), and satisfies <math>F \leq p_0</math> for every linear functional <math>F \leq p.</math> So the extension of the Hahn–Banach theorem to convex functionals does not have a much larger content than the classical one stated for sublinear functionals.
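For a concrete instance of this construction (an illustrative choice of <math>p,</math> not taken from the text above), take <math>X = \R</math> and <math>p(x) = |x| + 1,</math> which is convex with <math>p(0) = 1 \geq 0</math> but is not sublinear. Then

<math display=block>p_0(x) = \inf_{t > 0} \frac{|t x| + 1}{t} = \inf_{t > 0} \left(|x| + \frac{1}{t}\right) = |x|,</math>

which is sublinear and satisfies <math>p_0 \leq p.</math> A linear functional <math>F(x) = c x</math> satisfies <math>F \leq p</math> exactly when <math>|c| \leq 1,</math> which is also exactly when <math>F \leq p_0.</math>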

If <math>F : X \to \R</math> is linear then <math>F \leq p</math> if and only if <math display=block>-p(-x) \leq F(x) \leq p(x) \quad \text{ for all } x \in X,</math>

which is the (equivalent) conclusion that some authors write instead of <math>F \leq p.</math>
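The equivalence can be checked in one line. Only the lower bound needs an argument, and it follows by applying <math>F \leq p</math> at the point <math>-x</math> and using linearity:

<math display=block>-F(x) = F(-x) \leq p(-x) \quad \text{ so that } \quad F(x) \geq -p(-x).</math>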

It follows that if <math>p : X \to \R</math> is also symmetric, meaning that <math>p(-x) = p(x)</math> holds for all <math>x \in X,</math> then <math>F \leq p</math> if and only if <math>|F| \leq p.</math>

Every norm is a seminorm and both are symmetric balanced sublinear functions. A sublinear function is a seminorm if and only if it is a balanced function. On a real vector space (although not on a complex vector space), a sublinear function is a seminorm if and only if it is symmetric. The identity function <math>\R \to \R</math> on <math>X := \R</math> is an example of a sublinear function that is not a seminorm.
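To see why the identity function <math>p(x) = x</math> on <math>\R</math> is sublinear but not a seminorm, note that

<math display=block>p(x + y) = x + y = p(x) + p(y) \quad \text{ and } \quad p(t x) = t x = t p(x) \text{ for all } t \geq 0,</math>

while <math>p(-1) = -1 \neq 1 = p(1),</math> so <math>p</math> is neither symmetric nor non-negative.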

For complex or real vector spaces

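The dominated extension theorem for real linear functionals implies the following alternative statement of the Hahn–Banach theorem that can be applied to linear functionals on real or complex vector spaces.

:Hahn–Banach theorem: Suppose <math>p : X \to \R</math> is a seminorm on a vector space <math>X</math> over the field <math>\mathbf{K},</math> which is either <math>\R</math> or <math>\Complex.</math> If <math>f : M \to \mathbf{K}</math> is a linear functional on a vector subspace <math>M</math> such that <math>|f(m)| \leq p(m)</math> for all <math>m \in M,</math> then there exists a linear functional <math>F : X \to \mathbf{K}</math> such that <math>F(m) = f(m)</math> for all <math>m \in M</math> and <math>|F(x)| \leq p(x)</math> for all <math>x \in X.</math>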

The theorem remains true if the requirements on <math>p</math> are relaxed to require only that for all <math>x, y \in X</math> and all scalars <math>a</math> and <math>b</math> satisfying <math>|a| + |b| \leq 1,</math>

<math display=block>p(a x + b y) \leq |a| p(x) + |b| p(y).</math>

This condition holds if and only if <math>p</math> is a convex and balanced function satisfying <math>p(0) \leq 0,</math> or equivalently, if and only if it is convex, satisfies <math>p(0) \leq 0,</math> and <math>p(u x) \leq p(x)</math> for all <math>x \in X</math> and all unit length scalars <math>u.</math>

A complex-valued functional <math>F</math> is said to be dominated by <math>p</math> if <math>|F(x)| \leq p(x)</math> for all <math>x</math> in the domain of <math>F.</math>

With this terminology, the above statements of the Hahn–Banach theorem can be restated more succinctly:

:Hahn–Banach dominated extension theorem: If <math>p : X \to \R</math> is a seminorm defined on a real or complex vector space <math>X,</math> then every dominated linear functional defined on a vector subspace of <math>X</math> has a dominated linear extension to all of <math>X.</math> In the case where <math>X</math> is a real vector space and <math>p : X \to \R</math> is merely a convex or sublinear function, this conclusion will remain true if both instances of "dominated" (meaning <math>|F| \leq p</math>) are weakened to instead mean "dominated above" (meaning <math>F \leq p</math>).

Proof

The following observations allow the Hahn–Banach theorem for real vector spaces to be applied to (complex-valued) linear functionals on complex vector spaces.

Every linear functional <math>F : X \to \Complex</math> on a complex vector space is completely determined by its real part <math>\; \operatorname{Re} F : X \to \R \;</math> through the formula

<math display=block>F(x) \;=\; \operatorname{Re} F(x) - i \operatorname{Re} F(i x) \qquad \text{ for all } x \in X</math>

and moreover, if <math>\|\cdot\|</math> is a norm on <math>X</math> then their dual norms are equal: <math>\|F\| = \|\operatorname{Re} F\|.</math>

In particular, a linear functional on <math>X</math> extends another one defined on <math>M \subseteq X</math> if and only if their real parts are equal on <math>M</math> (in other words, a linear functional <math>F</math> extends <math>f</math> if and only if <math>\operatorname{Re} F</math> extends <math>\operatorname{Re} f</math>).

The real part of a linear functional on <math>X</math> is always a real-linear functional (meaning that it is linear when <math>X</math> is considered as a real vector space) and if <math>R : X \to \R</math> is a real-linear functional on a complex vector space then <math>x \mapsto R(x) - i R(i x)</math> defines the unique linear functional on <math>X</math> whose real part is <math>R.</math>
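The recovery formula <math>F(x) = \operatorname{Re} F(x) - i \operatorname{Re} F(i x)</math> can be checked numerically. The following is a minimal sketch on <math>\Complex^2;</math> the functional <code>F</code>, its coefficients, the helper names and the sample points are all hypothetical choices made for illustration, and a finite check is of course not a proof.

```python
def F(z):
    """A hypothetical complex-linear functional on C^2 (coefficients are arbitrary)."""
    a = (2 + 3j, -1 + 4j)
    return a[0] * z[0] + a[1] * z[1]


def real_part(z):
    """R = Re F, which is only real-linear."""
    return F(z).real


def scale(c, z):
    """Multiply the vector z by the scalar c."""
    return tuple(c * zk for zk in z)


def rebuilt(z):
    """Rebuild F from its real part alone: R(z) - i R(i z)."""
    return complex(real_part(z), -real_part(scale(1j, z)))


# Small integer-valued sample vectors keep the floating-point arithmetic exact.
samples = [(1 + 0j, 0j), (0j, 1j), (2 - 1j, 3 + 5j)]
mismatches = [z for z in samples if rebuilt(z) != F(z)]
print(mismatches)
```

The check works because <math>F(i x) = i F(x),</math> so the real part of <math>F(i x)</math> is the negative of the imaginary part of <math>F(x).</math>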

If <math>F</math> is a linear functional on a (complex or real) vector space <math>X</math> and if <math>p : X \to \R</math> is a seminorm then

<math display=block>|F| \,\leq\, p \quad \text{ if and only if } \quad \operatorname{Re} F \,\leq\, p.</math>

Stated in simpler language, a linear functional is dominated by a seminorm <math>p</math> if and only if its real part is dominated above by <math>p.</math>
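A short justification of this equivalence runs as follows. One direction is immediate because <math>\operatorname{Re} F \leq |F|.</math> For the converse, assume <math>\operatorname{Re} F \leq p,</math> fix <math>x \in X,</math> and choose a scalar <math>u</math> of unit length such that <math>u F(x) = |F(x)|</math> (in the real case, <math>u = \pm 1</math>). Since <math>F(u x)</math> is then real and <math>p</math> is a seminorm,

<math display=block>|F(x)| = u F(x) = F(u x) = \operatorname{Re} F(u x) \leq p(u x) = |u| p(x) = p(x).</math>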

The proof above shows that when <math>p</math> is a seminorm then there is a one-to-one correspondence between dominated linear extensions of <math>f : M \to \Complex</math> and dominated real-linear extensions of <math>\operatorname{Re} f : M \to \R;</math> the proof even gives a formula for explicitly constructing a linear extension of <math>f</math> from any given real-linear extension of its real part.

Continuity

A linear functional <math>F</math> on a topological vector space is continuous if and only if this is true of its real part <math>\operatorname{Re} F;</math> if the domain is a normed space then <math>\|F\| = \|\operatorname{Re} F\|</math> (where one side is infinite if and only if the other side is infinite).

Assume <math>X</math> is a topological vector space and <math>p : X \to \R</math> is a sublinear function.

If <math>p</math> is a continuous sublinear function that dominates a linear functional <math>F</math> then <math>F</math> is necessarily continuous. Moreover, a linear functional <math>F</math> is continuous if and only if its absolute value <math>|F|</math> (which is a seminorm that dominates <math>F</math>) is continuous. In particular, a linear functional is continuous if and only if it is dominated by some continuous sublinear function.
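The first claim can be sketched for a real-valued linear functional <math>F</math> that is dominated above by a continuous sublinear function <math>p.</math> Linearity gives the two-sided bound

<math display=block>-p(-x) \leq F(x) \leq p(x) \qquad \text{ for all } x \in X.</math>

Because <math>p(0) = 0</math> and <math>p</math> is continuous, both bounds tend to <math>0</math> as <math>x \to 0,</math> so <math>F</math> is continuous at the origin, and a linear functional that is continuous at the origin is continuous everywhere.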

Proof

The Hahn–Banach theorem for real vector spaces ultimately follows from Helly's initial result for the special case where the linear functional is extended from <math>M</math> to a larger vector space in which <math>M</math> has codimension <math>1.</math>
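The codimension-one step can be illustrated numerically. The sketch below is not Helly's argument but a minimal example under assumed data: <math>X = \R^2,</math> <math>M = \{(t, 0)\},</math> <math>f(t, 0) = t,</math> <math>p</math> the <math>\ell^1</math> norm, and <math>x_0 = (0, 1)</math> as the added direction; all of these choices and the helper names are hypothetical. Every linear extension has the form <math>F(m + s x_0) = f(m) + s c,</math> and <math>F \leq p</math> holds exactly when <math>c</math> lies between <math>\sup_{m \in M} \left(-p(-m - x_0) - f(m)\right)</math> and <math>\inf_{m \in M} \left(p(m + x_0) - f(m)\right).</math> The code approximates both bounds on a finite integer grid, so it is a sanity check rather than a proof.

```python
def p(v):
    """Sublinear function used for domination: the l1 norm on R^2."""
    return abs(v[0]) + abs(v[1])


def f(t):
    """Functional on M = {(t, 0)}, identified with t; note f(t) <= p((t, 0))."""
    return t


x0 = (0, 1)            # direction added to M (assumed for this example)
grid = range(-50, 51)  # finite integer sample of M; keeps arithmetic exact

# Admissible interval for c = F(x0), approximated over the grid.
lower = max(-p((-t - x0[0], -x0[1])) - f(t) for t in grid)
upper = min(p((t + x0[0], x0[1])) - f(t) for t in grid)


def extend(c):
    """Extension of f to R^2 determined by the value c assigned to x0."""
    return lambda v: f(v[0]) + c * v[1]


def dominated(F):
    """Check F <= p on a finite grid of points of R^2."""
    return all(F((a, b)) <= p((a, b)) for a in grid for b in grid)


print(lower, upper)           # -1 1
print(dominated(extend(0)))   # True: c = 0 lies in [lower, upper]
print(dominated(extend(2)))   # False: c = 2 lies outside the interval
```

In this example the interval is <math>[-1, 1],</math> so the extension is not unique: each admissible <math>c</math> gives a different dominated extension of <math>f.</math>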