Polarization identity

Figure: Vectors involved in the polarization identity <math>2\|x\|^2 + 2\|y\|^2 = \|x+y\|^2 + \|x-y\|^2.</math>

In linear algebra, the polarization identity is any one of a family of formulas that express the inner product of two vectors in terms of the norm of a normed vector space.

If a norm arises from an inner product then the polarization identity can be used to express this inner product entirely in terms of the norm. The polarization identity shows that a norm can arise from at most one inner product; however, there exist norms that do not arise from any inner product.

The norm associated with any inner product space satisfies the parallelogram law: <math>\|x+y\|^2 + \|x-y\|^2 = 2\|x\|^2 + 2\|y\|^2.</math>

In fact, as observed by John von Neumann, the parallelogram law characterizes those norms that arise from inner products.

Given a normed space <math>(H, \|\cdot\|)</math>, the parallelogram law holds for <math>\|\cdot\|</math> if and only if there exists an inner product <math>\langle \cdot, \cdot \rangle</math> on <math>H</math> such that <math>\|x\|^2 = \langle x,\ x\rangle</math> for all <math>x \in H,</math> in which case this inner product is uniquely determined by the norm via the polarization identity.
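As a quick numerical illustration of this criterion, the following sketch evaluates the difference between the two sides of the parallelogram law for the Euclidean norm and for the taxicab norm on the plane. The helper names (`norm2`, `norm1`, `parallelogram_defect`) and the test vectors are chosen here for illustration only.

```python
import math

def norm2(v):
    """Euclidean norm, induced by the standard dot product."""
    return math.sqrt(sum(t * t for t in v))

def norm1(v):
    """Taxicab (l1) norm, which is not induced by any inner product."""
    return sum(abs(t) for t in v)

def parallelogram_defect(norm, x, y):
    """Return ||x+y||^2 + ||x-y||^2 - 2||x||^2 - 2||y||^2 for the given norm."""
    s = [a + b for a, b in zip(x, y)]
    d = [a - b for a, b in zip(x, y)]
    return norm(s) ** 2 + norm(d) ** 2 - 2 * norm(x) ** 2 - 2 * norm(y) ** 2

x, y = [1.0, 0.0], [0.0, 1.0]
defect_euclid = parallelogram_defect(norm2, x, y)   # zero up to rounding
defect_taxicab = parallelogram_defect(norm1, x, y)  # nonzero: the law fails
```

A single pair of vectors with a nonzero defect is enough to show that a norm does not arise from an inner product; a zero defect for one pair proves nothing by itself, since the law must hold for all pairs.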

Polarization identities

Any inner product on a vector space induces a norm by the equation

<math display=block>\|x\| = \sqrt{\langle x, x \rangle}.</math>

The polarization identities reverse this relationship, recovering the inner product from the norm.

Every inner product satisfies:

<math display=block>\|x + y\|^2 = \|x\|^2 + \|y\|^2 + 2\operatorname{Re}\langle x, y \rangle \qquad \text{ for all vectors } x, y.</math>

Solving for <math>\operatorname{Re}\langle x, y \rangle</math> gives the formula <math>\operatorname{Re}\langle x, y \rangle = \frac{1}{2} \left(\|x+y\|^2 - \|x\|^2 - \|y\|^2\right).</math> If the inner product is real then <math>\operatorname{Re}\langle x, y \rangle = \langle x, y \rangle</math> and this formula becomes a polarization identity for real inner products.

Real vector spaces

If the vector space is over the real numbers then the polarization identities are:

<math display="block">\begin{alignat}{4}
\langle x, y \rangle
&= \frac{1}{4} \left(\|x+y\|^2 - \|x-y\|^2\right) \\[3pt]
&= \frac{1}{2} \left(\|x+y\|^2 - \|x\|^2 - \|y\|^2\right) \\[3pt]
&= \frac{1}{2} \left(\|x\|^2 + \|y\|^2 - \|x-y\|^2\right).
\end{alignat}</math>
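The three expressions can be checked against the standard dot product on <math>\R^3</math>. This is a minimal sketch; the function names and the sample vectors are illustrative assumptions, not part of any library.

```python
def dot(x, y):
    """Standard dot product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def norm_sq(x):
    """Squared Euclidean norm ||x||^2."""
    return dot(x, x)

def add(x, y):
    return [a + b for a, b in zip(x, y)]

def sub(x, y):
    return [a - b for a, b in zip(x, y)]

def polarize_real(x, y):
    """Return the three real polarization expressions, using norms only."""
    form1 = (norm_sq(add(x, y)) - norm_sq(sub(x, y))) / 4
    form2 = (norm_sq(add(x, y)) - norm_sq(x) - norm_sq(y)) / 2
    form3 = (norm_sq(x) + norm_sq(y) - norm_sq(sub(x, y))) / 2
    return form1, form2, form3

x = [1.0, 2.0, -3.0]
y = [4.0, 0.5, 2.0]
forms = polarize_real(x, y)  # each entry should agree with dot(x, y)
```

Only `norm_sq` is used inside `polarize_real`, which is the point of the identity: the inner product is recovered from norm evaluations alone.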

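These various forms are all equivalent by the parallelogram law:

<math display=block>2\|x\|^2 + 2\|y\|^2 = \|x+y\|^2 + \|x-y\|^2.</math>

For a complex inner product that is linear in its first argument and antilinear in its second, the corresponding polarization identity is:

<math display=block>\langle x, y \rangle = \frac{1}{4} \sum_{k=0}^3 i^k \left\|x + i^k y\right\|^2.</math>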
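The four-term sum can be tested numerically. The sketch below assumes the standard inner product on <math>\Complex^2</math>, taken to be linear in the first argument and antilinear in the second, which is the convention under which the sum holds as written; the function names and sample vectors are illustrative.

```python
def inner(x, y):
    """Standard inner product on C^n: linear in x, antilinear in y (assumed convention)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm_sq(x):
    """Squared norm ||x||^2 induced by the inner product (a real number)."""
    return inner(x, x).real

def polarize_complex(x, y):
    """Evaluate (1/4) * sum_{k=0}^{3} i^k ||x + i^k y||^2 using norms only."""
    total = 0j
    for k in range(4):
        ik = 1j ** k
        shifted = [a + ik * b for a, b in zip(x, y)]
        total += ik * norm_sq(shifted)
    return total / 4

x = [1 + 2j, 3 - 1j]
y = [2 - 1j, 1j]
recovered = polarize_complex(x, y)
expected = inner(x, y)
```

With the opposite convention (antilinear in the first argument) the same sum would return the complex conjugate of the inner product instead.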

Summary of both cases

Thus if <math>R(x, y) + i I(x, y)</math> denotes the real and imaginary parts of some inner product's value at the point <math>(x, y) \in H \times H</math> of its domain, then its imaginary part will be:

<math display=block>I(x, y) ~=~
\begin{cases}
~R({\color{red}i} x, y) & \qquad \text{ if antilinear in the } {\color{red}1} \text{st argument} \\
~R(x, {\color{blue}i} y) & \qquad \text{ if antilinear in the } {\color{blue}2} \text{nd argument} \\
\end{cases}</math>

where the scalar <math>i</math> is always located in the same argument that the inner product is antilinear in.

Using <math>R(ix, y) = -R(x, iy)</math>, the above formula for the imaginary part becomes:

<math display=block>I(x, y) ~=~
\begin{cases}
-R(x, {\color{black}i} y) & \qquad \text{ if antilinear in the } {\color{black}1} \text{st argument} \\
-R({\color{black}i} x, y) & \qquad \text{ if antilinear in the } {\color{black}2} \text{nd argument} \\
\end{cases}</math>

Reconstructing the inner product

In a normed space <math>(H, \|\cdot\|),</math> if the parallelogram law

<math display=block>\|x+y\|^2 + \|x-y\|^2 = 2\|x\|^2 + 2\|y\|^2</math>

holds, then there exists a unique inner product <math>\langle \cdot,\ \cdot\rangle</math> on <math>H</math> such that <math>\|x\|^2 = \langle x,\ x\rangle</math> for all <math>x \in H.</math>

Another necessary and sufficient condition for there to exist an inner product that induces a given norm <math>\|\cdot\|</math> is for the norm to satisfy Ptolemy's inequality, which is:

<math display=block>\|x - y\| \, \|z\| ~+~ \|y - z\| \, \|x\| ~\geq~ \|x - z\| \, \|y\| \qquad \text{ for all vectors } x, y, z.</math>

Applications and consequences

If <math>H</math> is a complex Hilbert space then <math>\langle x \mid y \rangle</math> is real if and only if its imaginary part is <math>0</math>, which by the formulas above happens if and only if <math>\|x+iy\| = \|x-iy\|.</math>

Similarly, <math>\langle x \mid y \rangle</math> is (purely) imaginary if and only if <math>\|x+y\| = \|x-y\|.</math>

For example, from <math>\|x+ix\| = |1+i| \|x\| = \sqrt{2} \|x\| = |1-i| \|x\| = \|x-ix\|</math> it can be concluded that <math>\langle x | x \rangle</math> is real and that <math>\langle x | ix \rangle</math> is purely imaginary.

Isometries

If <math>A : H \to Z</math> is a linear isometry between two Hilbert spaces (so <math>\|A h\| = \|h\|</math> for all <math>h \in H</math>) then

<math display=block>\langle A h, A k \rangle_Z = \langle h, k \rangle_H \quad \text{ for all } h, k \in H;</math>

that is, linear isometries preserve inner products.

If <math>A : H \to Z</math> is instead an antilinear isometry then

<math display=block>\langle A h, A k \rangle_Z = \overline{\langle h, k \rangle_H} = \langle k, h \rangle_H \quad \text{ for all } h, k \in H.</math>
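Both statements can be illustrated on <math>\Complex^2</math>. In the sketch below the two maps are hypothetical examples chosen for this illustration: a coordinate swap combined with multiplication by <math>i</math> (a linear isometry) and coordinatewise complex conjugation (an antilinear isometry). The inner product is assumed to be linear in its first argument.

```python
def inner(x, y):
    """Standard inner product on C^n: linear in x, antilinear in y (assumed convention)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def linear_isometry(h):
    """Example linear isometry of C^2: swap coordinates, multiply one by i."""
    return [1j * h[1], h[0]]

def antilinear_isometry(h):
    """Example antilinear isometry of C^2: coordinatewise complex conjugation."""
    return [t.conjugate() for t in h]

h = [1 + 2j, 3 - 1j]
k = [2 - 1j, 4j]

# Linear case: the inner product is preserved.
lhs_linear = inner(linear_isometry(h), linear_isometry(k))

# Antilinear case: the inner product is conjugated, i.e. its arguments are swapped.
lhs_antilinear = inner(antilinear_isometry(h), antilinear_isometry(k))
```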

Relation to the law of cosines

The third form of the polarization identity for real vector spaces can be written as

<math display=block>\|\textbf{u}-\textbf{v}\|^2 = \|\textbf{u}\|^2 + \|\textbf{v}\|^2 - 2(\textbf{u} \cdot \textbf{v}).</math>

This is essentially a vector form of the law of cosines for the triangle formed by the vectors <math>\textbf{u}</math>, <math>\textbf{v}</math>, and <math>\textbf{u}-\textbf{v}</math>.

In particular,

<math display=block>\textbf{u}\cdot\textbf{v} = \|\textbf{u}\|\,\|\textbf{v}\| \cos\theta,</math>

where <math>\theta</math> is the angle between the vectors <math>\textbf{u}</math> and <math>\textbf{v}</math>.

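When used to evaluate <math>\|\textbf{u}-\textbf{v}\|^2</math>, the first equation above is numerically unstable if <math>\textbf{u}</math> and <math>\textbf{v}</math> are nearly equal, because the subtraction on its right-hand side suffers catastrophic cancellation; it should be avoided for numeric computation.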
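The cancellation can be observed in IEEE double precision. The sketch below, with vectors chosen purely for illustration, compares a direct evaluation of <math>\|\textbf{u}-\textbf{v}\|^2</math> with the expanded expression <math>\|\textbf{u}\|^2 + \|\textbf{v}\|^2 - 2(\textbf{u} \cdot \textbf{v})</math>; the exact answer is 1.

```python
def dot(u, v):
    """Standard dot product on R^n."""
    return sum(a * b for a, b in zip(u, v))

# Two nearly equal vectors with large entries; the exact squared distance is 1.
u = [1.0e8, 0.0]
v = [1.0e8 + 1.0, 0.0]

# Direct evaluation: subtract the vectors first, then square.
diff = [a - b for a, b in zip(u, v)]
direct = dot(diff, diff)

# Expanded evaluation: three large numbers of nearly equal size are combined.
# The exact value of ||v||^2 is not representable as a double, so a rounding
# error comparable in size to the true answer enters before the subtraction.
expanded = dot(u, u) + dot(v, v) - 2.0 * dot(u, v)

error_direct = abs(direct - 1.0)
error_expanded = abs(expanded - 1.0)
```

Subtracting the vectors before squaring keeps the intermediate quantities small, which is why the direct form is usually preferred when the inputs may be close together.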

Derivation

The basic relation between the norm and the dot product is given by the equation

<math display=block>\|\textbf{v}\|^2 = \textbf{v} \cdot \textbf{v}.</math>

Then

<math display=block>\begin{align}
\|\textbf{u} + \textbf{v}\|^2
&= (\textbf{u} + \textbf{v}) \cdot (\textbf{u} + \textbf{v}) \\[3pt]
&= (\textbf{u} \cdot \textbf{u}) + (\textbf{u} \cdot \textbf{v}) + (\textbf{v} \cdot \textbf{u}) + (\textbf{v} \cdot \textbf{v}) \\[3pt]
&= \|\textbf{u}\|^2 + \|\textbf{v}\|^2 + 2(\textbf{u} \cdot \textbf{v}),
\end{align}</math>

and similarly

<math display=block>\|\textbf{u} - \textbf{v}\|^2 = \|\textbf{u}\|^2 + \|\textbf{v}\|^2 - 2(\textbf{u} \cdot \textbf{v}).</math>

The second and third forms of the polarization identity for real vector spaces now follow by solving these equations for <math>\textbf{u} \cdot \textbf{v},</math> while the first form follows from subtracting these two equations.

(Adding these two equations together gives the parallelogram law.)

Generalizations

Jordan–von Neumann theorems

The standard Jordan–von Neumann theorem, as stated previously, is that if a norm satisfies the parallelogram law, then it is induced by an inner product, which is given by the polarization identity. There are several variants of the theorem.

Define various senses of orthogonality:

isosceles: <math display="inline">\|x+y \| =\|x-y \|</math>
Roberts’: <math display="inline">\left\|x+ty\right\|=\left\|x-ty\right\|</math> for all scalars <math display="inline">t</math>.
Pythagorean: <math display="inline">\left\|x+y\right\|^2=\|x\|^2+\left\|y\right\|^2</math>
Birkhoff–James: <math display="inline">\|x\| \leq \|x + ty \|</math> for all scalars <math display="inline">t</math>.

Let <math display="inline">V</math> be a vector space over the real or complex numbers. Let <math display="inline">\|\cdot\|</math> be a norm over <math display="inline">V</math>. We consider conditions for which the norm is induced by an inner product. In the following statements, whenever a scalar appears, the scalar may be restricted to be merely real, even when <math display="inline">V</math> is over the complex numbers.

(von Neumann–Jordan condition) The norm satisfies the parallelogram identity.
(weakened von Neumann–Jordan condition) <math display="inline">\|x + y\|^2 + \|x - y\|^2 = 4</math> for all unit vectors <math display="inline">x,y</math>. That is, the norm satisfies the parallelogram identity for unit vectors.
For any <math display="inline">x, y \in V</math>, the set of points equidistant to <math display="inline">x, y</math> is flat, that is, an affine subspace.
Orthogonality in either isosceles or Roberts’ sense is either additive or homogeneous on one variable.
For every two-dimensional subspace <math display="inline">W \subset V</math>, for every <math display="inline">x \in W</math>, there exists <math display="inline">y \in W</math> that is Roberts’ orthogonal to <math display="inline">x</math>.
Isosceles orthogonality implies Pythagorean orthogonality.
Pythagorean orthogonality implies isosceles orthogonality.
If <math display="inline">x, y</math> are Pythagorean orthogonal, then so are <math display="inline">x, -y</math>.
Birkhoff–James orthogonality is symmetric.
If <math display="inline">\|x\|=\|y\|</math> and <math display="inline">t, s</math> are real, then <math display="inline">\|t x+s y\|=\|s x+t y\|</math>.
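To make the orthogonality conditions concrete, the following sketch tests the isosceles and Pythagorean senses for one pair of vectors under the Euclidean norm and under the taxicab norm. The function names and the chosen vectors are illustrative assumptions. Under the taxicab norm the pair is isosceles orthogonal but not Pythagorean orthogonal, so the condition "isosceles orthogonality implies Pythagorean orthogonality" fails there, consistent with that norm not being induced by an inner product.

```python
import math

def norm2(v):
    """Euclidean norm."""
    return math.sqrt(sum(t * t for t in v))

def norm1(v):
    """Taxicab (l1) norm."""
    return sum(abs(t) for t in v)

def is_isosceles_orthogonal(norm, x, y):
    """Check ||x + y|| == ||x - y|| up to rounding."""
    s = [a + b for a, b in zip(x, y)]
    d = [a - b for a, b in zip(x, y)]
    return math.isclose(norm(s), norm(d))

def is_pythagorean_orthogonal(norm, x, y):
    """Check ||x + y||^2 == ||x||^2 + ||y||^2 up to rounding."""
    s = [a + b for a, b in zip(x, y)]
    return math.isclose(norm(s) ** 2, norm(x) ** 2 + norm(y) ** 2)

x, y = [1.0, 0.0], [0.0, 1.0]

euclid = (is_isosceles_orthogonal(norm2, x, y),
          is_pythagorean_orthogonal(norm2, x, y))
taxicab = (is_isosceles_orthogonal(norm1, x, y),
           is_pythagorean_orthogonal(norm1, x, y))
```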

For real vector spaces, there is also the condition:

Any two-dimensional slice of the unit sphere is an ellipse, that is, parameterizable as <math display="inline">\{x \cos\theta + y \sin\theta : \theta \in [0, 2\pi]\}</math>, for some unit vectors <math display="inline">x, y</math>.

The Banach–Mazur rotation problem: Given a separable Banach space <math display="inline">V</math> such that for any two unit vectors <math display="inline">x, y,</math> there exists a linear surjective isometry <math display="inline">T</math> such that <math display="inline">T(x) = y</math> or <math display="inline">T(y) = x</math>, is <math display="inline">V</math> isometrically isomorphic to a Hilbert space?

The general case of the problem is open. When the space is finite-dimensional, the answer is yes. In other words, given a finite-dimensional normed vector space over the real or complex numbers, if any point on the unit sphere can be mapped (rotated) to any other point by a linear isometry, then the norm is induced by an inner product.

Symmetric bilinear forms

The polarization identities are not restricted to inner products.

If <math>B</math> is any symmetric bilinear form on a vector space, and <math>Q</math> is the quadratic form defined by

<math display=block>Q(v) = B(v, v),</math>

then

<math display=block>\begin{align}
2 B(u, v) &= Q(u + v) - Q(u) - Q(v), \\
2 B(u, v) &= Q(u) + Q(v) - Q(u - v), \\
4 B(u, v) &= Q(u + v) - Q(u - v).
\end{align}</math>
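These identities can be checked with exact rational arithmetic. The sketch below assumes a symmetric bilinear form on <math>\Q^2</math> given by a symmetric matrix `M`, deliberately chosen to be indefinite so that it is not an inner product; the matrix and vectors are illustrative.

```python
from fractions import Fraction

# Symmetric but indefinite matrix: B is a symmetric bilinear form, not an inner product.
M = [[Fraction(1), Fraction(2)],
     [Fraction(2), Fraction(-1)]]

def B(u, v):
    """Symmetric bilinear form B(u, v) = u^T M v."""
    return sum(u[i] * M[i][j] * v[j] for i in range(2) for j in range(2))

def Q(v):
    """Associated quadratic form Q(v) = B(v, v)."""
    return B(v, v)

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

u = [Fraction(1), Fraction(2)]
v = [Fraction(3), Fraction(-1)]

# Each line solves one of the three identities for B(u, v); dividing by 2 or 4
# is possible here because 2 is invertible in the rationals.
from_sum = (Q(add(u, v)) - Q(u) - Q(v)) / 2
from_diff = (Q(u) + Q(v) - Q(sub(u, v))) / 2
from_both = (Q(add(u, v)) - Q(sub(u, v))) / 4
```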

The so-called symmetrization map generalizes the latter formula, replacing <math>Q</math> by a homogeneous polynomial of degree <math>k</math> defined by <math>Q(v) = B(v, \ldots, v),</math> where <math>B</math> is a symmetric <math>k</math>-linear map.

The formulas above even apply in the case where the field of scalars has characteristic two, though the left-hand sides are all zero in this case.

Consequently, in characteristic two there is no formula for a symmetric bilinear form in terms of a quadratic form, and they are in fact distinct notions, a fact which has important consequences in L-theory; for brevity, in this context "symmetric bilinear forms" are often referred to as "symmetric forms".

These formulas also apply to bilinear forms on modules over a commutative ring, though again one can only solve for <math>B(u, v)</math> if 2 is invertible in the ring, and otherwise these are distinct notions. For example, over the integers, one distinguishes integral quadratic forms from integral symmetric forms, which are a narrower notion.

More generally, in the presence of a ring involution or where 2 is not invertible, one distinguishes <math>\varepsilon</math>-quadratic forms and <math>\varepsilon</math>-symmetric forms; a symmetric form defines a quadratic form, and the polarization identity (without a factor of 2) from a quadratic form to a symmetric form is called the "symmetrization map", and is not in general an isomorphism. This has historically been a subtle distinction: over the integers it was not until the 1950s that the relation between "twos out" (integral quadratic form) and "twos in" (integral symmetric form) was understood (see the discussion at integral quadratic form); and in the algebraization of surgery theory, Mishchenko originally used symmetric L-groups, rather than the correct quadratic L-groups (as in Wall and Ranicki); see the discussion at L-theory.

Homogeneous polynomials of higher degree

Finally, in any of these contexts these identities may be extended to homogeneous polynomials (that is, algebraic forms) of arbitrary degree, where the extension is known as the polarization formula; it is reviewed in greater detail in the article on the polarization of an algebraic form.
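As a small illustration for degree <math>k</math>, the sketch below uses the standard inclusion–exclusion form of the polarization formula, <math>B(v_1, \ldots, v_k) = \tfrac{1}{k!} \sum_{\varnothing \neq S \subseteq \{1, \ldots, k\}} (-1)^{k - |S|} \, Q\bigl(\textstyle\sum_{i \in S} v_i\bigr)</math>, which requires <math>k!</math> to be invertible. It is applied to the cubic form <math>Q(x) = x^3</math> on the rationals, whose symmetric trilinear map is <math>B(u, v, w) = uvw</math>. The function name `polarize` and the sample values are assumptions made for this example.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial

def polarize(Q, vectors):
    """Recover B(v_1, ..., v_k) from Q(v) = B(v, ..., v) by inclusion-exclusion."""
    k = len(vectors)
    total = Fraction(0)
    for size in range(1, k + 1):
        sign = (-1) ** (k - size)
        for subset in combinations(vectors, size):
            total += sign * Q(sum(subset))
    return total / factorial(k)

def cube(x):
    """Cubic form Q(x) = x^3 on the rationals."""
    return x ** 3

u, v, w = Fraction(2), Fraction(3), Fraction(5)
recovered = polarize(cube, [u, v, w])  # should equal the product u * v * w
```

For <math>k = 2</math> the same function reduces to the identity <math>2 B(u, v) = Q(u + v) - Q(u) - Q(v).</math>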
