
In mathematics, an ordered basis of a vector space of finite dimension allows any element of the vector space to be represented uniquely by a coordinate vector, which is a finite sequence of scalars called coordinates. If two different bases are considered, the coordinate vector that represents a vector on one basis is, in general, different from the coordinate vector that represents it on the other basis. A change of basis consists of converting every assertion expressed in terms of coordinates relative to one basis into an assertion expressed in terms of coordinates relative to the other basis.

Such a conversion results from the change-of-basis formula, which expresses the coordinates relative to one basis in terms of the coordinates relative to the other basis. Using matrices, this formula can be written

:<math>\mathbf{x}_\mathrm{old} = A ~ \mathbf{x}_\mathrm{new},</math>

where <math>\mathbf{x}_\mathrm{old}</math> and <math>\mathbf{x}_\mathrm{new}</math> are the column vectors of the coordinates of the same vector on the "old" (initially defined) and "new" (other) bases. <math>A</math> is the change-of-basis matrix (also called transition matrix), which is the matrix whose columns are the coordinates of the "new" basis vectors on the "old" basis.

[[File:Change-of-basis_coordinate_mapping_(matrix_representation).png|thumb|upright=1.2|The matrix <math>\mathrm{ A = \mathbf{[} \overrightarrow{a} \! _1 \mathbf{|} \overrightarrow{a} \! _2 \mathbf{]} }</math> maps coordinate vectors in the basis <math>\mathrm{ ( \overrightarrow{a} \! _1, \overrightarrow{a} \! _2 ) }</math> to coordinate vectors in the standard basis: <math>\mathrm{ \overrightarrow{x} = A \overrightarrow{x} \! _A }.</math>]]

[Figure: Geometric illustration of change of basis: the same vector expressed in two non-canonical bases and the standard basis, showing the relationship between coordinate representations.]

A change of basis is sometimes called a change of coordinates, although it excludes many coordinate transformations. For applications in physics, and especially in mechanics, a change of basis often involves the transformation of an orthonormal basis, understood as a rotation in physical space, thus excluding translations.

This article deals mainly with finite-dimensional vector spaces. However, many of the presented principles are also valid for infinite-dimensional vector spaces.

Change-of-basis formula

Let <math>B_\mathrm{old} = (v_1, \ldots, v_n)</math> be a basis of a finite-dimensional vector space <math>V</math> over a field <math>F</math>.

For <math>j = 1, \ldots, n,</math> one can define a vector <math>w_j</math> by its coordinates <math>a_{i,j}</math> over <math>B_\mathrm{old} \colon</math>

:<math>w_j = \sum_{i=1}^n a_{i,j} v_i.</math>

Let

:<math>A = \left( a_{i,j} \right)_{i,j}</math>

be the matrix whose <math>j</math>-th column is formed by the coordinates of <math>w_j</math>. (Here and in what follows, the index <math>i</math> always refers to the rows of <math>A</math> and the <math>v_i,</math> while the index <math>j</math> always refers to the columns of <math>A</math> and the <math>w_j;</math> such a convention is useful for avoiding errors in explicit computations.)

Setting <math>B_\mathrm{new} = (w_1, \ldots, w_n),</math> one has that <math>B_\mathrm{new}</math> is a basis of <math>V</math> if and only if the matrix <math>A</math> is invertible, or equivalently, if it has a nonzero determinant. In this case, <math>A</math> is said to be the change-of-basis matrix from the basis <math>B_\mathrm{old}</math> to the basis <math>B_\mathrm{new}.</math>

Given a vector <math>u \in V,</math> let <math>(x_1, \ldots, x_n)</math> be the coordinates of <math>u</math> over <math>B_\mathrm{old},</math> and <math>(y_1, \ldots, y_n)</math> its coordinates over <math>B_\mathrm{new};</math> that is

:<math>u = \sum_{i=1}^n x_i v_i = \sum_{j=1}^n y_j w_j.</math>

(One could take the same summation index for the two sums, but systematically choosing the index <math>i</math> for the old basis and <math>j</math> for the new one makes the formulas that follow clearer, and helps avoid errors in proofs and explicit computations.)

The change-of-basis formula expresses the coordinates over the old basis in terms of the coordinates over the new basis. With the above notation, it is

:<math>x_i = \sum_{j=1}^n a_{i,j} y_j \qquad \text{for } i=1, \ldots, n.</math>

In terms of matrices, the change-of-basis formula is

:<math>\mathbf{x} = A ~ \mathbf{y},</math>

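where <math>\mathbf{x}</math> and <math>\mathbf{y}</math> are the column vectors of the coordinates of <math>u</math> over <math>B_\mathrm{old}</math> and <math>B_\mathrm{new},</math> respectively. (The terminology may seem reversed, since the change-of-basis matrix from <math>B_\mathrm{old}</math> to <math>B_\mathrm{new}</math> converts new coordinates into old ones; it is nevertheless widely adopted.)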

Proof: Using the above definition of the change-of-basis matrix, one has

:<math>\begin{align}
u &= \sum_{j=1}^n y_j w_j\\
&= \sum_{j=1}^n \left( y_j \sum_{i=1}^n a_{i,j} v_i \right)\\
&= \sum_{i=1}^n \left( \sum_{j=1}^n a_{i,j} y_j \right) v_i.
\end{align}</math>

As <math>u = \textstyle \sum_{i=1}^n x_i v_i,</math> the change-of-basis formula results from the uniqueness of the decomposition of a vector over a basis.
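
The formula can be checked numerically. The following is a minimal sketch, not a reference implementation: the old basis, the matrix <math>A</math>, and the helper names <code>mat_vec</code> and <code>combine</code> are hypothetical choices made for illustration, with the rational numbers as field of scalars.

```python
from fractions import Fraction

def mat_vec(A, y):
    """Return the product A y, with A given as a list of rows."""
    return [sum(a * yj for a, yj in zip(row, y)) for row in A]

def combine(coeffs, vectors):
    """Return the linear combination sum_k coeffs[k] * vectors[k]."""
    n = len(vectors[0])
    return [sum(c * vec[i] for c, vec in zip(coeffs, vectors)) for i in range(n)]

# Hypothetical old basis (v_1, v_2) of Q^2, written in standard coordinates.
v = [[Fraction(1), Fraction(1)], [Fraction(0), Fraction(2)]]

# Hypothetical change-of-basis matrix: column j holds the coordinates
# of w_j over the old basis. Its determinant is -5, so it is invertible.
A = [[Fraction(1), Fraction(3)],
     [Fraction(2), Fraction(1)]]

# New basis vectors w_j = sum_i a_{i,j} v_i.
w = [combine([A[i][j] for i in range(2)], v) for j in range(2)]

y = [Fraction(5), Fraction(-1)]  # coordinates of u over the new basis
x = mat_vec(A, y)                # change-of-basis formula: x = A y

# Both coordinate vectors must describe the same vector u.
u_from_new = combine(y, w)
u_from_old = combine(x, v)
```

In this sketch both decompositions reconstruct the same vector, as the uniqueness argument of the proof requires.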

Example

Consider the Euclidean vector space <math>\R^2</math> and its standard basis, consisting of the vectors <math>v_1</math> and <math>v_2</math> with column vectors

:<math>\mathbf{v_1} = \begin{bmatrix} 1\\ 0 \end{bmatrix} \quad</math> and <math>\quad \mathbf{v_2} = \begin{bmatrix} 0\\ 1 \end{bmatrix}.</math>

Rotating them by an angle of <math>t</math> gives a "new" basis, formed by the vectors <math>w_1</math> and <math>w_2</math> with column vectors

:<math>\mathbf{w_1} = \begin{bmatrix} \cos t\\ \sin t \end{bmatrix} \quad</math> and <math>\quad \mathbf{w_2} = \begin{bmatrix} -\sin t\\ \cos t \end{bmatrix}.</math>

So, the change-of-basis matrix is

:<math>\begin{bmatrix}
\cos t & -\sin t\\
\sin t & \cos t
\end{bmatrix}.</math>

The change-of-basis formula asserts that, for a vector with "old" coordinates <math>(x_1, x_2)</math> and "new" coordinates <math>(y_1, y_2),</math> one has

:<math>\begin{bmatrix} x_1\\ x_2 \end{bmatrix} =
\begin{bmatrix}
\cos t & -\sin t\\
\sin t & \cos t
\end{bmatrix} ~
\begin{bmatrix} y_1\\ y_2 \end{bmatrix}.</math>

That is,

:<math>\begin{cases}
x_1 = y_1 \cos t - y_2 \sin t\\
x_2 = y_1 \sin t + y_2 \cos t.
\end{cases}</math>

This may be shown by writing

:<math>\begin{align}
y_1 w_1 + y_2 w_2 &= y_1 \Big( (\cos t) ~ v_1 + (\sin t) ~ v_2 \Big) + y_2 \Big( \! -(\sin t) ~ v_1 + (\cos t) ~ v_2 \Big)\\
&= \big( y_1 \cos t - y_2 \sin t \big) v_1 + \big( y_1 \sin t + y_2 \cos t \big) v_2\\
&= x_1 v_1 + x_2 v_2.
\end{align}</math>
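
The rotation example can also be checked in floating-point arithmetic. This is a small sketch only; the function names <code>rotation_matrix</code> and <code>old_from_new</code> and the sample angles are assumptions made for illustration.

```python
import math

def rotation_matrix(t):
    """Change-of-basis matrix whose columns are w_1 and w_2 for the angle t."""
    return [[math.cos(t), -math.sin(t)],
            [math.sin(t),  math.cos(t)]]

def old_from_new(t, y1, y2):
    """Apply x = A y: convert new coordinates (y1, y2) into old ones."""
    A = rotation_matrix(t)
    x1 = A[0][0] * y1 + A[0][1] * y2
    x2 = A[1][0] * y1 + A[1][1] * y2
    return x1, x2

# For t = pi/2 the new basis is w_1 = v_2 and w_2 = -v_1, so the vector
# with new coordinates (1, 0) has old coordinates close to (0, 1).
quarter_turn = old_from_new(math.pi / 2, 1.0, 0.0)

# A rotation preserves the Euclidean norm of the coordinate vector.
sample = old_from_new(0.3, 2.0, -1.0)
```

Because of rounding, the results should be compared with a tolerance (for instance with <code>math.isclose</code>) rather than with exact equality.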

In terms of linear maps

Usually, a matrix represents a linear map, and the product of a matrix and a column vector represents the function application of the corresponding linear map to the vector whose coordinates form the column vector. The change-of-basis formula is a specific case of this general principle, although this is not immediately clear from its definition and proof.

When one says that a matrix represents a linear map, one refers implicitly to bases of the vector spaces involved, and to the fact that the choice of a basis induces a linear isomorphism between a vector space and <math>F^n</math>, where <math>F</math> is the ground field of scalars. When only one basis is considered for each vector space, it is convenient to leave this isomorphism implicit, and to work up to an isomorphism. As several bases of the same vector space are considered here, a more precise wording is required.

Let <math>F</math> be a field. The set <math>F^n</math> of the <math>n</math>-tuples of elements of <math>F</math> is an <math>F</math>-vector space whose addition and scalar multiplication are defined component-wise. Its standard basis is the basis that has as its <math>i</math>-th element the tuple with all components equal to <math>0</math> except the <math>i</math>-th one, which is equal to <math>1</math>.

A basis <math>B = (v_1, \ldots, v_n)</math> of an <math>F</math>-vector space <math>V</math> defines a linear isomorphism <math>\varphi \colon F^n \to V</math> by

:<math>\varphi(x_1, \ldots, x_n) = \sum_{i=1}^n x_i v_i.</math>

Conversely, such a linear isomorphism defines a basis, which is the image by <math>\varphi</math> of the standard basis of <math>F^n.</math>

Let <math>B_\mathrm{old} = (v_1, \ldots, v_n)</math> be the "old" basis of a change of basis, and <math>\varphi_\mathrm{old}</math> the associated isomorphism. Given a change-of-basis (invertible) matrix <math>A</math>, one can consider it as the matrix of an automorphism (a bijective endomorphism) <math>\psi_A</math> of <math>F^n.</math> Finally, define

:<math>\varphi_\mathrm{new} = \varphi_\mathrm{old} \circ \psi_A</math>

(where <math>\circ</math> denotes function composition), and

:<math>B_\mathrm{new} = \varphi_\mathrm{new} \left( \varphi_\mathrm{old}^{-1}(B_\mathrm{old}) \right).</math>

A straightforward verification shows that this definition of <math>B_\mathrm{new}</math> is the same as that in the preceding section.

Now, by composing the equation <math>\varphi_\mathrm{new} = \varphi_\mathrm{old} \circ \psi_A</math> with <math>\varphi_\mathrm{old}^{-1}</math> on the left and <math>\varphi_\mathrm{new}^{-1}</math> on the right, one gets

:<math>\varphi_\mathrm{old}^{-1} = \psi_A \circ \varphi_\mathrm{new}^{-1}.</math>

It follows that, for <math>v \in V,</math> one has

:<math>\varphi_\mathrm{old}^{-1}(v) = \psi_A \left( \varphi_\mathrm{new}^{-1}(v) \right),</math>

which is the change-of-basis formula expressed in terms of linear maps instead of coordinates.
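
The identity above can be illustrated with explicit maps. The sketch below assumes that <math>V</math> is the space of pairs of rational numbers, so that each isomorphism is given by a matrix; the matrices <code>V_old</code> and <code>A</code> and all function names are hypothetical.

```python
from fractions import Fraction

def mat_vec(M, x):
    """Return the product M x, with M given as a list of rows."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def mat_mul(M, N):
    """Return the matrix product M N."""
    cols = list(zip(*N))
    return [[sum(m * n for m, n in zip(row, col)) for col in cols] for row in M]

def inverse_2x2(M):
    """Return the exact inverse of a 2x2 matrix with rational entries."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

V_old = [[1, 0], [1, 2]]   # columns: old basis vectors v_1, v_2
A = [[1, 3], [2, 1]]       # change-of-basis matrix

def phi_old(x):            # coordinates over the old basis -> vector
    return mat_vec(V_old, x)

def psi_A(y):              # automorphism of F^n with matrix A
    return mat_vec(A, y)

def phi_new(y):            # phi_new = phi_old o psi_A
    return phi_old(psi_A(y))

V_new = mat_mul(V_old, A)  # columns: new basis vectors w_1, w_2

def phi_old_inv(v):
    return mat_vec(inverse_2x2(V_old), v)

def phi_new_inv(v):
    return mat_vec(inverse_2x2(V_new), v)

v = [2, 20]
old_coords = phi_old_inv(v)   # coordinates of v over the old basis
new_coords = phi_new_inv(v)   # coordinates of v over the new basis
```

Here <code>psi_A(new_coords)</code> equals <code>old_coords</code>, which is the formula <math>\varphi_\mathrm{old}^{-1}(v) = \psi_A \left( \varphi_\mathrm{new}^{-1}(v) \right)</math> for this particular vector.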

Function defined on a vector space

A function that has a vector space as its domain is commonly specified as a multivariate function whose variables are the coordinates on some basis of the vector to which the function is applied.

When the basis is changed, the expression of the function changes. This change can be computed by replacing the "old" coordinates with their expressions in terms of the "new" coordinates. More precisely, if <math>f(\mathbf{x})</math> is the expression of the function in terms of the "old" coordinates, and if <math>\mathbf{x} = A\mathbf{y}</math> is the change-of-basis formula, then <math>f(A\mathbf{y})</math> is the expression of the same function in terms of the "new" coordinates.

The fact that the change-of-basis formula expresses the "old" coordinates in terms of the "new" ones may seem unnatural, but it turns out to be useful, because no matrix inversion is needed here.
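
As an illustration of this substitution, the following sketch takes a hypothetical polynomial function <code>f_old</code> and a hypothetical change-of-basis matrix <code>A</code>, and compares the substituted expression with its expansion worked out by hand.

```python
# Hypothetical change-of-basis matrix (columns: new basis over the old one).
A = [[1, 3],
     [2, 1]]

def f_old(x1, x2):
    """Expression of the function in terms of the old coordinates."""
    return x1 * x2

def f_new(y1, y2):
    """Expression in terms of the new coordinates, obtained as f(A y)."""
    x1 = A[0][0] * y1 + A[0][1] * y2
    x2 = A[1][0] * y1 + A[1][1] * y2
    return f_old(x1, x2)

def f_new_expanded(y1, y2):
    """Hand expansion of (y1 + 3*y2) * (2*y1 + y2)."""
    return 2 * y1**2 + 7 * y1 * y2 + 3 * y2**2

# New coordinates (5, -1) correspond to old coordinates (2, 9).
value = f_new(5, -1)
```

Only the matrix <code>A</code> itself is used; its inverse is never computed. The expanded form is again a polynomial, in line with the preservation of such properties discussed in this section.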

As the change-of-basis formula involves only linear functions, many function properties are kept by a change of basis. This allows defining these properties as properties of functions of a variable vector that are not related to any specific basis. So, a function whose domain is a vector space or a subset of it is

  • a linear function,
  • a polynomial function,
  • a continuous function,
  • a differentiable function,
  • a smooth function,
  • an analytic function,

if the multivariate function that represents it on some basis (and thus on every basis) has the same property.

This is especially useful in the theory of manifolds, as this allows extending the concepts of continuous, differentiable, smooth, and analytic functions to functions that are defined on a manifold.

Linear maps

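Consider a linear map <math>T \colon V \to W</math> from a vector space <math>V</math> of dimension <math>n</math> to a vector space <math>W</math> of dimension <math>m</math>. It is represented on "old" bases of <math>V</math> and <math>W</math> by an <math>m \times n</math> matrix <math>M</math>. A change of basis is defined by an <math>n \times n</math> change-of-basis matrix <math>P</math> for <math>V</math>, and an <math>m \times m</math> change-of-basis matrix <math>Q</math> for <math>W</math>.

On the "new" bases, the matrix representation of <math>T</math> is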

:<math>M' = Q^{-1}MP.</math>

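This is a straightforward consequence of the change-of-basis formula: if <math>\mathbf{x} = P\mathbf{x}'</math> and <math>\mathbf{y} = Q\mathbf{y}'</math> relate the old and new coordinates of a vector and of its image, then <math>\mathbf{y} = M\mathbf{x}</math> becomes <math>Q\mathbf{y}' = MP\mathbf{x}',</math> that is, <math>\mathbf{y}' = Q^{-1}MP\mathbf{x}'.</math>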
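
A small exact computation can make the formula concrete. In the sketch below, which assumes that <math>P</math> acts on the coordinates of the domain and <math>Q</math> on those of the codomain, the matrices <code>M</code>, <code>P</code>, <code>Q</code> and the helper names are hypothetical.

```python
from fractions import Fraction

def mat_vec(M, x):
    """Return the product M x, with M given as a list of rows."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def mat_mul(M, N):
    """Return the matrix product M N."""
    cols = list(zip(*N))
    return [[sum(m * n for m, n in zip(row, col)) for col in cols] for row in M]

def inverse_2x2(M):
    """Return the exact inverse of a 2x2 matrix with rational entries."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

M = [[1, 2, 0],
     [0, 1, 3]]          # 2x3 matrix of the map on the old bases
P = [[1, 1, 0],
     [0, 1, 1],
     [0, 0, 1]]          # 3x3 change-of-basis matrix for the domain
Q = [[2, 1],
     [1, 1]]             # 2x2 change-of-basis matrix for the codomain

M_new = mat_mul(inverse_2x2(Q), mat_mul(M, P))   # M' = Q^{-1} M P

# Cross-check on one vector, going through the old coordinates.
x_new = [1, 2, 3]
x_old = mat_vec(P, x_new)                  # x = P x'
y_old = mat_vec(M, x_old)                  # y = M x
y_new = mat_vec(inverse_2x2(Q), y_old)     # y' = Q^{-1} y
```

Computing <code>mat_vec(M_new, x_new)</code> gives the same result as <code>y_new</code>, obtained by passing through the old coordinates.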

Endomorphisms

Endomorphisms are linear maps from a vector space <math>V</math> to itself. For a change of basis, the formula of the preceding section applies, with the same change-of-basis matrix on both sides of the formula. That is, if <math>M</math> is the square matrix of an endomorphism of <math>V</math> on an "old" basis, and <math>P</math> is a change-of-basis matrix, then the matrix of the endomorphism on the "new" basis is

:<math>M' = P^{-1}MP.</math>

As every invertible matrix can be used as a change-of-basis matrix, this implies that two matrices are similar if and only if they represent the same endomorphism on two different bases.
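
As an illustration, the sketch below computes <math>P^{-1}MP</math> exactly for a hypothetical pair of matrices. The matrix <code>P</code> was chosen so that the new basis consists of eigenvectors of <code>M</code>, which is an assumption of this example and not a requirement of the formula.

```python
from fractions import Fraction

def mat_vec(M, x):
    """Return the product M x, with M given as a list of rows."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def mat_mul(M, N):
    """Return the matrix product M N."""
    cols = list(zip(*N))
    return [[sum(m * n for m, n in zip(row, col)) for col in cols] for row in M]

def inverse_2x2(M):
    """Return the exact inverse of a 2x2 matrix with rational entries."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

M = [[1, 2],
     [0, 3]]   # matrix of the endomorphism on the old basis
P = [[1, 1],
     [0, 1]]   # new basis: w_1 = v_1, w_2 = v_1 + v_2

M_new = mat_mul(inverse_2x2(P), mat_mul(M, P))   # M' = P^{-1} M P

# Both matrices describe the same endomorphism: applying M' in new
# coordinates and converting the result to old coordinates agrees with
# converting first and then applying M.
y = [4, -2]
via_new = mat_vec(P, mat_vec(M_new, y))
via_old = mat_vec(M, mat_vec(P, y))
```

The matrices <code>M</code> and <code>M_new</code> are similar, and here the second one is diagonal.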

Bilinear forms

A bilinear form on a vector space <math>V</math> over a field <math>F</math> is a function <math>\varPhi \colon V \! \times V \to F</math> which is linear in both arguments. That is, <math>\varPhi</math> is bilinear if the maps <math>v \mapsto \varPhi(v, w)</math> and <math>v \mapsto \varPhi(w, v)</math> are linear for every fixed <math>w \in V.</math>

The matrix <math>\mathbf{\Phi}</math> of a bilinear form <math>\varPhi</math> on a basis <math>(v_1, \ldots, v_n)</math> (the "old" basis) is the matrix whose entry of the <math>i</math>-th row and <math>j</math>-th column is <math>\varPhi(v_i, v_j)</math>. It follows that for two vectors <math>v</math> and <math>w</math> with coordinate column vectors <math>\mathbf{v}</math> and <math>\mathbf{w}</math>, one has

:<math>\varPhi(v, w) = \mathbf{v}^{\mathsf T} \mathbf{\Phi} \mathbf{w},</math>

where <math>\mathbf{v}^{\mathsf T}</math> denotes the transpose of the column vector <math>\mathbf{v}</math>.

For a change of basis with matrix <math>P</math>, a straightforward computation shows that the matrix of the bilinear form on the "new" basis is

:<math>\mathbf{\Phi'} = P^{\mathsf T} \mathbf{\Phi} P.</math>
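
The computation can be replayed on a small example. The sketch below uses a hypothetical matrix <code>Phi</code> and a hypothetical change-of-basis matrix <code>P</code> with integer entries; the helper names are illustrative.

```python
def mat_vec(M, x):
    """Return the product M x, with M given as a list of rows."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def mat_mul(M, N):
    """Return the matrix product M N."""
    cols = list(zip(*N))
    return [[sum(m * n for m, n in zip(row, col)) for col in cols] for row in M]

def transpose(M):
    """Return the transpose of M."""
    return [list(col) for col in zip(*M)]

def bilinear(Phi, v, w):
    """Evaluate v^T Phi w for coordinate vectors v and w."""
    return sum(vi * pw for vi, pw in zip(v, mat_vec(Phi, w)))

Phi = [[1, 2],
       [0, 1]]   # matrix of the form on the old basis
P = [[1, 3],
     [2, 1]]     # change-of-basis matrix

Phi_new = mat_mul(transpose(P), mat_mul(Phi, P))   # Phi' = P^T Phi P

# The value of the form does not depend on the basis used to compute it.
v_new, w_new = [1, -1], [2, 1]
v_old, w_old = mat_vec(P, v_new), mat_vec(P, w_new)
value_old = bilinear(Phi, v_old, w_old)
value_new = bilinear(Phi_new, v_new, w_new)
```

The entries of <code>Phi_new</code> are the values of the form on pairs of new basis vectors; for instance its top-left entry is the value of the form on the pair formed by the first new basis vector with itself.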

A symmetric bilinear form is a bilinear form <math>S</math> such that <math>S(v,w) = S(w,v)</math> for every <math>v</math> and <math>w</math> in <math>V</math>. It follows that the matrix <math>\mathbf{S}</math> of <math>S</math> on any basis is symmetric. This implies that the property of being a symmetric matrix must be preserved by the above change-of-basis formula. One can also check this by noting that the transpose of a matrix product is the product of the transposes computed in the reverse order. Thus,

:<math>(P^{\mathsf T} \mathbf{S} P)^{\mathsf T} = P^{\mathsf T} \mathbf{S}^{\mathsf T} P;</math>

finally, since the matrix <math>\mathbf{S}</math> is symmetric,

:<math>(P^{\mathsf T} \mathbf{S} P)^{\mathsf T} = P^{\mathsf T} \mathbf{S} P.</math>

If the characteristic of the ground field is not two, then for every symmetric bilinear form, there is a basis for which the matrix is diagonal. Moreover, the resulting nonzero entries on the diagonal are defined up to multiplication by a square. So, if the ground field is the field <math>\R</math> of the real numbers, these nonzero entries can be chosen to be either <math>1</math> or <math>-1</math>. Sylvester's law of inertia is a theorem that asserts that the numbers of <math>1</math> and of <math>-1</math> depend only on the bilinear form, and not on the change of basis.

Symmetric bilinear forms over the reals are often encountered in geometry and physics, typically in the study of quadrics and of the inertia of a rigid body. In these cases, orthonormal bases are especially useful; this means that one generally prefers to restrict changes of basis to those that have an orthogonal change-of-basis matrix, that is, a matrix such that <math>P^{\mathsf T} = P^{-1}.</math> Such matrices have the fundamental property that the change-of-basis formula is the same for a symmetric bilinear form and the endomorphism that is represented by the same symmetric matrix. The spectral theorem asserts that, given such a symmetric matrix, there is an orthogonal change of basis such that the resulting matrix (of both the bilinear form and the endomorphism) is a diagonal matrix with the eigenvalues of the initial matrix on the diagonal. It follows that, over the reals, if the matrix of an endomorphism is symmetric, then it is diagonalizable.
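
A two-dimensional instance of this statement can be checked in floating-point arithmetic. In the sketch below, the symmetric matrix <code>S</code> and the rotation by 45 degrees used as <code>P</code> are hypothetical choices made for illustration.

```python
import math

def mat_mul(M, N):
    """Return the matrix product M N."""
    cols = list(zip(*N))
    return [[sum(m * n for m, n in zip(row, col)) for col in cols] for row in M]

def transpose(M):
    """Return the transpose of M."""
    return [list(col) for col in zip(*M)]

S = [[2.0, 1.0],
     [1.0, 2.0]]            # symmetric matrix with eigenvalues 3 and 1

c = 1.0 / math.sqrt(2.0)
P = [[c, -c],
     [c,  c]]               # rotation by 45 degrees, an orthogonal matrix

# P^T P is the identity up to rounding, so P^T plays the role of P^{-1}.
should_be_identity = mat_mul(transpose(P), P)

# The same product P^T S P serves for the bilinear form and the endomorphism.
D = mat_mul(transpose(P), mat_mul(S, P))
```

Up to rounding errors, <code>D</code> is the diagonal matrix with diagonal entries 3 and 1.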

See also

  • Active and passive transformation
  • Covariance and contravariance of vectors
  • Integral transform, the continuous analogue of change of basis
  • Chirgwin-Coulson weights, an application in computational chemistry
