Matrix multiplication

Figure: For matrix multiplication, the number of columns in the first matrix must be equal to the number of rows in the second matrix. The result matrix has the number of rows of the first and the number of columns of the second matrix.

In mathematics, specifically in linear algebra, matrix multiplication is a binary operation that produces a matrix from two matrices. For matrix multiplication, the number of columns in the first matrix must be equal to the number of rows in the second matrix. The resulting matrix, known as the matrix product, has the number of rows of the first and the number of columns of the second matrix. The product of matrices <math>\mathbf{A}</math> and <math>\mathbf{B}</math> is denoted as <math>\mathbf{AB}</math>.

Matrix multiplication was first described by the French mathematician Jacques Philippe Marie Binet in 1812, to represent the composition of linear maps that are represented by matrices. Matrix multiplication is thus a basic tool of linear algebra, and as such has numerous applications in many areas of mathematics, as well as in applied mathematics, statistics, physics, economics, and engineering.

Computing matrix products is a central operation in all computational applications of linear algebra.

Notation

This article will use the following notational conventions: matrices are represented by capital letters in bold, e.g. <math>\mathbf{A}</math>; vectors in lowercase bold, e.g. <math>\mathbf{a}</math>; and entries of vectors and matrices are italic (they are numbers from a field), e.g. <math>A</math> and <math>a</math>. Index notation is often the clearest way to express definitions, and is used as standard in the literature. The entry in row <math>i</math>, column <math>j</math> of matrix <math>\mathbf{A}</math> is indicated by <math>(\mathbf{A})_{ij}</math>, <math>A_{ij}</math> or <math>a_{ij}</math>. In contrast, a single subscript, e.g. <math>\mathbf{A}_1</math>, <math>\mathbf{A}_2</math>, is used to select a matrix (not a matrix entry) from a collection of matrices.

Definitions

Matrix times matrix

If <math>\mathbf{A}</math> is an <math>m \times n</math> matrix and <math>\mathbf{B}</math> is an <math>n \times p</math> matrix,

<math display="block">\mathbf{A}=\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn} \\
\end{pmatrix},\quad\mathbf{B}=\begin{pmatrix}
b_{11} & b_{12} & \cdots & b_{1p} \\
b_{21} & b_{22} & \cdots & b_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
b_{n1} & b_{n2} & \cdots & b_{np} \\
\end{pmatrix}</math>

the matrix product <math>\mathbf{C} = \mathbf{AB}</math> (denoted without multiplication signs or dots) is defined to be the <math>m \times p</math> matrix

<math display="block">\mathbf{C} = \begin{pmatrix}
c_{11} & c_{12} & \cdots & c_{1p} \\
c_{21} & c_{22} & \cdots & c_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
c_{m1} & c_{m2} & \cdots & c_{mp} \\
\end{pmatrix}</math>

such that

<math display="block"> c_{ij} = a_{i1} b_{1j} + a_{i2} b_{2j} + \cdots + a_{in} b_{nj} = \sum_{k=1}^n a_{ik} b_{kj}, </math>

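for <math>i = 1, \ldots, m</math> and <math>j = 1, \ldots, p</math>.

That is, the entry <math>c_{ij}</math> of the product is obtained by multiplying term-by-term the entries of the <math>i</math>th row of <math>\mathbf{A}</math> and the <math>j</math>th column of <math>\mathbf{B}</math>, and summing these <math>n</math> products. In other words, <math>c_{ij}</math> is the dot product of the <math>i</math>th row of <math>\mathbf{A}</math> and the <math>j</math>th column of <math>\mathbf{B}</math>.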
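The entry-wise definition translates directly into a triple loop. The following is a minimal Python sketch, assuming matrices are stored as lists of rows; the function name matmul is illustrative and not part of any library.

```python
def matmul(a, b):
    """Return the product of a (m x n) and b (n x p), both given as lists of rows."""
    m, n = len(a), len(a[0])
    if len(b) != n:
        raise ValueError("number of columns of a must equal number of rows of b")
    p = len(b[0])
    # c[i][j] is the dot product of row i of a and column j of b.
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]


A = [[1, 2, 3],
     [4, 5, 6]]      # 2 x 3
B = [[7, 8],
     [9, 10],
     [11, 12]]       # 3 x 2
C = matmul(A, B)     # 2 x 2
print(C)             # [[58, 64], [139, 154]]
```

The result has the number of rows of the first factor and the number of columns of the second, and the size check rejects factors whose inner dimensions differ.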

Therefore, <math>\mathbf{AB}</math> can also be written as

<math display="block">\mathbf{C} = \begin{pmatrix}
a_{11}b_{11} +\cdots + a_{1n}b_{n1} & a_{11}b_{12} +\cdots + a_{1n}b_{n2} & \cdots & a_{11}b_{1p} +\cdots + a_{1n}b_{np} \\
a_{21}b_{11} +\cdots + a_{2n}b_{n1} & a_{21}b_{12} +\cdots + a_{2n}b_{n2} & \cdots & a_{21}b_{1p} +\cdots + a_{2n}b_{np} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1}b_{11} +\cdots + a_{mn}b_{n1} & a_{m1}b_{12} +\cdots + a_{mn}b_{n2} & \cdots & a_{m1}b_{1p} +\cdots + a_{mn}b_{np} \\
\end{pmatrix}</math>

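Thus the product <math>\mathbf{AB}</math> is defined if and only if the number of columns in <math>\mathbf{A}</math> equals the number of rows in <math>\mathbf{B}</math>, in this case <math>n</math>.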

System of linear equations

The general form of a system of linear equations is

:<math>\begin{matrix}a_{11}x_1+\cdots + a_{1n}x_n=b_1,
\\ a_{21}x_1+\cdots + a_{2n}x_n =b_2,
\\ \vdots
\\ a_{m1}x_1+\cdots + a_{mn}x_n =b_m. \end{matrix}</math>

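With <math>\mathbf{A}</math> denoting the <math>m \times n</math> matrix of the coefficients <math>a_{ij}</math>, <math>\mathbf{x}</math> the column vector of the unknowns <math>x_1, \ldots, x_n</math>, and <math>\mathbf{b}</math> the column vector of the right-hand sides <math>b_1, \ldots, b_m</math>, such a system is equivalent to the single matrix equation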

:<math>\mathbf{Ax}=\mathbf b.</math>
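As a small illustration of this equivalence, the sketch below checks a candidate solution of a two-equation system by forming the matrix-vector product. The system, the helper name mat_vec, and the list-of-rows representation are all assumptions chosen for the example.

```python
def mat_vec(a, x):
    """Multiply the matrix a (list of rows) by the column vector x (list of numbers)."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


# The system   x1 + 2*x2 = 5
#            3*x1 + 4*x2 = 11
A = [[1, 2],
     [3, 4]]
b = [5, 11]

x = [1, 2]                    # candidate solution
print(mat_vec(A, x) == b)     # True: x satisfies every equation at once
```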

Dot product, bilinear form and sesquilinear form

The dot product of two column vectors is the unique entry of the matrix product

:<math>\mathbf x^\mathsf T \mathbf y,</math>

where <math>\mathbf x^\mathsf T</math> is the row vector obtained by transposing <math>\mathbf x</math>. (As usual, a 1×1 matrix is identified with its unique entry.)

More generally, any bilinear form over a vector space of finite dimension may be expressed as a matrix product

:<math>\mathbf x^\mathsf T \mathbf {Ay},</math>

and any sesquilinear form may be expressed as

:<math>\mathbf x^\dagger \mathbf {Ay},</math>

where <math>\mathbf x^\dagger</math> denotes the conjugate transpose of <math>\mathbf x</math> (conjugate of the transpose, or equivalently transpose of the conjugate).
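The three expressions above can be evaluated with one double sum each. The following Python sketch assumes vectors are lists of numbers and the matrix is a list of rows; the function names are illustrative.

```python
def bilinear(x, a, y):
    """Evaluate x^T A y as a single number."""
    return sum(x[i] * a[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))


def sesquilinear(x, a, y):
    """Evaluate x^dagger A y: like bilinear, but the entries of x are conjugated."""
    return sum(x[i].conjugate() * a[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))


identity = [[1, 0],
            [0, 1]]

# With A = I the bilinear form reduces to the dot product.
print(bilinear([1, 2], identity, [3, 4]))            # 11
print(bilinear([1, 2], [[2, 1], [0, 3]], [3, 4]))    # 34
# With complex entries, conjugation makes x^dagger x real and nonnegative.
print(sesquilinear([1j, 1], identity, [1j, 1]))      # (2+0j)
```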

General properties

Matrix multiplication shares some properties with usual multiplication. However, matrix multiplication is not defined if the number of columns of the first factor differs from the number of rows of the second factor, and it is non-commutative, even when the product remains defined after changing the order of the factors.

Non-commutativity

An operation is commutative if, given two elements <math>\mathbf{A}</math> and <math>\mathbf{B}</math> such that the product <math>\mathbf{A}\mathbf{B}</math> is defined, then <math>\mathbf{B}\mathbf{A}</math> is also defined, and <math>\mathbf{A}\mathbf{B}=\mathbf{B}\mathbf{A}.</math>

If <math>\mathbf{A}</math> and <math>\mathbf{B}</math> are matrices of respective sizes <math>m \times n</math> and <math>p \times q</math>, then <math>\mathbf{A}\mathbf{B}</math> is defined if <math>n = p</math>, and <math>\mathbf{B}\mathbf{A}</math> is defined if <math>m = q</math>. Therefore, if one of the products is defined, the other one need not be defined. If <math>m = q \neq n = p</math>, the two products are defined, but have different sizes; thus they cannot be equal. Only if <math>m = q = n = p</math>, that is, if <math>\mathbf{A}</math> and <math>\mathbf{B}</math> are square matrices of the same size, are both products defined and of the same size. Even in this case, one has in general

:<math>\mathbf{A}\mathbf{B} \neq \mathbf{B}\mathbf{A}.</math>

For example

:<math>\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}=\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix},</math>

but

:<math>\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}.</math>

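This example may be expanded to show that, if <math>\mathbf{A}</math> is an <math>n \times n</math> matrix with entries in a field <math>F</math>, then <math>\mathbf{A}\mathbf{B} = \mathbf{B}\mathbf{A}</math> for every <math>n \times n</math> matrix <math>\mathbf{B}</math> with entries in <math>F</math>, if and only if <math>\mathbf{A}=c\,\mathbf{I}</math> where <math>c \in F</math>, and <math>\mathbf{I}</math> is the <math>n \times n</math> identity matrix. If, instead of a field, the entries are supposed to belong to a ring, then one must add the condition that <math>c</math> belongs to the center of the ring.

One special case where commutativity does occur is when <math>\mathbf{D}</math> and <math>\mathbf{E}</math> are two (square) diagonal matrices (of the same size); then <math>\mathbf{D}\mathbf{E} = \mathbf{E}\mathbf{D}</math>.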
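The 2×2 example and the diagonal special case can be checked numerically. This is a minimal sketch with an illustrative matmul helper for matrices stored as lists of rows; the diagonal matrices D and E are arbitrary values chosen for the example.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows (sizes assumed compatible)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


A = [[0, 1],
     [0, 0]]
B = [[0, 0],
     [1, 0]]
print(matmul(A, B))   # [[1, 0], [0, 0]]
print(matmul(B, A))   # [[0, 0], [0, 1]]

# Diagonal matrices of the same size commute.
D = [[2, 0],
     [0, 3]]
E = [[5, 0],
     [0, 7]]
print(matmul(D, E) == matmul(E, D))   # True
```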

Application to similarity

Any invertible matrix <math>\mathbf{P}</math> defines a similarity transformation (on square matrices of the same size as <math>\mathbf{P}</math>)

:<math>S_\mathbf{P}(\mathbf{A}) = \mathbf{P}^{-1} \mathbf{A} \mathbf{P}.</math>

Similarity transformations map products to products, that is

:<math>S_\mathbf{P}(\mathbf{AB}) = S_\mathbf{P}(\mathbf{A})S_\mathbf{P}(\mathbf{B}).</math>

In fact, one has

:<math>\mathbf{P}^{-1} (\mathbf{AB}) \mathbf{P}
= \mathbf{P}^{-1} \mathbf{A}(\mathbf{P}\mathbf{P}^{-1})\mathbf{B} \mathbf{P}
=(\mathbf{P}^{-1} \mathbf{A}\mathbf{P})(\mathbf{P}^{-1}\mathbf{B} \mathbf{P}).</math>
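The identity can be verified on a concrete case. The sketch below assumes an integer matrix P of determinant 1, so that its inverse is also an integer matrix and the comparison is exact; the matrices and the helper names matmul and similarity are chosen only for illustration.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows (sizes assumed compatible)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


P = [[1, 1],
     [0, 1]]
P_INV = [[1, -1],
         [0, 1]]      # inverse of P, written out by hand


def similarity(a):
    """Return P^{-1} a P for the fixed matrix P above."""
    return matmul(matmul(P_INV, a), P)


A = [[1, 2],
     [3, 4]]
B = [[0, 1],
     [1, 0]]

lhs = similarity(matmul(A, B))
rhs = matmul(similarity(A), similarity(B))
print(lhs == rhs)     # True
```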

Square matrices

Let us denote by <math>\mathcal M_n(R)</math> the set of <math>n \times n</math> square matrices with entries in a ring <math>R</math>, which, in practice, is often a field.

In <math>\mathcal M_n(R)</math>, the product is defined for every pair of matrices. This makes <math>\mathcal M_n(R)</math> a ring, which has the identity matrix <math>\mathbf{I}</math> as an identity element (the matrix whose diagonal entries are equal to 1 and all other entries are 0). This ring is also an associative <math>R</math>-algebra.

If <math>n > 1</math>, many matrices do not have a multiplicative inverse. For example, a matrix such that all entries of a row (or a column) are 0 does not have an inverse. If it exists, the inverse of a matrix <math>\mathbf{A}</math> is denoted <math>\mathbf{A}^{-1}</math>, and thus satisfies

:<math> \mathbf{A}\mathbf{A}^{-1} = \mathbf{A}^{-1}\mathbf{A} = \mathbf{I}. </math>

A matrix that has an inverse is an invertible matrix. Otherwise, it is a singular matrix.

A product of matrices is invertible if and only if each factor is invertible. In this case, one has

:<math>(\mathbf{A}\mathbf{B})^{-1} = \mathbf{B}^{-1}\mathbf{A}^{-1}.</math>

When <math>R</math> is commutative, and, in particular, when it is a field, the determinant of a product is the product of the determinants. As determinants are scalars, and scalars commute, one has thus

:<math> \det(\mathbf{AB}) = \det(\mathbf{BA}) =\det(\mathbf{A})\det(\mathbf{B}). </math>

The other matrix invariants do not behave as well with products. Nevertheless, if <math>R</math> is commutative, <math>\mathbf{AB}</math> and <math>\mathbf{BA}</math> have the same trace, the same characteristic polynomial, and the same eigenvalues with the same multiplicities. However, the eigenvectors are generally different if <math>\mathbf{AB} \neq \mathbf{BA}</math>.
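For 2×2 integer matrices, the statements about the determinant and the trace can be checked directly. This is an illustrative sketch; the helpers det2 and trace and the sample matrices are assumptions made for the example.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows (sizes assumed compatible)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def det2(a):
    """Determinant of a 2 x 2 matrix."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def trace(a):
    """Sum of the diagonal entries of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))


A = [[1, 2],
     [3, 4]]
B = [[0, 1],
     [1, 1]]
AB, BA = matmul(A, B), matmul(B, A)

print(AB, BA)                                # [[2, 3], [4, 7]] [[3, 4], [4, 6]]
print(det2(AB), det2(BA), det2(A) * det2(B))  # 2 2 2
print(trace(AB), trace(BA))                  # 9 9
```

Here the two products differ, yet their determinants and traces agree.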

Powers of a matrix

One may raise a square matrix to any nonnegative integer power by multiplying it by itself repeatedly, in the same way as for ordinary numbers. That is,

:<math>\mathbf{A}^0 = \mathbf{I},</math>

:<math>\mathbf{A}^1 = \mathbf{A},</math>

:<math>\mathbf{A}^k = \underbrace{\mathbf{A}\mathbf{A}\cdots\mathbf{A}}_{k\text{ times}}.</math>

Computing the <math>k</math>th power of a matrix needs <math>k - 1</math> times the time of a single matrix multiplication, if it is done with the trivial algorithm (repeated multiplication). As this may be very time consuming, one generally prefers using exponentiation by squaring, which requires less than <math>2 \log_2 k</math> matrix multiplications, and is therefore much more efficient.

An easy case for exponentiation is that of a diagonal matrix. Since the product of diagonal matrices amounts to simply multiplying corresponding diagonal elements together, the <math>k</math>th power of a diagonal matrix is obtained by raising the entries to the power <math>k</math>:

:<math>
\begin{bmatrix}
a_{11} & 0 & \cdots & 0 \\
0 & a_{22} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & a_{nn}
\end{bmatrix}^k =
\begin{bmatrix}
a_{11}^k & 0 & \cdots & 0 \\
0 & a_{22}^k & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & a_{nn}^k
\end{bmatrix}.
</math>
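Exponentiation by squaring can be sketched as follows. The code is a simple iterative variant that does not try to minimise the exact number of multiplications; the names matmul and mat_pow are illustrative, and matrices are assumed to be square lists of rows.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(a, k):
    """Return a**k for a square matrix a and a nonnegative integer k."""
    if k < 0:
        raise ValueError("k must be nonnegative")
    n = len(a)
    result = [[int(i == j) for j in range(n)] for i in range(n)]  # identity, a**0
    base = a
    while k > 0:
        if k & 1:                  # current binary digit of k is 1
            result = matmul(result, base)
        base = matmul(base, base)  # square for the next binary digit
        k >>= 1
    return result


F = [[1, 1],
     [1, 0]]
print(mat_pow(F, 10))                    # [[89, 55], [55, 34]]
print(mat_pow([[2, 0], [0, 3]], 3))      # [[8, 0], [0, 27]]
```

The loop runs once per binary digit of the exponent, so the number of matrix multiplications grows with the logarithm of the exponent rather than with the exponent itself. The second call illustrates the diagonal case, where each diagonal entry is simply raised to the power.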

Abstract algebra

The definition of matrix product requires that the entries belong to a semiring, and does not require multiplication of elements of the semiring to be commutative. In many applications, the matrix elements belong to a field, although the tropical semiring is also a common choice for graph shortest path problems. Even in the case of matrices over fields, the product is not commutative in general, although it is associative and is distributive over matrix addition. The identity matrices (which are the square matrices whose entries are zero outside of the main diagonal and 1 on the main diagonal) are identity elements of the matrix product. It follows that the <math>n \times n</math> matrices over a ring form a ring, which is noncommutative except if <math>n = 1</math> and the ground ring is commutative.
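To illustrate the tropical case, the sketch below replaces addition by the minimum and multiplication by ordinary addition in the definition of the product. The small weighted graph, the name min_plus_product, and the use of floating-point infinity for a missing edge are assumptions made for the example.

```python
INF = float("inf")   # plays the role of the semiring's zero: "no edge"


def min_plus_product(a, b):
    """Matrix product over the tropical (min, +) semiring."""
    n = len(b)
    return [[min(a[i][k] + b[k][j] for k in range(n))
             for j in range(len(b[0]))] for i in range(len(a))]


# W[i][j] is the weight of the edge from vertex i to vertex j,
# with 0 on the diagonal and INF where there is no edge.
W = [[0,   4,   INF],
     [INF, 0,   1],
     [2,   INF, 0]]

# Entry (i, j) of the tropical square is the cheapest path from i to j
# that uses at most two edges.
print(min_plus_product(W, W))   # [[0, 4, 5], [3, 0, 1], [2, 6, 0]]
```

For a graph with three vertices, paths of at most two edges already cover every shortest path, so the tropical square of W contains all shortest-path distances in this example.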

A square matrix may have a multiplicative inverse, called an inverse matrix. In the common case where the entries belong to a commutative ring <math>R</math>, a matrix has an inverse if and only if its determinant has a multiplicative inverse in <math>R</math>. The determinant of a product of square matrices is the product of the determinants of the factors. The <math>n \times n</math> matrices that have an inverse form a group under matrix multiplication, the subgroups of which are called matrix groups. Many classical groups (including all finite groups) are isomorphic to matrix groups; this is the starting point of the theory of group representations.

Matrices are the morphisms of a category, the category of matrices. The objects are the natural numbers that measure the size of matrices, and the composition of morphisms is matrix multiplication. The source of a morphism is the number of columns of the corresponding matrix, and the target is the number of rows.

Computational complexity

Figure: Improvement of estimates of the exponent <math>\omega</math> over time for the computational complexity of matrix multiplication <math>O(n^\omega)</math>.

The matrix multiplication algorithm that results from the definition requires, in the worst case, <math>n^3</math> multiplications and <math>(n-1)n^2</math> additions of scalars to compute the product of two square <math>n \times n</math> matrices. Its computational complexity is therefore <math>O(n^3)</math>, in a model of computation for which the scalar operations take constant time.

Rather surprisingly, this complexity is not optimal, as shown in 1969 by Volker Strassen, who provided an algorithm, now called Strassen's algorithm, with a complexity of <math>O( n^{\log_{2}7}) \approx O(n^{2.8074}).</math>

Strassen's algorithm can be parallelized to further improve the performance.
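The saving in Strassen's method comes from multiplying two 2×2 matrices with seven multiplications instead of eight. The sketch below writes out one such step for scalar entries, using the standard textbook formulas, which are not given in this article; the function name strassen_2x2 is illustrative. Applying the same identities recursively to matrix blocks instead of scalars leads to the stated exponent <math>\log_2 7</math>.

```python
def strassen_2x2(a, b):
    """Multiply two 2 x 2 matrices using seven scalar multiplications."""
    (a11, a12), (a21, a22) = a
    (b11, b12), (b21, b22) = b

    # The seven products; left factors come from a, right factors from b.
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)

    # Recombine the products into the four entries of the result.
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]


A = [[1, 2],
     [3, 4]]
B = [[5, 6],
     [7, 8]]
print(strassen_2x2(A, B))   # [[19, 22], [43, 50]]
```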

At the time of writing, the best peer-reviewed matrix multiplication algorithm is by Virginia Vassilevska Williams, Yinzhan Xu, Zixuan Xu, and Renfei Zhou and has complexity <math>O(n^{2.371552})</math>.

It is not known whether matrix multiplication can be performed in <math>n^{2 + o(1)}</math> time. This would be optimal, since one must read the <math>n^2</math> elements of a matrix in order to multiply it with another matrix.

Since matrix multiplication forms the basis for many algorithms, and many operations on matrices even have the same complexity as matrix multiplication (up to a multiplicative constant), the computational complexity of matrix multiplication appears throughout numerical linear algebra and theoretical computer science.

Generalizations

Other types of products of matrices include:

Block matrix operations
Cracovian product, defined as <math>\mathbf{A} \wedge \mathbf{B} = \mathbf{B}^\mathsf{T}\mathbf{A}</math>
Frobenius inner product, the dot product of matrices considered as vectors, or, equivalently, the sum of the entries of the Hadamard product
Hadamard product of two matrices of the same size, resulting in a matrix of the same size, which is the product entry-by-entry
Kronecker product or tensor product, the generalization to any size of the outer product
Khatri–Rao product and face-splitting product
Outer product, also called dyadic product or tensor product of two column matrices, which is <math>\mathbf{a}\mathbf{b}^\mathsf{T}</math>
Scalar multiplication
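Several of the products listed above are short to state in code. The following Python sketch uses lists of rows for matrices and plain lists for column vectors; the function names are illustrative and not taken from any library.

```python
def hadamard(a, b):
    """Entry-by-entry product of two matrices of the same size."""
    return [[x * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def frobenius(a, b):
    """Frobenius inner product: sum of the entries of the Hadamard product."""
    return sum(sum(row) for row in hadamard(a, b))


def outer(a, b):
    """Outer product a b^T of two column vectors given as plain lists."""
    return [[x * y for y in b] for x in a]


def kronecker(a, b):
    """Kronecker product: every entry of a is replaced by that entry times b."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


A = [[1, 2],
     [3, 4]]
B = [[0, 1],
     [1, 0]]
print(hadamard(A, B))        # [[0, 2], [3, 0]]
print(frobenius(A, B))       # 5
print(outer([1, 2], [3, 4])) # [[3, 4], [6, 8]]
print(kronecker(A, B))       # [[0, 1, 0, 2], [1, 0, 2, 0], [0, 3, 0, 4], [3, 0, 4, 0]]
```

Applying kronecker to a column matrix and a row matrix reproduces the outer product, which is the sense in which the Kronecker product generalizes it.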

Notes

References

Henry Cohn, Robert Kleinberg, Balázs Szegedy, and Chris Umans. Group-theoretic Algorithms for Matrix Multiplication. Proceedings of the 46th Annual Symposium on Foundations of Computer Science, 23–25 October 2005, Pittsburgh, PA, IEEE Computer Society, pp. 379–388.
Henry Cohn, Chris Umans. A Group-theoretic Approach to Fast Matrix Multiplication. Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science, 11–14 October 2003, Cambridge, MA, IEEE Computer Society, pp. 438–449.
Knuth, D.E., The Art of Computer Programming Volume 2: Seminumerical Algorithms. Addison-Wesley Professional; 3rd edition (November 14, 1997), p. 501.
Ran Raz. On the complexity of matrix product. In Proceedings of the thirty-fourth annual ACM symposium on Theory of computing. ACM Press, 2002.
Robinson, Sara, Toward an Optimal Algorithm for Matrix Multiplication, SIAM News 38(9), November 2005.
Strassen, Volker, Gaussian Elimination is not Optimal, Numer. Math. 13, pp. 354–356, 1969.
