Kronecker product

In mathematics, the Kronecker product, sometimes denoted by &otimes;, is an operation on two matrices of arbitrary size resulting in a block matrix. It is a specialization of the tensor product (which is denoted by the same symbol) from vectors to matrices and gives the matrix of the tensor product linear map with respect to a standard choice of basis. The Kronecker product is to be distinguished from the usual matrix multiplication, which is an entirely different operation. The Kronecker product is also sometimes called matrix direct product.

The Kronecker product is named after the German mathematician Leopold Kronecker (1823–1891), even though there is little evidence that he was the first to define and use it. The Kronecker product has also been called the Zehfuss matrix, and the Zehfuss product, after , who in 1858 described this matrix operation, but Kronecker product is currently the most widely used term. The misattribution to Kronecker rather than Zehfuss was due to Kurt Hensel.

Definition

If A is an matrix and B is a matrix, then the Kronecker product is the block matrix:

: <math>\mathbf{A}\otimes\mathbf{B} = \begin{bmatrix}

a_{11} \mathbf{B} & \cdots & a_{1n}\mathbf{B} \\

\vdots & \ddots & \vdots \\

a_{m1} \mathbf{B} & \cdots & a_{mn} \mathbf{B}

\end{bmatrix}, </math>

more explicitly:

: <math>{\mathbf{A}\otimes\mathbf{B = \begin{bmatrix}

a_{11} b_{11} & a_{11} b_{12} & \cdots & a_{11} b_{1q} &

\cdots & \cdots & a_{1n} b_{11} & a_{1n} b_{12} & \cdots & a_{1n} b_{1q} \\

a_{11} b_{21} & a_{11} b_{22} & \cdots & a_{11} b_{2q} &

\cdots & \cdots & a_{1n} b_{21} & a_{1n} b_{22} & \cdots & a_{1n} b_{2q} \\

\vdots & \vdots & \ddots & \vdots & & & \vdots & \vdots & \ddots & \vdots \\

a_{11} b_{p1} & a_{11} b_{p2} & \cdots & a_{11} b_{pq} &

\cdots & \cdots & a_{1n} b_{p1} & a_{1n} b_{p2} & \cdots & a_{1n} b_{pq} \\

\vdots & \vdots & & \vdots & \ddots & & \vdots & \vdots & & \vdots \\

\vdots & \vdots & & \vdots & & \ddots & \vdots & \vdots & & \vdots \\

a_{m1} b_{11} & a_{m1} b_{12} & \cdots & a_{m1} b_{1q} &

\cdots & \cdots & a_{mn} b_{11} & a_{mn} b_{12} & \cdots & a_{mn} b_{1q} \\

a_{m1} b_{21} & a_{m1} b_{22} & \cdots & a_{m1} b_{2q} &

\cdots & \cdots & a_{mn} b_{21} & a_{mn} b_{22} & \cdots & a_{mn} b_{2q} \\

\vdots & \vdots & \ddots & \vdots & & & \vdots & \vdots & \ddots & \vdots \\

a_{m1} b_{p1} & a_{m1} b_{p2} & \cdots & a_{m1} b_{pq} &

\cdots & \cdots & a_{mn} b_{p1} & a_{mn} b_{p2} & \cdots & a_{mn} b_{pq}

\end{bmatrix}. </math>

Using <math>/\!/</math> and <math>\%</math> to denote truncating integer division and remainder, respectively, and numbering the matrix elements starting from 0, one obtains

: <math>(A\otimes B)_{pr+v, qs+w} = a_{rs} b_{vw}</math>

: <math>(A\otimes B)_{i, j} = a_{i /\!/ p, j /\!/ q} b_{i \%p, j \% q}.</math>

For the usual numbering starting from 1, one obtains

: <math>

(A\otimes B)_{p(r-1)+v, q(s-1)+w} = a_{rs} b_{vw}

</math>

: <math>

(A\otimes B)_{i, j} = a_{\lceil i/p \rceil,\lceil j/q \rceil} b_{(i-1)\%p +1, (j-1)\%q + 1}.

</math>

If A and B represent linear transformations and , respectively, then the tensor product of the two maps is a map represented by .

Examples

: <math>

\begin{bmatrix}

1 & 2 \\

3 & 4 \\

\end{bmatrix} \otimes

\begin{bmatrix}

0 & 5 \\

6 & 7 \\

\end{bmatrix} =

\begin{bmatrix}

1 \begin{bmatrix}

0 & 5 \\

6 & 7 \\

\end{bmatrix} &

2 \begin{bmatrix}

0 & 5 \\

6 & 7 \\

\end{bmatrix} \\

3 \begin{bmatrix}

0 & 5 \\

6 & 7 \\

\end{bmatrix} &

4 \begin{bmatrix}

0 & 5 \\

6 & 7 \\

\end{bmatrix} \\

\end{bmatrix} =

\left[\begin{array}{cc|cc}

1\times 0 & 1\times 5 & 2\times 0 & 2\times 5 \\

1\times 6 & 1\times 7 & 2\times 6 & 2\times 7 \\ \hline

3\times 0 & 3\times 5 & 4\times 0 & 4\times 5 \\

3\times 6 & 3\times 7 & 4\times 6 & 4\times 7 \\

\end{array}\right] =

\left[\begin{array}{cc|cc}

0 & 5 & 0 & 10 \\

6 & 7 & 12 & 14 \\ \hline

0 & 15 & 0 & 20 \\

18 & 21 & 24 & 28

\end{array}\right].

</math>

Similarly:

: <math>

\begin{bmatrix}

1 & -4 & 7 \\

-2 & 3 & 3

\end{bmatrix} \otimes

\begin{bmatrix}

8 & -9 & -6 & 5 \\

1 & -3 & -4 & 7 \\

2 & 8 & -8 & -3 \\

1 & 2 & -5 & -1

\end{bmatrix} =

\left[\begin{array}{cccc|cccc|cccc}

8 & -9 & -6 & 5 & -32 & 36 & 24 & -20 & 56 & -63 & -42 & 35 \\

1 & -3 & -4 & 7 & -4 & 12 & 16 & -28 & 7 & -21 & -28 & 49 \\

2 & 8 & -8 & -3 & -8 & -32 & 32 & 12 & 14 & 56 & -56 & -21 \\

1 & 2 & -5 & -1 & -4 & -8 & 20 & 4 & 7 & 14 & -35 & -7 \\ \hline

-16 & 18 & 12 & -10 & 24 & -27 & -18 & 15 & 24 & -27 & -18 & 15 \\

-2 & 6 & 8 & -14 & 3 & -9 & -12 & 21 & 3 & -9 & -12 & 21 \\

-4 & -16 & 16 & 6 & 6 & 24 & -24 & -9 & 6 & 24 & -24 & -9 \\

-2 & -4 & 10 & 2 & 3 & 6 & -15 & -3 & 3 & 6 & -15 & -3

\end{array}\right]

</math>

Properties

Relations to other matrix operations

) = \exp(\mathbf{N}) \otimes \exp(\mathbf{M}) </math>

Kronecker sums appear naturally in physics when considering ensembles of non-interacting systems.

If Hr is the Hamiltonian of the rth such system. Then the total Hamiltonian of the ensemble is

: <math>H_{\operatorname{Tot=\overline{\bigoplus_r} H^r .</math>

| 10 = Vectorization of a Kronecker product:

Let <math>A</math> be an <math>m\times n</math> matrix and <math>B</math> a <math>p \times q</math> matrix. When the order of the Kronecker product and vectorization is interchanged, the two operations can be linked linearly through a function that involves the commutation matrix, <math>K_{qm} </math>. That is, <math>\operatorname{vec}(\operatorname{Kron}(A, B))</math> and <math>\operatorname{Kron}(\operatorname{vec}A,\operatorname{vec}B)</math> have the following relationship:

: <math>\operatorname{vec}(A\otimes B) = (I_n \otimes K_{qm}\otimes I_p)(\operatorname{vec}A \otimes \operatorname{vec}B). </math>

Furthermore, the above relation can be rearranged in terms of either <math>\operatorname{vec}A</math> or <math>\operatorname{vec}B</math> as follows:

: <math>\operatorname{vec}(A\otimes B)= (I_n \otimes G)\operatorname{vec}A = (H\otimes I_p)\operatorname{vec}B, </math>

where

: <math>G = (K_{qm}\otimes I_p)(I_m \otimes \operatorname{vec}B) \text{ and } H=(I_n\otimes K_{qm})(\operatorname{vec}A \otimes I_q). </math>

| 11 = Outer Product:

If <math>x \in \mathbb{R}^n</math> and <math>y \in \mathbb{R}^m</math> are arbitrary vectors, then the outer product between <math>x</math> and <math>y</math> is defined as <math>xy^T</math>. The Kronecker product is related to the outer product by: <math>y\otimes x = \operatorname{vec}(xy^T)</math>.

Abstract properties

Matrix equations

The Kronecker product can be used to get a convenient representation for some matrix equations. Consider for instance the equation , where A, B and C are given matrices and the matrix X is the unknown. We can use the "vec trick" to rewrite this equation as

: <math>

\left(\mathbf{B}^\textsf{T} \otimes \mathbf{A}\right) \, \operatorname{vec}(\mathbf{X})

= \operatorname{vec}(\mathbf{AXB}) = \operatorname{vec}(\mathbf{C})

.</math>

Here, vec(X) denotes the vectorization of the matrix X, formed by stacking the columns of X into a single column vector.

It now follows from the properties of the Kronecker product that the equation has a unique solution, if and only if A and B are invertible .

If X and C are row-ordered into the column vectors u and v, respectively, then

: <math>

\mathbf{v} =

\left(\mathbf{A} \otimes \mathbf{B}^\textsf{T}\right)\mathbf{u}

.</math>

The reason is that

: <math>

\mathbf{v} =

\operatorname{vec}\left((\mathbf{AXB})^\textsf{T}\right) =

\operatorname{vec}\left(\mathbf{B}^\textsf{T}\mathbf{X}^\textsf{T}\mathbf{A}^\textsf{T}\right) =

\left(\mathbf{A} \otimes \mathbf{B}^\textsf{T}\right)\operatorname{vec}\left(\mathbf{X^\textsf{T\right) =

\left(\mathbf{A} \otimes \mathbf{B}^\textsf{T}\right)\mathbf{u}

.</math>

Applications

For an example of the application of this formula, see the article on the Lyapunov equation. This formula also comes in handy in showing that the matrix normal distribution is a special case of the multivariate normal distribution. This formula is also useful for representing 2D image processing operations in matrix-vector form.

Another example is when a matrix can be factored as a Kronecker product, then matrix multiplication can be performed faster by using the above formula. This can be applied recursively, as done in the radix-2 FFT and the Fast Walsh–Hadamard transform. Splitting a known matrix into the Kronecker product of two smaller matrices is known as the "nearest Kronecker product" problem, and can be solved exactly by using the SVD. To split a matrix into the Kronecker product of more than two matrices, in an optimal fashion, is a difficult problem and the subject of ongoing research; some authors cast it as a tensor decomposition problem.

In conjunction with the least squares method, the Kronecker product can be used as an accurate solution to the hand–eye calibration problem.

Two related matrix operations are the Tracy–Singh and Khatri–Rao products, which operate on partitioned matrices. Let the matrix A be partitioned into the blocks Aij and matrix B into the blocks Bkl, with of course , , and .

Tracy–Singh product

The Tracy–Singh product is defined as

: <math>\mathbf{A} \circ \mathbf{B} = \left(\mathbf{A}_{ij} \circ \mathbf{B}\right)_{ij} = \left(\left(\mathbf{A}_{ij} \otimes \mathbf{B}_{kl}\right)_{kl}\right)_{ij}</math>

which means that the (ij)-th subblock of the product is the matrix , of which the (k)-th subblock equals the matrix . Essentially the Tracy–Singh product is the pairwise Kronecker product for each pair of partitions in the two matrices.

For example, if A and B both are partitioned matrices e.g.:

: <math> \mathbf{A} =

\left[

\begin{array} {c | c}

\mathbf{A}_{11} & \mathbf{A}_{12} \\

\hline

\mathbf{A}_{21} & \mathbf{A}_{22}

\end{array}

\right]

\left[

\begin{array} {c c | c}

1 & 2 & 3 \\

4 & 5 & 6 \\

\hline

7 & 8 & 9

\end{array}

\right]

,\quad

\mathbf{B} =

\left[

\begin{array} {c | c}

\mathbf{B}_{11} & \mathbf{B}_{12} \\

\hline

\mathbf{B}_{21} & \mathbf{B}_{22}

\end{array}

\right]

\left[

\begin{array} {c | c c}

1 & 4 & 7 \\

\hline

2 & 5 & 8 \\

3 & 6 & 9

\end{array}

\right]

</math>

we get:

: <math>\begin{align}

\mathbf{A} \circ \mathbf{B}

={}& \left[\begin{array} {c | c}

\mathbf{A}_{11} \circ \mathbf{B} & \mathbf{A}_{12} \circ \mathbf{B} \\

\hline

\mathbf{A}_{21} \circ \mathbf{B} & \mathbf{A}_{22} \circ \mathbf{B}

\end{array}\right] \\

={} &\left[\begin{array} {c | c | c | c}

\mathbf{A}_{11} \otimes \mathbf{B}_{11} & \mathbf{A}_{11} \otimes \mathbf{B}_{12} & \mathbf{A}_{12} \otimes \mathbf{B}_{11} & \mathbf{A}_{12} \otimes \mathbf{B}_{12} \\

\hline

\mathbf{A}_{11} \otimes \mathbf{B}_{21} & \mathbf{A}_{11} \otimes \mathbf{B}_{22} & \mathbf{A}_{12} \otimes \mathbf{B}_{21} & \mathbf{A}_{12} \otimes \mathbf{B}_{22} \\

\hline

\mathbf{A}_{21} \otimes \mathbf{B}_{11} & \mathbf{A}_{21} \otimes \mathbf{B}_{12} & \mathbf{A}_{22} \otimes \mathbf{B}_{11} & \mathbf{A}_{22} \otimes \mathbf{B}_{12} \\

\hline

\mathbf{A}_{21} \otimes \mathbf{B}_{21} & \mathbf{A}_{21} \otimes \mathbf{B}_{22} & \mathbf{A}_{22} \otimes \mathbf{B}_{21} & \mathbf{A}_{22} \otimes \mathbf{B}_{22}

\end{array}\right] \\

={} &\left[\begin{array} {c c | c c c c | c | c c}

1 & 2 & 4 & 7 & 8 & 14 & 3 & 12 & 21 \\

4 & 5 & 16 & 28 & 20 & 35 & 6 & 24 & 42 \\

\hline

2 & 4 & 5 & 8 & 10 & 16 & 6 & 15 & 24 \\

3 & 6 & 6 & 9 & 12 & 18 & 9 & 18 & 27 \\

8 & 10 & 20 & 32 & 25 & 40 & 12 & 30 & 48 \\

12 & 15 & 24 & 36 & 30 & 45 & 18 & 36 & 54 \\

\hline

7 & 8 & 28 & 49 & 32 & 56 & 9 & 36 & 63 \\

\hline

14 & 16 & 35 & 56 & 40 & 64 & 18 & 45 & 72 \\

21 & 24 & 42 & 63 & 48 & 72 & 27 & 54 & 81

\end{array}\right].

\end{align}</math>

Khatri–Rao product

Block Kronecker product
Column-wise Khatri–Rao product

Face-splitting product

Mixed-products properties

: <math>\mathbf{A} \otimes (\mathbf{B}\bull \mathbf{C}) = (\mathbf{A}\otimes \mathbf{B}) \bull \mathbf{C} ,</math>

where <math>\bull</math> denotes the Face-splitting product.

: <math>(\mathbf{A} \bull \mathbf{B})(\mathbf{C} \otimes \mathbf{D}) = (\mathbf{A}\mathbf{C}) \bull (\mathbf{B} \mathbf{D}) ,</math>

Similarly:

: <math>(\mathbf{A} \bull \mathbf{L})(\mathbf{B} \otimes \mathbf{M}) \cdots (\mathbf{C} \otimes \mathbf{S}) = (\mathbf{A}\mathbf{B} \cdots \mathbf{C}) \bull (\mathbf{L}\mathbf{M} \cdots \mathbf{S}) ,</math>

: <math>\mathbf{c}^\textsf{T} \bull \mathbf{d}^\textsf{T} = \mathbf{c}^\textsf{T} \otimes \mathbf{d}^\textsf{T} , </math>

where <math>\mathbf c</math> and <math>\mathbf d</math> are vectors,

: <math>(\mathbf{A} \bull \mathbf{B})(\mathbf{c} \otimes \mathbf{d}) = (\mathbf{A}\mathbf{c}) \circ (\mathbf{B}\mathbf{d}) ,</math>

where <math>\mathbf c</math> and <math>\mathbf d</math> are vectors, and <math>\circ</math> denotes the Hadamard product.

Similarly:

: <math>(\mathbf{A} \bull \mathbf{B})(\mathbf{M}\mathbf{N}\mathbf{c} \otimes \mathbf{Q}\mathbf{P}\mathbf{d}) = (\mathbf{A}\mathbf{M}\mathbf{N}\mathbf{c}) \circ (\mathbf{B}\mathbf{Q}\mathbf{P}\mathbf{d}),</math>

: <math>\mathcal F(C^{(1)}x \star C^{(2)}y) = (\mathcal F C^{(1)} \bull \mathcal F C^{(2)})(x \otimes y)= \mathcal F C^{(1)}x \circ \mathcal F C^{(2)}y, </math>

where <math>\star</math> is vector convolution and <math>\mathcal F</math> is the Fourier transform matrix (this result is an evolving of count sketch properties),

Definition

Examples

Properties

Relations to other matrix operations

Abstract properties

Matrix equations

Applications

Related matrix operations

Tracy–Singh product

Khatri–Rao product

Face-splitting product