JPEG - WikiHQ

:<math>B_{j,k} = \mathrm{round} \left( \frac{G_{j,k}}{Q_{j,k}} \right) \mbox{ for } j=0,1,2,\ldots,7; k=0,1,2,\ldots,7</math>

where <math>G</math> is the matrix of unquantized DCT coefficients, <math>Q</math> is the quantization matrix above, and <math>B</math> is the matrix of quantized DCT coefficients.

Using this quantization matrix with the DCT coefficient matrix from above results in:

frame|Left: a final image is built up from a series of basis functions. Right: each of the DCT basis functions that comprise the image, and the corresponding weighting coefficient. Middle: the basis function, after multiplication by the coefficient: this component is added to the final image. For clarity, the 8×8 macroblock in this example is magnified by 10x using bilinear interpolation.

:<math>B=
\left[
\begin{array}{rrrrrrrr}
-26 & -3 & -6 & 2 & 2 & -1 & 0 & 0 \\
0 & -2 & -4 & 1 & 1 & 0 & 0 & 0 \\
-3 & 1 & 5 & -1 & -1 & 0 & 0 & 0 \\
-3 & 1 & 2 & -1 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{array}
\right].
</math>

For example, using −415 (the DC coefficient) and rounding to the nearest integer:

:<math>\mathrm{round}\left(\frac{-415.37}{16}\right) = \mathrm{round}\left(-25.96\right) = -26.</math>

Notice that most of the higher-frequency elements of the sub-block (i.e., those with an x or y spatial frequency greater than 4) are quantized into zero values.
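The element-wise division and rounding can be sketched in Python. This is a minimal illustration, not a reference implementation: the names `quantize` and `round_half_away` are hypothetical, the tie-breaking rule (halves round away from zero) is an assumption because the text only says "rounding to the nearest integer", and apart from the DC pair (−415.37 and 16) the small input matrices are illustrative values rather than data from the example block.

```python
import math

def round_half_away(x):
    """Round to the nearest integer; ties go away from zero (assumed tie rule)."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))

def quantize(G, Q):
    """Return B with B[j][k] = round(G[j][k] / Q[j][k])."""
    return [[round_half_away(g / q) for g, q in zip(g_row, q_row)]
            for g_row, q_row in zip(G, Q)]

# The DC coefficient and its divisor come from the worked example;
# the other three entries are illustrative only.
G = [[-415.37, 3.0], [-7.9, 0.4]]
Q = [[16, 11], [12, 12]]
B = quantize(G, Q)
print(B[0][0])  # -26, matching the worked example
```

Small coefficients divided by comparatively large divisors collapse to zero, which is the effect the paragraph above describes for the higher-frequency elements.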

Entropy coding

thumb|right|Zigzag ordering of JPEG image components

Entropy coding is a special form of lossless data compression. It involves arranging the image components in a "zigzag" order that groups similar frequencies together, applying a run-length encoding (RLE) algorithm to the runs of zeros, and then using Huffman coding on what is left.

The JPEG standard also allows, but does not require, decoders to support the use of arithmetic coding, which is mathematically superior to Huffman coding. However, this feature has rarely been used, as it was historically covered by patents requiring royalty-bearing licenses, and because it is slower to encode and decode compared to Huffman coding. Arithmetic coding typically makes files about 5–7% smaller.

The previous quantized DC coefficient is used to predict the current quantized DC coefficient. The difference between the two is encoded rather than the actual value. The encoding of the 63 quantized AC coefficients does not use such prediction differencing.
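The DC prediction step can be illustrated with a short Python sketch. The function names and the sample DC values are hypothetical, and starting the predictor at 0 for the first block is stated here as an assumption since the text does not say how the first block is handled.

```python
def dc_differences(dc_values, initial_prediction=0):
    """Replace each quantized DC value by its difference from the previous one."""
    diffs = []
    prev = initial_prediction  # assumed starting predictor
    for dc in dc_values:
        diffs.append(dc - prev)
        prev = dc
    return diffs

def dc_reconstruct(diffs, initial_prediction=0):
    """Invert dc_differences by accumulating the differences."""
    values = []
    prev = initial_prediction
    for d in diffs:
        prev += d
        values.append(prev)
    return values

# Hypothetical quantized DC values of four consecutive blocks.
dc = [-26, -24, -24, -29]
diffs = dc_differences(dc)
print(diffs)  # [-26, 2, 0, -5]
```

Because neighbouring blocks tend to have similar average brightness, the differences are usually smaller in magnitude than the DC values themselves.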

The zigzag sequence for the above quantized coefficients is shown below. (The format shown is just for ease of understanding/viewing.)

:{| style="text-align: right"
|-
|style="width: 2em"| −26 || style="width: 2em"| || style="width: 2em"| || style="width: 2em"| || style="width: 2em"| || style="width: 2em"| || style="width: 2em"| || style="width: 2em"|
|-
| −3 || 0
|-
| −3 || −2 || −6
|-
| 2 || −4 || 1 || −3
|-
| 1 || 1 || 5 || 1 || 2
|-
| −1 || 1 || −1 || 2 || 0 || 0
|-
| 0 || 0 || 0 || −1 || −1 || 0 || 0
|-
| 0 || 0 || 0 || 0 || 0 || 0 || 0 || 0
|-
| 0 || 0 || 0 || 0 || 0 || 0 || 0
|-
| 0 || 0 || 0 || 0 || 0 || 0
|-
| 0 || 0 || 0 || 0 || 0
|-
| 0 || 0 || 0 || 0
|-
| 0 || 0 || 0
|-
| 0 || 0
|-
| 0
|}

If the i-th block is represented by <math>B_i</math> and positions within each block are represented by <math>(p,q)</math> where <math>p = 0, 1, \ldots, 7</math> and <math>q = 0, 1, \ldots, 7</math>, then any coefficient in the DCT image can be represented as <math>B_i (p,q)</math>. Thus, in the above scheme, the order of encoding coefficients (for the i-th block) is <math>B_i (0,0)</math>, <math>B_i (0,1)</math>, <math>B_i (1,0)</math>, <math>B_i (2,0)</math>, <math>B_i (1,1)</math>, <math>B_i (0,2)</math>, <math>B_i (0,3)</math>, <math>B_i (1,2)</math> and so on.
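The traversal order can be generated programmatically. The following Python sketch is one way to do it, walking the anti-diagonals of the block and alternating direction; the function name `zigzag_order` is hypothetical, and real codecs typically use a precomputed lookup table instead. The matrix is the quantized block <math>B</math> from the example.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n-by-n block in zigzag order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col selects one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        # Odd diagonals are walked with the row increasing,
        # even diagonals with the row decreasing.
        if s % 2 == 0:
            rows = reversed(rows)
        order.extend((r, s - r) for r in rows)
    return order

# Quantized coefficients of the example block.
B = [
    [-26, -3, -6,  2,  2, -1, 0, 0],
    [  0, -2, -4,  1,  1,  0, 0, 0],
    [ -3,  1,  5, -1, -1,  0, 0, 0],
    [ -3,  1,  2, -1,  0,  0, 0, 0],
    [  1,  0,  0,  0,  0,  0, 0, 0],
    [  0,  0,  0,  0,  0,  0, 0, 0],
    [  0,  0,  0,  0,  0,  0, 0, 0],
    [  0,  0,  0,  0,  0,  0, 0, 0],
]
zz = [B[r][c] for r, c in zigzag_order()]
print(zigzag_order()[:6])  # [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
```

The list `zz` starts with the DC coefficient −26 and ends in a long run of zeros, which is what makes the subsequent run-length step effective.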

thumb|upright=1.35|Baseline sequential JPEG encoding and decoding processes

This encoding mode is called baseline sequential encoding. Baseline JPEG also supports progressive encoding. While sequential encoding encodes the coefficients of a single block at a time (in a zigzag manner), progressive encoding encodes a batch of similarly positioned coefficients of all blocks in one go (called a scan), followed by the next batch of coefficients of all blocks, and so on. For example, if the image is divided into N 8×8 blocks <math>B_0,B_1,B_2,\ldots,B_{N-1}</math>, then a 3-scan progressive encoding encodes the DC component, <math>B_i (0,0)</math>, for all blocks, i.e., for all <math>i = 0, 1, 2, \ldots, N-1</math>, in the first scan. This is followed by the second scan, which encodes a few more components of all blocks (assuming four more components, they are <math>B_i (0,1)</math> to <math>B_i (1,1)</math>, still in a zigzag manner), so the sequence is <math>B_0 (0,1),B_0 (1,0),B_0 (2,0),B_0 (1,1),B_1 (0,1),B_1 (1,0),\ldots,B_{N-1} (2,0),B_{N-1} (1,1)</math>. The last scan then encodes all the remaining coefficients of all blocks.
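The difference between the two orderings can be made concrete with a small Python sketch. It assumes, purely for illustration, that each scan is described by an inclusive range of zigzag positions (0 is the DC coefficient, 1 to 4 are <math>B_i (0,1)</math> to <math>B_i (1,1)</math>); the function name and this representation are hypothetical and do not model the actual scan headers of the file format.

```python
def progressive_order(num_blocks, scans):
    """Yield (scan_number, block_index, zigzag_position) in encoding order.

    `scans` is a list of inclusive (first, last) zigzag position ranges.
    """
    for scan_no, (first, last) in enumerate(scans):
        for block in range(num_blocks):          # every block is visited in each scan
            for z in range(first, last + 1):     # only this scan's coefficients
                yield scan_no, block, z

# The 3-scan split from the text: DC only, four more coefficients, the rest.
scans = [(0, 0), (1, 4), (5, 63)]
order = list(progressive_order(2, scans))
print(order[:7])
# [(0, 0, 0), (0, 1, 0), (1, 0, 1), (1, 0, 2), (1, 0, 3), (1, 0, 4), (1, 1, 1)]
```

Sequential encoding corresponds to a single scan `[(0, 63)]`, in which all 64 positions of one block are emitted before moving to the next block.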

Once all similarly positioned coefficients have been encoded, the next position to be encoded is the one occurring next in the zigzag traversal as indicated in the figure above. Baseline progressive JPEG encoding usually gives somewhat better compression than baseline sequential JPEG, because each "scan" or "pass" (which contains similarly positioned coefficients) can use different Huffman tables (see below) tailored to its frequencies, though the difference is not large.

In the rest of the article, it is assumed that the coefficient pattern generated is due to sequential mode.

In order to encode the above generated coefficient pattern, JPEG uses Huffman encoding. The JPEG standard provides general-purpose Huffman tables; encoders may also choose to generate Huffman tables optimized for the actual frequency distributions in images being encoded.

The process of encoding the zig-zag quantized data begins with a run-length encoding explained below, where:

<math>x</math> is the non-zero, quantized AC coefficient.
RUNLENGTH is the number of zeroes that came before this non-zero AC coefficient.
SIZE is the number of bits required to represent <math>x</math>.
AMPLITUDE is the bit-representation of <math>x</math>.

The run-length encoding works by examining each non-zero AC coefficient and determining how many zeroes separate it from the previous non-zero AC coefficient. With this information, two symbols are created:

:{| style="text-align: center" class="wikitable"
|-
! Symbol 1 || Symbol 2
|-
| (RUNLENGTH, SIZE) || (AMPLITUDE)
|}

Both RUNLENGTH and SIZE share a single byte, meaning that each contains only four bits of information. The higher four bits give the number of zeroes, while the lower four bits give the number of bits necessary to encode the value of the coefficient.
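A short Python sketch of the byte layout and of the SIZE and AMPLITUDE fields follows. The function names are hypothetical. The handling of negative values (storing <math>x + 2^{\mathrm{SIZE}} - 1</math> in SIZE bits) is the convention used by JPEG Huffman coding and is given here as background, since the text above does not spell it out.

```python
def size_of(x):
    """SIZE: number of bits needed for the magnitude of x (0 when x == 0)."""
    return abs(x).bit_length()

def amplitude_bits(x):
    """AMPLITUDE as a bit string of length SIZE.

    Positive values are written as-is; negative values are written as
    x + 2**SIZE - 1, so their leading bit is always 0.
    """
    size = size_of(x)
    if size == 0:
        return ""
    value = x if x > 0 else x + (1 << size) - 1
    return format(value, "0{}b".format(size))

def pack_symbol1(runlength, size):
    """Pack RUNLENGTH into the high four bits and SIZE into the low four bits."""
    if not (0 <= runlength <= 15 and 0 <= size <= 15):
        raise ValueError("RUNLENGTH and SIZE must each fit in four bits")
    return (runlength << 4) | size

print(size_of(-3), amplitude_bits(-3))   # 2 00
print(size_of(5), amplitude_bits(5))     # 3 101
print(hex(pack_symbol1(5, 1)))           # 0x51
```

With this layout the byte value itself is the symbol that gets a Huffman code, while the AMPLITUDE bits are appended to the bitstream uncoded.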

An immediate consequence is that Symbol 1 can record at most 15 zeroes preceding the non-zero AC coefficient. However, JPEG defines two special Huffman code words. One ends the sequence early when all the remaining coefficients are zero (called "End-of-Block" or "EOB"); the other is used when the run of zeroes goes beyond 15 before a non-zero AC coefficient is reached. In such a case, where 16 zeroes are encountered before a given non-zero AC coefficient, Symbol 1 is encoded "specially" as: (15, 0)(0).

The overall process continues until "EOB" denoted by (0, 0) is reached.

With this in mind, the sequence from earlier becomes:

:(0, 2)(-3);(1, 2)(-3);(0, 2)(-2);(0, 3)(-6);(0, 2)(2);(0, 3)(-4);(0, 1)(1);(0, 2)(-3);(0, 1)(1);(0, 1)(1);

:(0, 3)(5);(0, 1)(1);(0, 2)(2);(0, 1)(-1);(0, 1)(1);(0, 1)(-1);(0, 2)(2);(5, 1)(-1);(0, 1)(-1);(0, 0);

(The first value in the matrix, −26, is the DC coefficient; it is not encoded the same way. See above.)
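The run-length step can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not a codec: the function name `run_length_symbols` and the representation of each pair as `((RUNLENGTH, SIZE), AMPLITUDE)` are hypothetical, EOB is represented with amplitude `None`, and the 16-zero symbol carries amplitude 0 to mirror the (15, 0)(0) notation. The input is the list of 63 AC coefficients in zigzag order from the example.

```python
def run_length_symbols(ac):
    """Convert zigzag-ordered AC coefficients into ((RUNLENGTH, SIZE), AMPLITUDE) pairs."""
    # Trailing zeros are not run-length coded; one EOB symbol stands for all of them.
    last = len(ac)
    while last > 0 and ac[last - 1] == 0:
        last -= 1

    symbols = []
    run = 0
    for x in ac[:last]:
        if x == 0:
            run += 1
            if run == 16:                      # run no longer fits in four bits
                symbols.append(((15, 0), 0))   # special 16-zero symbol
                run = 0
        else:
            symbols.append(((run, abs(x).bit_length()), x))
            run = 0
    if last < len(ac):
        symbols.append(((0, 0), None))         # EOB
    return symbols

# AC coefficients of the example block in zigzag order (the DC value -26 is excluded).
ac = [-3, 0, -3, -2, -6, 2, -4, 1, -3, 1, 1, 5, 1, 2,
      -1, 1, -1, 2, 0, 0, 0, 0, 0, -1, -1] + [0] * 38
symbols = run_length_symbols(ac)
print(len(symbols))   # 20
print(symbols[17])    # ((5, 1), -1)
```

The 20 pairs produced match the sequence listed above, ending with the (0, 0) EOB symbol.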

From here, frequency calculations are made based on occurrences of the coefficients. In our example block, most of the quantized coefficients are small numbers that are not preceded immediately by a zero coefficient. These more-frequent cases will be represented by shorter code words.
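To illustrate how frequent symbols end up with short code words, the following Python sketch counts the Symbol 1 values of the example block and derives Huffman code lengths from the counts. It is a generic textbook Huffman construction under stated assumptions: the function name is hypothetical, and the additional constraints that real JPEG tables obey (such as the 16-bit limit on code length) are not modelled.

```python
import heapq
from collections import Counter

def huffman_code_lengths(freqs):
    """Return {symbol: code length} for a Huffman code built from symbol counts."""
    if len(freqs) == 1:
        return {sym: 1 for sym in freqs}
    # Heap entries: (total count, unique tie-breaker, {symbol: depth so far}).
    heap = [(count, i, {sym: 0}) for i, (sym, count) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, d1 = heapq.heappop(heap)   # the two least frequent subtrees
        c2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in d1.items()}
        merged.update({s: depth + 1 for s, depth in d2.items()})
        heapq.heappush(heap, (c1 + c2, tie, merged))
        tie += 1
    return heap[0][2]

# Symbol 1 values (RUNLENGTH, SIZE) of the example sequence, including EOB.
symbol1 = [(0, 2), (1, 2), (0, 2), (0, 3), (0, 2), (0, 3), (0, 1), (0, 2), (0, 1), (0, 1),
           (0, 3), (0, 1), (0, 2), (0, 1), (0, 1), (0, 1), (0, 2), (5, 1), (0, 1), (0, 0)]
freqs = Counter(symbol1)
lengths = huffman_code_lengths(freqs)
total_bits = sum(freqs[s] * lengths[s] for s in freqs)
print(freqs[(0, 1)], lengths[(0, 1)])  # 8 1
print(total_bits)                      # 43
```

The most common symbol, (0, 1), occurs 8 times and receives a 1-bit code, while the rare symbols (1, 2) and (5, 1) receive the longest codes.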

Compression ratio and artifacts

256px|thumb|This image shows the pixels that are different between a non-compressed image and the same image JPEG compressed with a quality setting of 50. Darker means a larger difference. Note especially the changes occurring near sharp edges and having a block-like shape.

256px|thumb|The original image

thumb|192px|The compressed 8×8 squares are visible in the scaled-up picture, together with other visual artifacts of the lossy compression.

The resulting compression ratio can be varied according to need by being more or less aggressive in the divisors used in the quantization phase. A compression ratio of 10:1 usually results in an image that cannot be distinguished by eye from the original. A compression ratio of 100:1 is usually possible, but the result will show distinct artifacts compared to the original. The appropriate level of compression depends on the use to which the image will be put.