JPEG. Compression algorithm

Hello again! I came across this article, which I wrote back in May 2019. It continues a series of articles about WAVE and JPEG; the first one is here. This publication covers the image encoding algorithm and the format itself in general.

A pinch of history

A tablespoon of the Wikipedia article:

JPEG (Joint Photographic Experts Group) is one of the popular raster graphics formats used for storing photographic images and similar images.

This standard was developed by the Joint Photographic Experts Group back in 1991 for efficient image compression.

How images get from raw to JPEG

Some people think JPEG images are just Huffman-compressed raw data, but that's not the case. The data goes a long way before the final compression.

First, the color model is changed from RGB to YCbCr. There is even a special algorithm for this - here. In the steps that follow, Y is not touched, because it is responsible for brightness and any change to it would be noticeable.
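The color conversion can be sketched as follows. This is a minimal illustration that assumes the full-range coefficients commonly used with JFIF files; the function name `rgb_to_ycbcr` is made up for this example and is not taken from the article.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to 8-bit YCbCr.

    Assumption: full-range JFIF-style coefficients (derived from BT.601).
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 128
    cr = 0.5 * r - 0.418688 * g - 0.081312 * b + 128
    # Round to integers and clamp to the valid byte range 0..255.
    return tuple(min(255, max(0, round(v))) for v in (y, cb, cr))


print(rgb_to_ycbcr(255, 255, 255))  # white -> (255, 128, 128)
print(rgb_to_ycbcr(0, 0, 0))        # black -> (0, 128, 128)
```

Gray pixels land at Cb = Cr = 128, so all of their information sits in Y.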

The first thing done to the converted image is "thinning" (subsampling). It's easy to understand: a 2x2 block of pixels is taken, and the Cb and Cr values of its 4 pixels are each replaced by their average. So we save 6 bytes: instead of 4 Y, 4 Cb, 4 Cr we get 4 Y plus one Cb and one Cr shared by all of them (4 + 4 + 4 = 12; 4 + 1 + 1 = 6; 12 - 6 = 6). Even at the scale of a single 2x2 block, lossy compression with a ratio of 2:1 sounds solid, and it applies to the entire image, so the size is cut in half. We can use this technique thanks to our color perception: a person easily notices a difference in brightness, but not in color, if the color is averaged over a small block of pixels. Thinning can also be performed along a line, 4 pixels horizontally or vertically. The first option is used more often. If image quality is important, thinning is not performed at all.
A visual illustration of thinning (Habr would not let me insert a GIF): https://i.ibb.co/Rg5Th9H/150953010617579181.gif
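The 2x2 averaging described above can be sketched like this. It is a simplified illustration: the helper name `subsample_2x2` is hypothetical, the plane is a plain list of rows, and even width and height are assumed.

```python
def subsample_2x2(plane):
    """Replace every 2x2 block of a chroma plane with its rounded average.

    Assumption: `plane` is a list of equally long rows with even width
    and height.
    """
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[y][x] + plane[y][x + 1]
             + plane[y + 1][x] + plane[y + 1][x + 1] + 2) // 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


cb = [
    [100, 102, 50, 50],
    [104, 106, 50, 54],
]
print(subsample_2x2(cb))  # [[103, 51]]
```

The same function is applied to the Cb and Cr planes; the Y plane is stored at full resolution.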

The main part of the preparation

DCT

Now for the most difficult and necessary part. The whole picture is divided into 8×8 blocks (padding is used if the resolution is not a multiple of the block side).
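Splitting a plane into blocks might look like the sketch below. The article only says that padding is used; repeating the last column and row is an assumption made here for illustration, and the names `pad_to_blocks` and `split_blocks` are hypothetical.

```python
def pad_to_blocks(plane, block=8):
    """Pad a plane (list of rows) so both dimensions are multiples of `block`.

    Assumption: the padding repeats the last column and the last row.
    """
    height, width = len(plane), len(plane[0])
    pad_w = (-width) % block
    pad_h = (-height) % block
    rows = [row + [row[-1]] * pad_w for row in plane]
    rows += [list(rows[-1]) for _ in range(pad_h)]
    return rows


def split_blocks(plane, block=8):
    """Yield block x block tiles, left to right and top to bottom."""
    padded = pad_to_blocks(plane, block)
    for top in range(0, len(padded), block):
        for left in range(0, len(padded[0]), block):
            yield [row[left:left + block] for row in padded[top:top + block]]


# A 10x5 plane is padded to 16x8 and yields two 8x8 blocks.
plane = [[x + 10 * y for x in range(10)] for y in range(5)]
print(len(list(split_blocks(plane))))  # 2
```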

Now the DCT (Discrete Cosine Transform) is applied to each block. In this step everything superfluous is taken out of the picture. The DCT makes it possible to tell whether a given 8×8 block describes a monotonous part of the image (sky, a wall) or contains a complex structure (hair, symbols, etc.). Logically, 64 pixels of similar color can be described by just 1 value, because the block size is already known. There's the compression: 64 to 1.

The DCT turns the block into a spectrum. Where the values change abruptly, the corresponding coefficient becomes large in magnitude, and the sharper the transition, the larger it is. Where the coefficients are large, the picture has clear transitions in color and brightness; where they are small, the values of the YCbCr components change weakly (smoothly) within the block.
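To make the "block to spectrum" idea concrete, here is a deliberately naive 2D DCT-II of one 8×8 block in pure Python. It is a sketch for illustration rather than how real encoders do it (they use fast factorized transforms), and the name `dct_2d` is hypothetical. JPEG also subtracts 128 from every sample before the transform; that step is left out to keep the example short.

```python
import math

N = 8


def dct_2d(block):
    """Naive 2D DCT-II of an N x N block (O(N^4), illustration only)."""
    def scale(k):
        return math.sqrt(0.5) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = 0.25 * scale(u) * scale(v) * total
    return out


# A flat block: all the energy lands in the single top-left coefficient.
flat = dct_2d([[10] * N for _ in range(N)])
print(round(flat[0][0], 6))  # 80.0

# A block with a sharp vertical edge: a horizontal-frequency coefficient
# becomes large in magnitude.
edge = dct_2d([[0] * 4 + [100] * 4 for _ in range(N)])
print(abs(edge[0][1]) > 300)  # True
```

For the flat block every coefficient except the first is zero up to floating-point noise, which is the "64 to 1" case from the text.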

Quantization

This is where the compression settings come into play. Each coefficient in each 8×8 matrix is divided by a certain number. If you do not want to reduce image quality any further, the divisor should be 1. If the space the photo takes up matters more to you, the divisor will be greater than 1, and the quotient is rounded. After rounding, you often end up with a lot of zeros.

Quantization is done to make even greater compression possible. Here is how it looks when quantizing the graph of y = sin(x):

[Figure: quantization of the graph y = sin(x)]
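A small sketch of this step, with the matching decoder side. The divisors below are illustrative values only, not a table from the article, and the function names are hypothetical. A 2×2 corner is used instead of a full 8×8 matrix to keep the example readable.

```python
def quantize(coeffs, divisors):
    """Divide each coefficient by its divisor and round to an integer.

    Note: Python's round() rounds exact halves to the nearest even integer.
    """
    return [
        [round(c / q) for c, q in zip(c_row, q_row)]
        for c_row, q_row in zip(coeffs, divisors)
    ]


def dequantize(levels, divisors):
    """Decoder side: multiply back. The rounding error is lost for good."""
    return [
        [level * q for level, q in zip(l_row, q_row)]
        for l_row, q_row in zip(levels, divisors)
    ]


coeffs = [[80.0, -362.4], [3.2, -1.9]]
divisors = [[16, 11], [12, 12]]  # illustrative values only

levels = quantize(coeffs, divisors)
print(levels)                        # [[5, -33], [0, 0]]
print(dequantize(levels, divisors))  # [[80, -363], [0, 0]]
```

The small coefficients turn into zeros and cannot be recovered, which is where the loss in "lossy" comes from.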

Compression

First, we go through the matrix in a zigzag:

[Figure: zigzag traversal of the matrix]
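The zigzag order can be generated rather than hard-coded. The sketch below relies on the fact that cells on the same anti-diagonal share the same row + column sum, and that the direction of travel alternates from one diagonal to the next. The function names are hypothetical.

```python
def zigzag_indices(n=8):
    """Return the (row, col) pairs of an n x n matrix in zigzag order."""
    def key(cell):
        row, col = cell
        diagonal = row + col
        # Odd diagonals run top-right to bottom-left (row ascending),
        # even diagonals run bottom-left to top-right (col ascending).
        return (diagonal, row if diagonal % 2 else col)

    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def zigzag(block):
    """Flatten a square block into a list following the zigzag order."""
    return [block[r][c] for r, c in zigzag_indices(len(block))]


print(zigzag_indices()[:6])
# [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
print(zigzag([[1, 2, 6], [3, 5, 7], [4, 8, 9]]))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Low-frequency coefficients come first, so the zeros produced by quantization tend to gather at the end of the list.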

We get a one-dimensional array of numbers. It contains many zeros, and they can be removed. To do this, each run of consecutive zeros is replaced with a single zero followed by the number of zeros in that run. This can shrink the array to about 1/3 of its original size. The array is then compressed with the Huffman method and written to the file.
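The zero-run scheme exactly as described above can be sketched as follows. This is a simplification: the real JPEG entropy coder stores (run length, value size) pairs and an end-of-block marker instead, and the Huffman stage is not shown. The function names are hypothetical.

```python
def encode_zero_runs(values):
    """Replace each run of zeros with the pair: 0, run length."""
    out = []
    i = 0
    while i < len(values):
        if values[i] == 0:
            run = 0
            while i < len(values) and values[i] == 0:
                run += 1
                i += 1
            out += [0, run]
        else:
            out.append(values[i])
            i += 1
    return out


def decode_zero_runs(encoded):
    """Inverse of encode_zero_runs: expand every 0, n pair into n zeros."""
    out = []
    i = 0
    while i < len(encoded):
        if encoded[i] == 0:
            out += [0] * encoded[i + 1]
            i += 2
        else:
            out.append(encoded[i])
            i += 1
    return out


data = [5, -33, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0]
packed = encode_zero_runs(data)
print(packed)                            # [5, -33, 0, 3, 2, 0, 6]
print(decode_zero_runs(packed) == data)  # True
```

Note that an isolated zero grows from one number to two (0, 1), so the scheme only pays off when zeros come in runs, which is what the zigzag order encourages.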

Where to use

Everywhere. Like PNG, JPEG is used in cameras, operating systems (for company logos, application icons, thumbnails), and anywhere else images need to be stored efficiently.

Conclusion

Today, knowledge of how JPEG works is valuable mostly for educational purposes, because the format is already built in everywhere and has been optimized by large groups of people, but the granite of science is still delicious.

Sources

Article about YCbCr on Wikipedia
Wikipedia article on JPEG
A little about the DCT from a post on Pikabu
Article about DCT on Wikipedia

Source: habr.com
