Image Compression

Image compression techniques can be grouped into lossless and lossy methods. Lossless methods preserve exact image data content. Compression ratios are of the order of 3:1. Lossy methods maintain an arbitrary level of image fidelity, and can produce compression ratios of 100:1 or more.

Compression symmetry is a measure of how much time and computational effort is required for encoding (compression) versus decoding (decompression). Quick image decoding (but slow encoding) is used when an image is transported to many different users, each of whom must decompress it. Quick image encoding (but slow decoding) is appropriate when many images are stored, but only rarely, if ever, viewed. This is the case in image archival applications (like storing the image of bank checks, or other paper records).

Lossless Methods

Run-Length Coding

Bitplane Coding

Image files can be divided into individual bitplanes or further processed using gray codes.

          bitplane  graycoded
 plane        size       size
   7         2,071      2,071 (same image)
   6         4,160      3,462
   5         5,993      4,952
   4         7,880      7,214
   3         8,487      7,461
   2         9,166      8,040
   1         9,709      9,172
   0         9,791      9,701
  total     57,257     52,073

original    47,948

The table shows the relative complexity of each plane, as measured by the compressed file size. However, there appears to be no advantage to encoded the bitplanes separately.

image files were encoded using gzip -9. Each image file had a 128 byte header.

Huffman Coding

Predictive Coding

The fundamental form of predictive image coding is differential pulse code modulation (DPCM). The first pixel in a line is coded exactly as it is. For the next pixel, however, the value of the preceding pixel is subtracted from the current pixel, and the difference is encoded.

73
73 8 bits
48 48-73 -25 6 bits
76 76-48 28 6 bits
56 56-76 -20 6 bits
83 83-56 27 6 bits
40 bits

32 bits

In the above example, using a 6-bit code, there could be conditions where the difference between two adjacent pixels is greater than 31 or less than -32. This is called a code overload condition. Commonly a code (such as -32) is reserved to indicate the overload case. Then the following 8 bits contain the actual pixel value instead of the differential value.

Pixel Block (or String) Coding (LZW)

Lempel-Ziv-Welch is the basis of zip and compress schemes.

GZIP.EXE 1.2.4 (18 Aug 93)

usage: GZIP.EXE [-acdfhlLnNtvV19] [-S suffix] [file ...]
 -a --ascii       ascii text; convert end-of-lines using local conventions
 -c --stdout      write on standard output, keep original files unchanged
 -d --decompress  decompress
 -f --force       force overwrite of output file and compress links
 -h --help        give this help
 -l --list        list compressed file contents
 -L --license     display software license
 -n --no-name     do not save or restore the original name and time stamp
 -N --name        save or restore the original name and time stamp
 -q --quiet       suppress all warnings
 -S .suf  --suffix .suf     use suffix .suf on compressed files
 -t --test        test compressed file integrity
 -v --verbose     verbose mode
 -V --version     display version number
 -1 --fast        compress faster
 -9 --best        compress better
 file...          files to (de)compress. If none given, use standard input.

Maintained by John Loomis, last updated July 30, 1997