Figure: Sample of a PDF417 symbol

PDF417 is a stacked linear barcode format used in a variety of applications such as transport, identification cards, and inventory management. "PDF" stands for Portable Data File, while "417" signifies that each pattern in the code consists of 4 bars and 4 spaces in a pattern that is 17 units (modules) long.

The PDF417 symbology was invented by Dr. Ynjiun P. Wang at Symbol Technologies in 1991. It is defined in ISO/IEC 15438.

Design

Figure: Components of a PDF417 barcode

The PDF417 bar code (also called a symbol) consists of 3 to 90 rows, each of which is like a small linear bar code. Each row has:

  • A quiet zone. This is a mandated minimum amount of white space before the bar code begins.
  • A start pattern which identifies the format as PDF417.
  • A "row left" codeword containing information about the row (such as the row number and error correction level).
  • 1–30 data codewords. A codeword is a group of bars and spaces representing one or more numbers, letters, or other symbols.
  • A "row right" codeword with more information about the row.
  • A stop pattern.
  • Another quiet zone.

All rows are the same width; each row has the same number of codewords.
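The row layout above implies a fixed width for every row. The following is a minimal sketch of that arithmetic, assuming the start pattern is 17 modules wide and the stop pattern 18 modules wide; neither figure is stated in the text, and the function name `row_width_modules` is hypothetical.

```python
# Width in modules of each part of a row. The 17-module codeword width comes
# from the symbology's name; the start and stop widths are assumptions.
CODEWORD_MODULES = 17
START_MODULES = 17  # assumption, not stated in the text
STOP_MODULES = 18   # assumption, not stated in the text


def row_width_modules(data_columns):
    """Width of one row in modules, excluding the two quiet zones."""
    if not 1 <= data_columns <= 30:
        raise ValueError("a row holds 1 to 30 data codewords")
    indicators = 2 * CODEWORD_MODULES  # "row left" and "row right" codewords
    data = data_columns * CODEWORD_MODULES
    return START_MODULES + indicators + data + STOP_MODULES


for columns in (1, 10, 30):
    print(columns, "data columns:", row_width_modules(columns), "modules")
```

Because every row has the same number of codewords, this one value is the width of the whole symbol between its quiet zones.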

Codewords

PDF417 uses a base 929 encoding. Each codeword represents a number from 0 to 928.

The codewords are represented by patterns of dark (bar) and light (space) regions. Each of these patterns contains four bars and four spaces; this is where the 4 in the name comes from. The total width is 17 times the width of the narrowest allowed vertical bar (the X dimension); this is where the 17 in the name comes from. Each pattern starts with a bar and ends with a space.
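The structure of a pattern can be illustrated with a short sketch that renders eight element widths as a string of 17 modules. The helper name `render_pattern` and the example widths are hypothetical; the widths are chosen only to satisfy the constraints described above and are not taken from the standard's codeword table.

```python
from math import comb


def render_pattern(widths):
    """Render eight element widths (bar, space, bar, ...) as a module string.

    '1' stands for a dark module and '0' for a light one.
    """
    if len(widths) != 8:
        raise ValueError("expected 4 bars and 4 spaces")
    if any(w < 1 for w in widths):
        raise ValueError("every bar and space is at least 1 module wide")
    if sum(widths) != 17:
        raise ValueError("widths must total 17 modules")
    # Even positions are bars, odd positions are spaces, so the pattern
    # starts with a bar and ends with a space.
    return "".join(("1" if i % 2 == 0 else "0") * w for i, w in enumerate(widths))


example = (3, 1, 1, 2, 2, 1, 4, 3)  # illustrative widths, not a real codeword
print(render_pattern(example))

# Ways to split 17 modules into 8 elements of at least 1 module each.
total_patterns = comb(17 - 1, 8 - 1)
print(total_patterns)  # 11440
```

The count shows that the 4-bar, 17-module layout has far more possible patterns than the 929 codeword values require.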

The row height must be at least 3 times the minimum width: Y ≥ 3X.

A PDF417 symbol is read by sweeping linear scans across it. Those scans need the left and right columns with the start and stop patterns. Each scan also needs to know which row it is scanning, so each row of the symbol must encode its row number. Furthermore, the reader's scan line will not stay within a single row; it typically starts in one row, then crosses over to a neighbor, and may continue across successive rows. To minimize the effect of these crossings, the PDF417 modules are tall and narrow: the height is typically three times the width. Each codeword must also indicate which row it belongs to, so that crossovers can be detected when they occur. The codewords are also designed to be delta-decodable, so some codewords are redundant.

Each PDF417 data codeword represents about 10 bits of information (log₂(900) ≈ 9.8), but the printed codeword (character) is 17 modules wide. With a height of 3 modules, a PDF417 codeword takes 51 square modules to represent 10 bits. That area does not count other overhead such as the start, stop, row, format, and ECC information.
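The figure of roughly 9.8 bits per data codeword can be made concrete. The sketch below packs six bytes (48 bits) into five values in the range 0 to 899 by base conversion, which is how byte compaction in PDF417 is commonly described; that scheme is an assumption not taken from the text above, and the helper names `pack6` and `unpack5` are hypothetical.

```python
import math

DATA_VALUES = 900  # data codeword values 0..899, matching log2(900) in the text


def pack6(chunk):
    """Pack exactly six bytes into five base-900 digits, most significant first."""
    if len(chunk) != 6:
        raise ValueError("expected exactly 6 bytes")
    n = int.from_bytes(chunk, "big")
    digits = []
    for _ in range(5):
        n, d = divmod(n, DATA_VALUES)
        digits.append(d)
    return digits[::-1]


def unpack5(codewords):
    """Inverse of pack6: rebuild the six bytes from five base-900 digits."""
    n = 0
    for cw in codewords:
        n = n * DATA_VALUES + cw
    return n.to_bytes(6, "big")


print(round(math.log2(DATA_VALUES), 2), "bits per data codeword")
codewords = pack6(b"PDF417")
print(codewords, unpack5(codewords))
```

Five digits suffice because 256^6 is smaller than 900^5, so 48 bits fit into the roughly 49 bits that five data codewords can carry.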

Other 2D codes, such as DataMatrix and QR Code, are decoded with image sensors instead of uncoordinated linear scans. Those codes still need recognition and alignment patterns, but the patterns do not need to be as prominent. In such a code, an 8-bit codeword takes 8 square modules (ignoring recognition, alignment, format, and ECC information).

In practice, a PDF417 symbol takes about four times the area of a DataMatrix or QR Code.
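The area figures quoted above can be checked with a few lines of arithmetic. This is only a rough sketch of the raw codeword density; it ignores the overhead on both sides (start, stop, row, recognition, alignment, format, and ECC information), and the variable names are illustrative.

```python
import math

# PDF417: one codeword is 17 modules wide and at least 3 modules tall.
pdf417_area = 17 * 3                    # 51 square modules
pdf417_bits = math.log2(900)            # about 9.8 bits per data codeword
pdf417_modules_per_bit = pdf417_area / pdf417_bits

# Matrix codes: an 8-bit codeword occupies 8 square modules.
matrix_modules_per_bit = 8 / 8

ratio = pdf417_modules_per_bit / matrix_modules_per_bit
print(f"{pdf417_modules_per_bit:.1f} modules per bit, ratio {ratio:.1f}")
```

The raw ratio comes out at a little over 5, which is of the same order as the factor of about four observed in practice once each symbology's overhead is included.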

Features

In addition to features typical of two-dimensional bar codes, PDF417's capabilities include:

  • Linking. PDF417 symbols can link to other symbols that are scanned in sequence, allowing even more data to be stored.
  • User-specified dimensions. The user can decide how wide the narrowest vertical bar (X dimension) is, and how tall the rows are (Y dimension).
  • Public domain format. Anyone can implement systems using this format without any license.

The introduction of the ISO/IEC document states: