A TIFF File Format Primer

There are 5 popular TIFF compression types:

TIFF Group 3 – originally introduced in the early 1980s. Popularized as the standard format for sending fax images through fax cards. Encodes images using Group 3 (Modified Huffman) compression, as defined by the CCITT and ITU fax standards, with each page defined separately in the TIFF file. Offers very good compression of text and good compression of bi-tonal images. Used extensively as a storage format for scanned black and white documents. Has the additional advantage that if one bit changes in transmission, you lose only the rest of that line, not the whole image.
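As a rough illustration of why this scheme suits text pages, the sketch below shows only the first step of one-dimensional Group 3 coding: turning a scanline into alternating white and black run lengths. It does not produce the actual Huffman code words from the fax standards. The function name and the convention that 0 means white and 1 means black are assumptions made for this example.

```python
from itertools import groupby

def scanline_runs(pixels):
    """Return alternating white/black run lengths for one bi-tonal scanline.

    pixels: list of 0 (white) and 1 (black) values (an assumed convention).
    The result always starts with a white run, so a line that begins
    with a black pixel gets a leading white run of length 0.
    """
    runs = [len(list(group)) for _, group in groupby(pixels)]
    if pixels and pixels[0] == 1:
        runs.insert(0, 0)
    return runs

line = [0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0]
print(scanline_runs(line))  # [4, 2, 3, 4, 3]
```

A mostly white line of text collapses into a handful of run lengths, and because each line is coded on its own in this mode, a damaged bit cannot affect the lines that follow.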

TIFF Group 4 – popularized as the TIFF fax file format for next-generation G4 fax cards and G4 fax machines. First caught on in Japan. Provides extremely good compression of text, but is less successful at compressing bi-tonal images (compressed file sizes can end up larger than Group 3 files). Used extensively as a storage format for scanned black and white documents. Used by PDF files as the compression method for bi-tonal images. Because each line is coded relative to the line before it, if one bit changes in transmission or storage, you lose the rest of the image.

TIFF PackBits or RLE – the original TIFF compression format for bi-tonal and color images. Compression results are similar to those of PCX files. Compression of 24-bit color images is lossless, but the files are very large.
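To make the run-length idea concrete, here is a minimal sketch of PackBits coding on a byte string. Each control byte either announces a literal run (values 0 to 127, copy the next n + 1 bytes) or a repeat run (values 129 to 255, repeat the next byte 257 - n times); 128 is a no-op. The function names are illustrative, and the sketch ignores TIFF details such as packing each image row separately.

```python
def packbits_encode(data):
    """Encode bytes with PackBits: repeat runs of 2..128, literal runs of 1..128."""
    out = bytearray()
    i = 0
    while i < len(data):
        # Measure the run of identical bytes starting at i (capped at 128).
        run = 1
        while i + run < len(data) and run < 128 and data[i + run] == data[i]:
            run += 1
        if run >= 2:
            out += bytes([257 - run, data[i]])
            i += run
        else:
            # Collect literal bytes until a repeat starts or 128 are gathered.
            start = i
            i += 1
            while (i < len(data) and i - start < 128
                   and not (i + 1 < len(data) and data[i] == data[i + 1])):
                i += 1
            out += bytes([i - start - 1]) + data[start:i]
    return bytes(out)

def packbits_decode(data):
    """Decode a PackBits byte string back to the original bytes."""
    out = bytearray()
    i = 0
    while i < len(data):
        n = data[i]
        i += 1
        if n < 128:        # literal run: copy the next n + 1 bytes
            out += data[i:i + n + 1]
            i += n + 1
        elif n > 128:      # repeat run: next byte repeated 257 - n times
            out += bytes([data[i]]) * (257 - n)
            i += 1
        # n == 128 is a no-op and is skipped
    return bytes(out)

row = b"\x00" * 40 + b"\x12\x34\x56" + b"\xff" * 10
packed = packbits_encode(row)
assert packbits_decode(packed) == row
print(len(row), "->", len(packed))  # 53 -> 8
```

Long flat runs shrink dramatically, while noisy data such as 24-bit photographs mostly falls into literal runs and can even grow slightly, which is why those files stay large.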

TIFF JPEG – stores multiple 24-bit compressed images in one file. Images are compressed using the JPEG compression algorithm, but files can still be very large. File size can be reduced by increasing the ‘lossiness’ of the compression. As file size is reduced, compression artifacts start to appear when the image is re-expanded. Every time you cycle through save / load / save, the quality deteriorates.

TIFF LZW – ideal compression for 256-color images. Was subject to a patent held by Unisys (now expired). Compression is similar to 8-bit color PNG.
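The dictionary-building idea behind LZW can be sketched in a few lines. This is a generic, simplified version that emits a list of integer codes; it is not the exact TIFF variant, which packs codes into 9 to 12 bits, caps the table size, and uses the reserved codes 256 (Clear) and 257 (EndOfInformation). The function names are illustrative, and starting new entries at 258 is done here only to mirror that reservation.

```python
def lzw_compress(data):
    """Return a list of integer LZW codes for a byte string (simplified sketch)."""
    table = {bytes([i]): i for i in range(256)}
    next_code = 258  # 256 and 257 are left unused, mirroring TIFF's reserved codes
    codes = []
    current = b""
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate          # keep extending the match
        else:
            codes.append(table[current])  # emit longest known string
            table[candidate] = next_code  # learn the new string
            next_code += 1
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes

def lzw_decompress(codes):
    """Rebuild the original bytes from codes produced by lzw_compress."""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 258
    out = bytearray()
    prev = b""
    for code in codes:
        if code in table:
            entry = table[code]
        elif code == next_code and prev:
            entry = prev + prev[:1]       # code refers to the entry being defined
        else:
            raise ValueError("invalid LZW code: %d" % code)
        out += entry
        if prev:
            table[next_code] = prev + entry[:1]
            next_code += 1
        prev = entry
    return bytes(out)

pixels = bytes([1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3])
codes = lzw_compress(pixels)
assert lzw_decompress(codes) == pixels
print(len(pixels), "bytes ->", len(codes), "codes")
```

Palette images with repeating pixel patterns give the dictionary plenty of long strings to reuse, which is why the method does well on 256-color artwork.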

Advantages of using TIFF as a standard document storage type:

  • Virus free compact file format for B&W and color images.
  • Universally supported raster-based file format unlikely to change for the next 100 years.  Well-defined industry standard specification (TIFF Spec).
  • Default file format used by B&W and color scanners.
  • Format compatible with OCR systems, Document Storage Systems, COLD data storage, and Microfiche Storage systems (200 years plus).
  • Compatible with all major FAX card vendors.
  • No patent or copyright licensing issues.
  • Format compatible with Electronic Patent Filing.
  • Suitable as a legal document (very difficult and time-consuming to edit and change).
  • Support for multiple pages per document.
  • What you see is what you get (WYSIWYG).

  • Multiple vendors and multiple platform image viewer support, some of which support Annotation extensions.
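The multi-page support and the per-page compression choice listed above both come from the way a TIFF file is laid out: an 8-byte header points to a chain of image file directories (IFDs), one per page, and each IFD can carry a Compression tag (259). The sketch below walks that chain in a classic TIFF byte string and reports the compression of each page. The function name is illustrative, only a few common tag values are mapped, and BigTIFF files are not handled.

```python
import struct

# Common values of the Compression tag (259); not an exhaustive list.
COMPRESSION_NAMES = {
    1: "Uncompressed",
    2: "CCITT modified Huffman RLE",
    3: "CCITT Group 3",
    4: "CCITT Group 4",
    5: "LZW",
    6: "JPEG (old-style)",
    7: "JPEG",
    32773: "PackBits",
}

def list_pages(data):
    """Return one compression name per page (IFD) in a classic TIFF byte string."""
    order = {b"II": "<", b"MM": ">"}.get(bytes(data[:2]))
    if order is None or struct.unpack_from(order + "H", data, 2)[0] != 42:
        raise ValueError("not a classic TIFF file")
    (offset,) = struct.unpack_from(order + "I", data, 4)
    pages, seen = [], set()
    while offset and offset not in seen:   # 'seen' guards against looping chains
        seen.add(offset)
        (count,) = struct.unpack_from(order + "H", data, offset)
        compression = 1                    # default when tag 259 is absent
        for i in range(count):
            # Each 12-byte entry: tag, type, count, then the value field.
            tag, _type, _n, value = struct.unpack_from(
                order + "HHIH", data, offset + 2 + 12 * i)
            if tag == 259:
                compression = value
        pages.append(COMPRESSION_NAMES.get(compression, "Unknown (%d)" % compression))
        (offset,) = struct.unpack_from(order + "I", data, offset + 2 + 12 * count)
    return pages

# Hand-built two-page skeleton for demonstration only (not a viewable image).
def _ifd(compression, next_offset):
    return (struct.pack("<H", 1)
            + struct.pack("<HHIHH", 259, 3, 1, compression, 0)
            + struct.pack("<I", next_offset))

demo = b"II" + struct.pack("<HI", 42, 8) + _ifd(4, 26) + _ifd(5, 0)
print(list_pages(demo))  # ['CCITT Group 4', 'LZW']
```

For a real file, the same function can be called on the result of reading the file in binary mode; the length of the returned list is the page count.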

Disadvantages of using TIFF as a document storage format:

  • Does not store the text contents of the image, which can be useful for document indexing and text searches.
  • No native display support in Microsoft Internet Explorer (must currently install a plug-in).
  • PDF does a better job of compressing documents if stored in vector format (text searchable).

Other Electronic Document Storage File formats:

PDF – Excellent compression and display attributes.  May require licensing patents from Adobe in order to use commercially (very complex Adobe proprietary file format).  The file format is owned by Adobe and changes every year.  The format requires considerable processing power to display and convert, and writing encoders and decoders for all display platforms can be difficult to impossible.

DOC – Microsoft Word used to be the standard for all file storage and distribution.  The file format changes every year or so and is not easily transportable to other operating systems.  May contain viruses.  Can be quite large.  Cannot be easily locked.

HTML – a moving standard, likely to change radically over the next few years.  Wide variety of commercial viewers.  Can be easily modified.  Requires multiple files, and can be difficult to ‘send’ or ‘store’ without losing pieces of the document.  Difficult to retain formatting across different devices.

XML – rapidly changing standard.  Requires images within documents to still be stored in one of the above file formats in order to be transferable.

EPS – Encapsulated PostScript.  Can be created by doing a ‘print to file’ from any application, including on Mac and UNIX operating systems.  Can also be easily converted to PDF using Acrobat Distiller.  Files tend to be large, and commercial viewers are limited (Ghostscript).  Writing encoders and decoders for all display platforms can be difficult to impossible.

PCL – Standard output file format used by Hewlett Packard printers.  Allows text to be parsed ‘in place’ – useful for forms generation.  Difficult to keep up with later releases (PCL 6).  Proprietary file format.  Limited number of commercial viewers.  Creating encoders and decoders can be expensive.

BMP – very large, un-compressed.  Does not support multiple pages per file. 

PCX – does not support multiple pages per file.  Is not an industry standard.  Not extensible.

DCX – Intel file format for sending fax documents.  Compression not as good as TIFF.  Limited viewer support.  Not extensible.  Black and white support only.

GIF – Popularized as a download format for pictures by CompuServe, and later by all internet browsers.  Uses the LZW compression algorithm, portions of which were patented by Unisys (expired 2003).  Does not support multiple pages.  Does not do a good job compressing 24-bit images (JPEG is much better).

PNG – Portable Network Graphics.  Uses DEFLATE compression, an unpatented alternative to LZW.  Supported by most web browsers, comparable to GIF in capabilities and compression ability.  Still early in its development and acceptance as a standard file format.  Supports 1-bit, 8-bit and 24-bit images.  Does not support multiple pages per file.

JBIG – an alternate compression standard for bi-tonal images that attempts to do better than TIFF G4 compression.  Many patents, most of them held by IBM.

TXT – ASCII or Unicode text.  Very easy to search.  Does not support graphics, or complex formatting information.

DjVu – extremely good compression of color images (comparable to JBIG).  Developed by AT&T, and licensed to LizardTech.  Most documents compress to smaller than the equivalent PDF.  Uses wavelet-based compression for color layers and a separate bi-tonal coder for text layers.  Used extensively for maps, catalogues, and large color documents and books.  Free downloadable browser plug-in.  Not an industry standard format.  Proprietary.