A TIFF File Format Primer

There are 5 popular TIFF compression types:

TIFF Group 3 – originally introduced in the early 1980s. Popularized as the standard format for sending fax images through fax cards. Encodes images using Huffman-based G3 compression, as defined by the CCITT and ITU fax standards, where each page is separately defined in the TIFF file. Offers very good compression of text, and good compression of bi-tonal images. Used extensively as a storage format for scanned black and white documents. Has the additional advantage that if one bit changes in transmission, you only lose the rest of the line, not the whole image.

TIFF Group 4 – popularized as the TIFF fax file format for next generation G4 fax cards and G4 fax machines. First caught on in Japan. Provides extremely good compression of text, but is less successful at compressing bi-tonal images (compressed file sizes can end up larger than Group 3 files). Used extensively as a storage format for scanned black and white documents. Used in PDF files as a compression method for bi-tonal images. If one bit changes in transmission or storage, you lose the rest of the image.

TIFF Packbits or RLE – original TIFF file compression format for bi-tonal and color images. Compression results are similar to PCX files. 24-bit color image compression is lossless, but the files are very large.

TIFF JPEG – multiple 24-bit compressed images in one file. Images are compressed using the JPEG compression algorithm, but files can still be very large. File size can be reduced by increasing the ‘lossiness’ of the compression. As file size is reduced, compression artifacts start to appear when the image is re-expanded. Every time you cycle through save / load / save, the quality deteriorates.

TIFF LZW – well suited to 256-color images. Was subject to a patent held by Unisys (now expired). Compression is similar to 8-bit color PNG.

Advantages of using TIFF as a standard document storage type:
· Multiple vendor and multiple platform image viewer support, some of which support annotation extensions.

Disadvantages of using TIFF as a document storage format:
Other Electronic Document Storage File Formats:

PDF – Excellent compression and display attributes. May require licensing patents from Adobe in order to use commercially (very complex Adobe proprietary file format). The file format is owned by Adobe, and changes every year. As the format requires considerable processing power to display and convert, writing encoders and decoders for all display platforms can be difficult to impossible.

DOC – Microsoft Word used to be the standard for all file storage and distribution. The file format changes every year or so, and is not easily transportable to other operating systems. May contain viruses. Can be quite large. Cannot be easily locked.

HTML – moving standard, likely to change radically over the next few years. Wide variety of commercial viewers. Can be easily modified. Requires multiple files, and can be difficult to ‘send’ or ‘store’ without losing pieces of the document. Difficult to retain formatting across different devices.

XML – rapidly changing standard. Requires images within documents to still be stored in one of the above file formats in order to be transferable.

EPS – Encapsulated PostScript. Can be created by doing a ‘print to file’ from any application, including on Mac and UNIX operating systems. Can also be easily converted to PDF using the Acrobat Distiller. Files tend to be large, and commercial viewers are limited (Ghostscript). Writing encoders and decoders for all display platforms can be difficult to impossible.

PCL – Standard output file format used by Hewlett-Packard printers. Allows text to be parsed ‘in place’, which is useful for forms generation. Difficult to keep up with later releases (PCL 6). Proprietary file format. Limited number of commercial viewers. Creating encoders and decoders can be expensive.

BMP – very large, uncompressed. Does not support multiple pages per file.

PCX – does not support multiple pages per file. Is not an industry standard. Not extensible.

DCX – Intel file format for sending fax documents. Compression not as good as TIFF. Limited viewer support. Not extensible. Black and white support only.

GIF – Popularized as a download format for pictures by CompuServe, and later by all internet browsers. Uses the LZW compression algorithm, portions of which were patented by Unisys (expired 2003). Does not support multiple pages. Does not do a good job compressing 24-bit images (JPEG is much better).

PNG – Portable Network Graphics file. Uses the patent-free DEFLATE compression method in place of LZW. Supported by most web browsers, comparable to GIF in capabilities and compression ability. Still early in its development and acceptance as a standard file format. Supports 1-bit, 8-bit and 24-bit images. Does not support multiple pages per file.

JBIG – an alternate compression standard for bi-tonal images that attempts to do better than TIFF G4 compression. Many patents, most of them held by IBM.

TXT – ASCII or Unicode text. Very easy to search. Does not support graphics or complex formatting information.

DjVu – extremely good compression of color images (comparable to JBIG). Developed by AT&T, and licensed to LizardTech. Most documents compress to smaller than the equivalent PDF. Uses wavelet-based compression for color image layers. Used extensively for maps, catalogues, and large color documents and books. Free downloadable browser plug-in. Not an industry standard format. Proprietary.
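
When sorting a collection of stored TIFF documents, it can help to check which of the compression types described in the TIFF section a given file actually uses. The following is a minimal sketch, assuming a classic (non-BigTIFF) file and looking only at the first page. The names `first_page_compression` and `COMPRESSION_NAMES` are illustrative and not part of any library. The tag number (259) and the value codes come from the TIFF 6.0 specification and its common extensions.

```python
import struct

# Values of the TIFF Compression tag (tag 259).
COMPRESSION_NAMES = {
    1: "Uncompressed",
    2: "CCITT modified Huffman RLE",
    3: "CCITT Group 3 fax",
    4: "CCITT Group 4 fax",
    5: "LZW",
    6: "JPEG (old style)",
    7: "JPEG",
    32773: "PackBits",
}


def first_page_compression(data: bytes) -> str:
    """Return the compression name of the first image directory in a classic TIFF."""
    if data[:2] == b"II":
        endian = "<"  # little-endian ("Intel") byte order
    elif data[:2] == b"MM":
        endian = ">"  # big-endian ("Motorola") byte order
    else:
        raise ValueError("not a TIFF file")

    magic, ifd_offset = struct.unpack_from(endian + "HI", data, 2)
    if magic != 42:
        raise ValueError("not a classic TIFF file")

    (entry_count,) = struct.unpack_from(endian + "H", data, ifd_offset)
    for i in range(entry_count):
        entry_offset = ifd_offset + 2 + 12 * i  # each directory entry is 12 bytes
        tag, field_type, count = struct.unpack_from(endian + "HHI", data, entry_offset)
        if tag == 259:
            # A single SHORT is stored in the first two bytes of the value field.
            (value,) = struct.unpack_from(endian + "H", data, entry_offset + 8)
            return COMPRESSION_NAMES.get(value, f"Unknown ({value})")

    # The specification defines "no compression" as the default when the tag is absent.
    return COMPRESSION_NAMES[1]


# Usage: a hand-built header plus a one-entry directory declaring Group 4.
# This is only enough of a file to exercise the parser, not a viewable image.
sample = (
    b"II" + struct.pack("<HI", 42, 8)
    + struct.pack("<H", 1)
    + struct.pack("<HHIHH", 259, 3, 1, 4, 0)
    + struct.pack("<I", 0)
)
print(first_page_compression(sample))  # prints "CCITT Group 4 fax"
```

For a real file, the same function can be given the result of `open(path, "rb").read()`. A multi-page TIFF can use a different compression on each page, so a fuller tool would follow the next-directory offset stored after the last entry.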
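
To make the Packbits (RLE) entry in the TIFF section more concrete, here is a small sketch of the byte-oriented PackBits scheme. A header byte of 0 to 127 means "copy the next n + 1 bytes literally", a header byte of 129 to 255 means "repeat the next byte 257 - n times", and 128 is skipped. The function names are illustrative only, and the encoder is a simple greedy one rather than an optimal or production implementation.

```python
def packbits_decode(data: bytes) -> bytes:
    """Expand a PackBits-compressed byte string."""
    out = bytearray()
    i = 0
    while i < len(data):
        n = data[i]
        i += 1
        if n < 128:
            # Literal run: copy the next n + 1 bytes unchanged.
            out += data[i:i + n + 1]
            i += n + 1
        elif n > 128:
            # Repeat run: the next byte is repeated 257 - n times.
            out += data[i:i + 1] * (257 - n)
            i += 1
        # n == 128 is a no-op and is skipped.
    return bytes(out)


def packbits_encode(data: bytes) -> bytes:
    """Compress bytes with a simple greedy PackBits encoder."""
    out = bytearray()
    i = 0
    while i < len(data):
        # Measure the run of identical bytes starting at i (at most 128 long).
        run = 1
        while i + run < len(data) and run < 128 and data[i + run] == data[i]:
            run += 1
        if run >= 2:
            out.append(257 - run)
            out.append(data[i])
            i += run
        else:
            # Collect literal bytes until a repeat begins or 128 bytes are gathered.
            start = i
            i += 1
            while i < len(data) and i - start < 128:
                if i + 1 < len(data) and data[i] == data[i + 1]:
                    break
                i += 1
            out.append(i - start - 1)
            out += data[start:i]
    return bytes(out)


# Usage: one row of a mostly blank scanned page packs down to a few bytes.
row = b"\xff" * 40 + b"\x12\x34" + b"\x00" * 30
packed = packbits_encode(row)
assert packbits_decode(packed) == row
print(len(row), "bytes ->", len(packed), "bytes")
```

In a TIFF file each row of the image is packed separately, so an encoder along these lines would be called once per row. Long uniform runs, as in scanned text pages, compress well, while noisy 24-bit color data gains little, which is consistent with the large file sizes noted for Packbits color images.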