TIFF File Format Specifications
TIFF (Tagged Image File Format) is a general-purpose, tag-based file format for storing and interchanging raster images. A TIFF file consists of a header, one or more Image File Directories (IFDs), and the image data. A file can contain more than one page; there is one IFD and one image data block for each image.

The header identifies the file as a TIFF file and specifies the byte order of the data and the location of the first IFD. Each IFD contains the number of tag entries, the tag entries themselves, and the location of the next IFD, if any. The tag entries store information about the image data. Each tag entry contains the following fields: a TIFF tag that identifies the field, the field type, the number of values (the length), and the field value.

The format of the header is as follows:

typedef struct {

The byte-order field at the start of the header is "II" for the Intel format or "MM" for the Motorola format. For Motorola formats, all Short and Long values stored in the tags are byte reversed from the Intel format. In the Motorola format, numbers are stored in "big-endian" fashion, with the most-significant byte earlier in the file, similar to the way Motorola processors store information in memory. In the Intel format, numbers are stored in "little-endian" fashion, with the least-significant byte earlier in the file. Most readers can handle either format. The contents of the image do not change.

The IFD contains a number of tags. The tag format is as follows:

typedef struct tiff_tag {

The length value in the tag indicates the number of values in the tag. If all of the values specified in the tag fit into 4 bytes total, for instance if two WORD values or one LONG value are defined, they are stored in the value element of the structure. Otherwise, the value element of the structure contains the offset in the file where these values may be found.

The following tags are defined for class F TIFF files:
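As a rough illustration of the header and tag layout described above, the following C sketch reads the 8-byte header and individual 12-byte tag entries from a file already loaded into memory. The field sizes follow the TIFF 6.0 specification; the names `tiff_entry`, `tiff_read_header`, `tiff_read_entry` and `tiff_next_ifd` are hypothetical and are not the names used in the original listings. Only a single inline SHORT value is unpacked specially; everything else is returned as the raw 4-byte value or offset.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative in-memory view of one 12-byte tag entry (names are assumptions). */
typedef struct {
    uint16_t tag;    /* TIFF tag number that identifies the field */
    uint16_t type;   /* field type, e.g. 3 = SHORT (WORD), 4 = LONG */
    uint32_t count;  /* number of values (the "length") */
    uint32_t value;  /* the value itself, or a file offset if it does not fit in 4 bytes */
} tiff_entry;

/* Read a 16-bit value in the byte order announced by the header. */
static uint16_t rd16(const uint8_t *p, int big_endian)
{
    return big_endian ? (uint16_t)((p[0] << 8) | p[1])
                      : (uint16_t)((p[1] << 8) | p[0]);
}

/* Read a 32-bit value in the byte order announced by the header. */
static uint32_t rd32(const uint8_t *p, int big_endian)
{
    return big_endian
        ? ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) | ((uint32_t)p[2] << 8) | p[3]
        : ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) | ((uint32_t)p[1] << 8) | p[0];
}

/* Parse the header: "II" or "MM", the number 42, then the first IFD offset.
   Returns 0 on success, -1 if the buffer is not a TIFF header. */
int tiff_read_header(const uint8_t *buf, size_t len, int *big_endian, uint32_t *first_ifd)
{
    if (len < 8) return -1;
    if (buf[0] == 'I' && buf[1] == 'I') *big_endian = 0;       /* Intel, little-endian */
    else if (buf[0] == 'M' && buf[1] == 'M') *big_endian = 1;  /* Motorola, big-endian */
    else return -1;
    if (rd16(buf + 2, *big_endian) != 42) return -1;
    *first_ifd = rd32(buf + 4, *big_endian);
    return 0;
}

/* Read tag entry number `index` of the IFD that starts at offset `ifd`. */
int tiff_read_entry(const uint8_t *buf, size_t len, uint32_t ifd, int big_endian,
                    unsigned index, tiff_entry *out)
{
    if ((size_t)ifd + 2 > len) return -1;
    unsigned n = rd16(buf + ifd, big_endian);       /* number of tag entries */
    if (index >= n) return -1;
    size_t pos = (size_t)ifd + 2 + 12 * (size_t)index;
    if (pos + 12 > len) return -1;
    out->tag   = rd16(buf + pos, big_endian);
    out->type  = rd16(buf + pos + 2, big_endian);
    out->count = rd32(buf + pos + 4, big_endian);
    /* A single SHORT sits in the first two bytes of the 4-byte value field. */
    if (out->type == 3 && out->count == 1)
        out->value = rd16(buf + pos + 8, big_endian);
    else
        out->value = rd32(buf + pos + 8, big_endian);
    return 0;
}

/* Fetch the offset of the next IFD (0 means there are no more pages). */
int tiff_next_ifd(const uint8_t *buf, size_t len, uint32_t ifd, int big_endian, uint32_t *next)
{
    if ((size_t)ifd + 2 > len) return -1;
    size_t pos = (size_t)ifd + 2 + 12 * (size_t)rd16(buf + ifd, big_endian);
    if (pos + 4 > len) return -1;
    *next = rd32(buf + pos, big_endian);
    return 0;
}
```

A reader would typically call `tiff_read_header` once, then walk the entries of each IFD and follow `tiff_next_ifd` until it yields 0. Working on a memory buffer with explicit bounds checks, rather than casting file bytes to a struct, avoids depending on the host's byte order and struct padding.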
TIFF files intended for FAX transmission are normally written Bits reversed, Byte Align EOL, Group 3 format (either 2 or 3). Page numbers may or may not be present. If present, they start at 0. The Resolution unit is always Inches. There may be only one strip per page, or strips may be limited to about 8 KB per strip. Lines are sometimes padded (to minimum line lengths), resulting in extra fill (sequences of binary 0s) at the end of some lines before the terminating EOL. The start of the first line in a strip must contain an EOL. Some formats contain 6 EOLs at the end of the page. Six terminating EOLs indicate the end of a page when transmitting, but have no use in a TIFF file.

There is currently no support for embedding annotations in a TIFF file.

Compression formats:
Group 3 (MH) - Each line consists of token strings representing alternate run-lengths of white and black. Each line is terminated with an EOL (defined as eleven 0 bits followed by one 1 bit). If byte aligned, the EOL ends on a byte boundary. If there is a transmission error, the viewer skips to the next valid EOL to find the start of the next line.

Group 3 2D (MR) - After each MH line, there are up to K differences lines. For standard resolution, K=1; for fine resolution, K=3. Each line starts with a bit indicating what type of line it is: an MH line or a differences line. If there is a transmission error, the viewer skips to the next valid EOL. Errors will not be fully corrected until the next MH line is encountered.

Group 4 (MMR) - All lines are differences from the line above. The line above the first line is assumed to be all white. There are no EOLs, and lines are not byte aligned. If there is an error anywhere in the page, all subsequent lines are invalid (until an 'all White' or 'all Black' line is encountered).

The full TIFF specification (version 6, 1992) is available from Adobe. You will need Acrobat Reader to view it.
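The error recovery described for Group 3 data, skipping ahead to the next valid EOL, can be sketched in C as follows. This is a minimal illustration, not a decoder: it treats any run of eleven or more 0 bits followed by a 1 bit as an EOL, so padded lines and byte-aligned EOLs are found the same way. The names `find_eol`, `get_bit` and `NO_EOL` are hypothetical, and the `lsb_first` flag is assumed to correspond to the "Bits reversed" layout mentioned above.

```c
#include <stddef.h>
#include <stdint.h>

#define NO_EOL ((size_t)-1)   /* returned when no EOL is found */

/* Fetch one bit of the stream. With lsb_first set, the first bit of the
   stream is the least-significant bit of each byte (bits reversed). */
static int get_bit(const uint8_t *data, size_t bit, int lsb_first)
{
    unsigned shift = lsb_first ? (unsigned)(bit & 7) : 7u - (unsigned)(bit & 7);
    return (data[bit >> 3] >> shift) & 1;
}

/* Scan from start_bit for an EOL: at least eleven 0 bits, then a 1 bit.
   Returns the index of the first bit after the EOL, or NO_EOL. */
size_t find_eol(const uint8_t *data, size_t nbytes, size_t start_bit, int lsb_first)
{
    size_t total = nbytes * 8;
    size_t zeros = 0;                       /* length of the current run of 0 bits */

    for (size_t bit = start_bit; bit < total; bit++) {
        if (get_bit(data, bit, lsb_first)) {
            if (zeros >= 11)
                return bit + 1;             /* next line starts here */
            zeros = 0;                      /* run too short, not an EOL */
        } else {
            zeros++;
        }
    }
    return NO_EOL;
}
```

After a decoding error, a viewer along these lines would call `find_eol` with the current bit position and resume decoding at the returned index. For 2D data, the bit at that index is the one that indicates the type of the following line.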