Digitization Services

Topic Covered

The digitization of materials in digital collections must follow appropriate digitization standards. This page offers an introduction to digitization, provides links to resources containing additional information, and describes the recommended digitization parameters for digitizing materials.

Digital collections can contain a variety of digital media, including text, audio, still images, and video. C.L. Wilson Library has adopted minimal guidelines for imaging and audio digitization specifications. Adherence to these guidelines ensures the quality, consistency, and longevity of these valuable resources.

Audio

Audio Digitization
Guidelines for Audio Presentation
Guidelines for Audio Preservation

Text

Text Digitization
Guidelines for Text Images
Guidelines for Creating Accessible PDFs

Images

Image Digitization
Guidelines for Slides or Negative Film
Guidelines for Images
 

Video

Video Preservation
Guidelines for Born-Digital Video Preservation
Guidelines for Video Preservation
Guidelines for Video Presentation

 

Text Digitization

Text materials include printed matter such as books, magazines, and newspapers, as well as handwritten or typed original manuscripts, letters, notes, and other documents. For the purposes of digitization, “text” refers to any manifestation of words affixed to a physical carrier, paper or otherwise.

Depending on the purpose of the collection, different approaches to digitizing text content may be used. In some cases, libraries may only be interested in the information that the text conveys, and the medium of expression is irrelevant. However, in most collections, it is desirable to capture not only the information within the text content itself, but also the visual aspects of the text, such as type, formatting, layout, or paper quality. Text is also often accompanied by image content such as line drawings, photographs, graphic illustrations, music scores, blueprints, and plans.

Due to this dual nature, the digitization of texts is very similar to the digitization of image content. To facilitate full-text searching or indexing of the actual text content, additional steps must be taken so that the text can be rendered machine-readable. Text materials also have a further complication in that they are often made up of many pages (as in the case of a book) or may have multiple articles on a single page (such as a newspaper). Decisions must be made as to what unit constitutes a “work”—a single page? an individual article? an entire issue or volume?—and the digitization process should be carried out accordingly.

Creating Digital Images

In the most basic approach, the physical medium to which the text is affixed is scanned to create a digital image that reproduces the content of the work. While the digitized facsimile conveys all of the visual information contained in a text, a digital image does not allow the text to be indexed and searched; additional steps must be taken to provide this functionality.

Image Basics

A digital image is a two-dimensional array of small square regions known as pixels. For each pixel, the digital image file contains numeric values about color and brightness. There are three basic types of digital images:

  • bitonal (monochrome) - each pixel is either black or white; there is no gradation.
  • grayscale - each pixel contains a value in the range from 0 to 255, where 0 represents black, 255 represents white, and values in between represent shades of gray.
  • color - each pixel contains a triple of numeric values, one for each of the primary colors red, green, and blue (RGB), where 0 indicates that none of that primary color is present in the pixel and 255 indicates the maximum amount of that primary color.

Bit-depth or color depth refers to the amount of detail that is used to make the measurements of color and brightness. (It can be thought of as the number of marks on a ruler.) A higher bit depth indicates a greater level of detail that is captured about the image. Most digital images are 8-bit, 16-bit, or 24-bit.

The resolution of a digital image is measured in pixels per inch (ppi, also commonly referred to as dpi, or dots per inch). The higher the ppi, the greater the resolution and detail that will be captured.
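The relationship between physical size, ppi, and pixel dimensions can be sketched in a few lines of Python. This is an illustrative calculation only; the function name `pixel_dimensions` and the letter-size page used as input are assumptions made for the example, not part of the guidelines.

```python
def pixel_dimensions(width_in: float, height_in: float, ppi: int) -> tuple[int, int]:
    """Return (width, height) in pixels for an original scanned at `ppi`."""
    return round(width_in * ppi), round(height_in * ppi)


# An 8.5 x 11 inch page (assumed example) at two scanning resolutions.
letter_300 = pixel_dimensions(8.5, 11, 300)
letter_600 = pixel_dimensions(8.5, 11, 600)

print(letter_300)  # (2550, 3300)
print(letter_600)  # (5100, 6600)
```

Doubling the ppi doubles each dimension, so the total pixel count grows by a factor of four.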

Scanning Basics

Due to the wide variety of scanners and scanning software available, a comprehensive discussion of best practices for scanner operation is not possible in this guide. “The Art of Scanning” by Paul Royster provides a solid introduction to scanning and image editing techniques for text-based and image-based digital collections.

Scanners generally offer three different modes of image capture, which correspond to the three types of digital images: black-and-white, grayscale, and color:

  • Black-and-White (aka bitonal or monochrome) - One bit per pixel representing black or white. This mode is best suited to high-contrast documents such as printed black-and-white text, line art, or illustrations.
  • Grayscale - Multiple bits per pixel representing shades of gray. Grayscale is best suited to older documents with poor legibility or diffuse characters (e.g. carbon copies, Thermofax/Verifax, etc.), handwritten documents, items with low inherent contrast between the text and background, stained or faded materials, and works with halftone illustrations or photographs accompanying the text.
  • Color - Multiple bits per pixel representing color. Color scanning is best suited to materials containing color information, such as an illuminated manuscript or other documents where the color and texture of the paper is an important part of the work.

Scanning in color will produce the largest file sizes (in terms of bytes), grayscale the second largest, and bitonal the smallest. Libraries should choose the mode that best suits the material. If there is no advantage to scanning in grayscale or color, then bitonal mode is acceptable, assuming there is no significant loss of information. Master copies can also be created in color or grayscale and then converted to bitonal for access images.
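The difference in file size between the three modes follows directly from the number of bits stored per pixel. The sketch below estimates uncompressed sizes for one page; it assumes 1, 8, and 24 bits per pixel for bitonal, grayscale, and color, and it ignores file headers and row padding, so real files will differ slightly. The page dimensions are an assumed example.

```python
# Assumed bits per pixel for each scanning mode.
BITS_PER_PIXEL = {"bitonal": 1, "grayscale": 8, "color": 24}


def uncompressed_bytes(width_px: int, height_px: int, mode: str) -> int:
    """Estimate raw image data size in bytes (no header, no compression)."""
    bits = width_px * height_px * BITS_PER_PIXEL[mode]
    return (bits + 7) // 8  # round up to whole bytes


# A 2550 x 3300 pixel page (8.5 x 11 inches at 300 ppi, assumed example).
sizes = {mode: uncompressed_bytes(2550, 3300, mode) for mode in BITS_PER_PIXEL}

for mode, size in sizes.items():
    print(f"{mode:9s} {size / 1_000_000:6.2f} MB")
```

Under these assumptions a grayscale scan is eight times the size of a bitonal scan, and a color scan is three times the size of a grayscale scan.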

Creating Images

For each object or page being scanned or photographed, a high-resolution master or archival file should be created. From that master file, lower-resolution derivative files will be created that are better suited to be delivered and viewed online or compiled into a file containing all the pages of a work.

The chart below describes the differences between master images and two types of derivative files: an access image and a thumbnail image.

Master Image
  • Represents as closely as possible the information contained in the original
  • Uncompressed, or lossless compression
  • Unedited
  • Serves as long-term source for derivative files and print reproductions
  • Can serve as surrogate for the original
  • High quality
  • Large file size
  • Stored in TIFF file format

Access Image
  • Used in place of master image for general web access
  • Generally fits within viewing area of average monitor
  • Reasonable file size for fast download time; does not require a fast network connection
  • Acceptable quality for general research
  • Compressed for speed of access
  • Usually stored in JPEG or JPEG2000 file format, or PDF when dealing with text

Thumbnail Image
  • A very small image usually presented with the bibliographic record
  • Designed to display quickly online; allows user to determine whether they want to view access image
  • Usually stored in GIF or JPEG file formats
  • Not always suitable for images consisting primarily of text, musical scores, etc.; user cannot tell what content is at so small a scale

From the Western States Digital Standards Group, Digital Imaging Working Group, Digital Imaging Best Practices, http://www.mndigital.org/digitizing/standards/imaging.pdf, January 2003.

Master Images

The digital master image represents, as accurately as possible, the visual information in the original object. This image’s primary function is to serve as a long-term archival record, as well as a source for derivative files and printed materials. A high-quality master image eliminates the need to re-digitize, and therefore re-handle, the same potentially fragile physical materials again in the future. A master image should also support the production of a printed page facsimile that is a legible and faithful stand-in for the original when printed at the same size.

Some general guidelines for creating digital master files:

  • Specific guidelines should be developed for the size and resolution of digital master files based on individual collection needs and requirements.
  • When scanning text documents, the scanning resolution may need to be adjusted according to the size of text in the document. Documents with smaller printed text may require higher resolutions and bit depths than documents containing large typefaces. A higher resolution may offer increased accuracy for Optical Character Recognition (OCR) processing.
  • Scanned master images should not be edited for any specific output or use, and should be saved as large TIFF files with lossless or no compression.
  • Where possible, scanning guidelines for the creation of digital master files should follow the specifications outlined in the Federal Agencies Digital Guidelines Initiative (FADGI) Still Image Working Group’s Technical Guidelines for Digitizing Cultural Heritage Materials: Creation of Raster Image Master Files.

Derivative Images

Derivative files are used for editing and enhancement, conversion to different formats, and presentation or transmission over networks. In the case of text works that comprise more than one page, derivative images can be compiled into a single file that represents the entire work.


Derivative images can be created using image editing applications such as Adobe Photoshop, GIMP (a freely available open-source image editing program), or Microsoft Office Picture Editor. Some applications, like Adobe Acrobat, can automatically downsize images when compiling a file made up of multiple page images, eliminating the need to create derivative copies by hand.

Access Images

Access images represent the version of the image that users viewing the digital items online will interact with. Access images should be of sufficient size and resolution to allow for detailed study, but not so large that they take too long to load in the browser. Access images may also be edited to improve the viewing experience for the user, through such processes as cropping, straightening, color correction, sharpening, or descreening. These edits can be made using image editing software such as Adobe Photoshop or GIMP.

Thumbnail Images

Thumbnail images are small, low-resolution versions of the content, usually displayed in the search results view of online digital collections, that give the user a preview of the larger image. Some digital asset management systems will automatically generate a thumbnail image for each item loaded into the software.
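When access and thumbnail derivatives are produced from a master, the usual concern is shrinking the image to a target size without distorting it. The sketch below shows only the dimension arithmetic, not the actual resampling, which an image editor or imaging library would perform. The function name `fit_long_dimension` and the target sizes of 3,000 and 200 pixels are assumptions chosen for illustration.

```python
def fit_long_dimension(width_px: int, height_px: int, max_long: int) -> tuple[int, int]:
    """Scale dimensions so the longer side is at most `max_long`.

    The aspect ratio is preserved and the image is never enlarged.
    """
    long_side = max(width_px, height_px)
    if long_side <= max_long:
        return width_px, height_px
    scale = max_long / long_side
    return max(1, round(width_px * scale)), max(1, round(height_px * scale))


master = (5100, 6600)  # assumed master image size in pixels
access = fit_long_dimension(*master, 3000)
thumbnail = fit_long_dimension(*master, 200)

print(access)     # (2318, 3000)
print(thumbnail)  # (155, 200)
```

Derivatives should always be generated from the master file rather than from another derivative, so that each one is only resampled once.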

Machine-Readable Text

Machine-readable text results from either a scanning and conversion process (OCR) performed on textual materials or from manually transcribing or re-keying text with word processing software to produce some form of machine-readable text file that can be indexed and searched, offering users better access to the intellectual content of the work.

Text-based materials may be handled in various ways. Methods will depend on factors such as library resources, quality of the original materials, software requirements, and end user needs.

Digital Text Basics

Digital representations of text are based on the concept of character encoding: each character in a given repertoire is assigned a numeric code, which is in turn represented as a sequence of bits so that text can be stored and transmitted in digital form. The character encoding used in a file determines which characters can be represented in the file. Currently, 8-bit Unicode Transformation Format (UTF-8) is the generally accepted standard for digital texts. UTF-8 can accommodate not only Latin-based characters but also Greek, Cyrillic, Hebrew, Arabic, and many other scripts. For these reasons, it is recommended that all textual documents be encoded as UTF-8.

Most computer programs can save text-based documents (plain text files, XML, or HTML) as a UTF-8 encoded document. Additionally, some document formats, such as XML and HTML, provide a way to explicitly declare the file as UTF-8 encoded within the markup, which a parser can then use to interpret the rest of the document. In XML, this can be seen easily in the first line of the file, where the type of file is declared (XML) and so is its encoding (UTF-8). Before saving a text file, check the software’s save options to make sure that UTF-8 encoding is being used.
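As a minimal sketch of what this looks like in practice, the Python example below writes a small XML document as UTF-8 with an explicit encoding declaration and then reads it back. The element name `note` and the sample text are invented for the example, and the document is written to an in-memory buffer rather than a file.

```python
import io
import xml.etree.ElementTree as ET

# A tiny document containing Greek, Cyrillic, Hebrew, and Arabic text.
root = ET.Element("note")  # hypothetical element name
root.text = "Ελληνικά, Кириллица, עברית, العربية"

# Serialize as UTF-8 and ask for the XML declaration on the first line.
buffer = io.BytesIO()
ET.ElementTree(root).write(buffer, encoding="utf-8", xml_declaration=True)
data = buffer.getvalue()

# The first line declares both the file type (XML) and the encoding.
first_line = data.split(b"\n", 1)[0].decode("utf-8")
print(first_line)

# Parsing the bytes again recovers the original characters.
round_trip = ET.fromstring(data).text
print(round_trip == root.text)
```

A parser reads the declaration first and uses the stated encoding to interpret the remaining bytes, which is why the declaration and the actual encoding of the file must agree.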

Optical Character Recognition (OCR)

OCR is the process of electronically translating a scanned bitmapped image of text material into machine-readable text. A computer program “reads” the character content within the image and creates a digital version of the text, usually but not always in a separate file. This allows the text to be searched and indexed, or used in other processes such as data mining or machine translation.

The accuracy of the OCR process depends on a number of factors, including the quality of the image being scanned, the language that the text is written in, and the type of font used in printing. Poor quality images where the text is not clearly contrasted with the background, text in non-European foreign languages (or non-Latin character sets), and text rendered in serif fonts can all decrease the accuracy of the resulting text file. At this time, hand-printed manuscripts are extremely difficult for OCR software to interpret, and those written in cursive are basically impossible. However, with a clear typeset image, an accuracy of 80%-90% may be achieved through the use of readily available and relatively inexpensive software.
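The guide does not say how the accuracy percentage is measured. One common approach, assumed here, is character accuracy: compare the OCR output with a hand-corrected transcription and count the single-character insertions, deletions, and substitutions needed to turn one into the other (the edit distance). The sample strings below are invented.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def character_accuracy(truth: str, ocr: str) -> float:
    """Return 1 - (edit distance / length of the correct text), floored at 0."""
    if not truth:
        return 1.0 if not ocr else 0.0
    return max(0.0, 1 - edit_distance(truth, ocr) / len(truth))


truth = "Digital Collections"
ocr_output = "Digita1 Co1lections"  # two letters misread as the digit 1

accuracy = character_accuracy(truth, ocr_output)
print(f"{accuracy:.1%}")  # 89.5%
```

Measuring a sample of pages this way can help decide whether OCR output is good enough to use as is or needs manual correction.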

The advantage of OCR is that it eliminates the need for costly, time-consuming transcription. For most libraries transcription may not be an option, and so even an inaccurate rendering as produced by OCR is still an advantage over having no digital representation of the text at all. OCR routines can also be set up as part of the digitization workflow and do not require a significant time investment. For documents where the accuracy of the machine-readable text is of primary importance, the OCR-produced text can be manually corrected.

Some digital asset management systems offer OCR modules. Other software applications include Adobe Acrobat, ABBYY FineReader, and OmniPage.

Transcriptions

Text that is difficult to read or that cannot be reliably OCR’d, especially handwritten manuscripts, should be considered for transcription. However, transcription presents its own problems (it can be labor intensive and cost prohibitive), so a decision needs to be made as to when the importance of providing full-text searching of the content makes the time investment worthwhile.

Text Encoding & Markup Languages

Transcribed text can also be encoded with markup languages, such as XML or XHTML, to provide a digital representation of the semantic and physical document structure. Text encoding provides a machine-readable means of denoting structural text elements such as italics, bold type, line breaks, stanzas, paragraphs, page breaks, chapters, etc. Semantic elements of the text, such as geographical locations or personal names, can also be marked.

The most widely used standard for encoding text-based cultural materials is an XML-based schema developed by the Text Encoding Initiative (TEI). The TEI Guidelines for Electronic Text Encoding and Interchange “define and document a markup language for representing the structural, renditional, and conceptual features of texts,” with a focus on primary source materials for research and analysis.

Like transcription, text encoding requires a significant investment of resources, and encoded texts require specialized systems and applications to parse, process, index, and display the content in any meaningful way.
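To give a sense of what encoded text looks like, the sketch below builds one paragraph that marks a personal name and a place name, using TEI element names (`p`, `persName`, `placeName`) in the TEI namespace. It is a fragment for illustration only: it is not a complete TEI document, it has not been validated against the TEI schema, and the sentence and names are invented.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)  # serialize TEI as the default namespace


def tei(tag: str) -> str:
    """Return the namespace-qualified form of a TEI element name."""
    return f"{{{TEI_NS}}}{tag}"


# One paragraph with semantic markup for a person and a place.
p = ET.Element(tei("p"))
p.text = "Letter from "
name = ET.SubElement(p, tei("persName"))
name.text = "Jane Doe"
name.tail = ", written in "
place = ET.SubElement(p, tei("placeName"))
place.text = "Springfield"
place.tail = "."

xml_text = ET.tostring(p, encoding="unicode")
print(xml_text)

# The plain reading text can still be recovered by discarding the tags.
plain_text = "".join(p.itertext())
print(plain_text)

# Marked-up names can be pulled out for indexing.
people = [el.text for el in p.iter(tei("persName"))]
print(people)
```

The markup adds no visible words to the text, but it lets software build indexes of people and places that a plain transcription cannot support.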

Combining Multiple Files into a Single Digital Object

As discussed previously, text materials often consist of many pages that collectively comprise a work. Therefore, a digital facsimile of such a work must include a way to compile many separate scans and images into a single file that maintains the order and structure of the original object. Several digital formats can provide this functionality, including Adobe PDF, DjVu, and ePub.

Technical Metadata

In the interest of preservation and reproduction, it is helpful to capture technical metadata in the creation of the digital image file. Digital cameras and scanners can automatically capture this information and embed it in the object file. NISO Standard Z39.87 (Data Dictionary - Technical Metadata for Digital Still Images) is widely accepted for use in the management of technical metadata. Among the attributes that can be described by the technical metadata are the following:

  • file format
  • file resolution (pixels per inch)
  • dimensions (image dimension or size in inches or centimeters)
  • bit-depth (e.g., 8-bit, 16-bit, 24-bit, etc.)
  • color mode (e.g., RGB, CMYK, or grayscale)
  • scanner or digital camera brand, name, and model number
  • software used to manipulate or compress the image, including the software name and version.

From Best Practices for Technical Metadata: http://www.library.illinois.edu/dcc/bestpractices/chapter_10_technicalmetadata.html
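The attributes listed above can be thought of as one record per image file. The sketch below models such a record in Python and serializes it to JSON. The field names are hypothetical and are not the element names defined in NISO Z39.87; the device and software values are placeholders.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class TechnicalMetadata:
    """One technical metadata record; field names are illustrative only."""
    file_format: str
    resolution_ppi: int
    width_px: int
    height_px: int
    bit_depth: int
    color_mode: str
    capture_device: str
    software: str

    @property
    def size_inches(self) -> tuple[float, float]:
        """Physical dimensions implied by the pixel size and resolution."""
        return (self.width_px / self.resolution_ppi,
                self.height_px / self.resolution_ppi)


record = TechnicalMetadata(
    file_format="TIFF",
    resolution_ppi=600,
    width_px=5100,
    height_px=6600,
    bit_depth=24,
    color_mode="RGB",
    capture_device="Example Scanner Model X",   # placeholder value
    software="Example Imaging Software 1.0",    # placeholder value
)

record_json = json.dumps(asdict(record), indent=2)
print(record_json)
print(record.size_inches)  # (8.5, 11.0)
```

Storing resolution together with pixel dimensions makes it possible to recover the size of the original object later, which matters for print reproduction.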

Guidelines for Text Images

| Image Type | Resolution | Bit-Depth | File Format |
|---|---|---|---|
| Master Image (Black & White) | 300 ppi | 8-bit | TIFF |
| Master Image (Grayscale) | 600 ppi | 8-bit | TIFF |
| Master Image (Color or materials with finely printed text) | 600 ppi | 24-bit | TIFF |
| Access Image (Black & White) | 72 - 200 ppi | 8-bit | JPEG or JPEG2000 |
| Access Image (Grayscale) | 72 - 200 ppi | 8-bit | JPEG or JPEG2000 |
| Access Image (Color or materials with finely printed text) | 72 - 200 ppi | 24-bit | JPEG or JPEG2000 |

Guidelines for Creating Accessible PDFs

  • Use Adobe Acrobat X Pro to combine the TIFF master files into a single file.
  • Apply OCR.
  • Reduce the file size.
  • Make the file compatible with Acrobat Reader 4.0 or later.

Guidelines for Slides or Negative Film

| Original | Pixel Array | Resolution | Dimensions | Bit-Depth | Color Mode | Archival File Format |
|---|---|---|---|---|---|---|
| Rectangular format original (Black and White) | 3,000 pixels across the long dimension; 4,000 pixels across the long dimension (preferred) | 300 ppi | 10 inches on the long dimension, or exact size of original for smaller objects | 8-bit | Grayscale | TIFF |
| Rectangular format original (Color) | 3,000 pixels across the long dimension; 4,000 pixels across the long dimension (preferred) | 300 ppi | 10 inches on the long dimension, or exact size of original for smaller objects | 24-bit | Color | TIFF |
| Square format original (Black and White) | 2,700 pixels across the long dimension | 300 ppi | 10 inches on the long dimension, or exact size of original for smaller objects | 8-bit | Grayscale | TIFF |
| Square format original (Color) | 2,700 pixels across the long dimension | 300 ppi | 10 inches on the long dimension, or exact size of original for smaller objects | 24-bit | Color | TIFF |

Image Digitization

Images may include such items as photographs, maps, plans, blueprints, drawings, paintings, and other two-dimensional visual media.

In many instances, images will contain or have accompanying textual material. Due to this dual nature, the digitization of images is very similar to the digitization of text.

Digital Images

A digital image is a two-dimensional array of small square regions known as pixels. For each pixel, the digital image file contains numeric values about color and brightness. There are three basic types of digital images:

  • bitonal (monochrome) - each pixel is either black or white; there is no gradation.
  • grayscale - each pixel contains a value in the range from 0 to 255, where 0 represents black, 255 represents white, and values in between represent shades of gray.
  • color - each pixel contains a triple of numeric values, one for each of the primary colors red, green, and blue (RGB), where 0 indicates that none of that primary color is present in the pixel and 255 indicates the maximum amount of that primary color.

Bit-depth or color depth refers to the amount of detail that is used to make the measurements of color and brightness. (It can be thought of as the number of marks on a ruler.) A higher bit depth indicates a greater level of detail that is captured about the image. Most digital images are 8-bit, 16-bit, or 24-bit.

The resolution of a digital image is measured in pixels per inch (ppi, also commonly referred to as dpi, or dots per inch). The higher the ppi, the greater the resolution and detail that will be captured.

Scanning Basics

Due to the wide variety of scanners and scanning software available, a comprehensive discussion of best practices for scanner operation is not possible in this guide. “The Art of Scanning” by Paul Royster provides a solid introduction to scanning and image editing techniques for text-based and image-based digital collections.

Scanners generally offer three different modes of image capture, which correspond to the three types of digital images: black-and-white, grayscale, and color:

  • Black-and-White (aka bitonal or monochrome) - One bit per pixel representing black or white. This mode is best suited to high-contrast documents such as printed black-and-white text, line art, or illustrations.
  • Grayscale - Multiple bits per pixel representing shades of gray. Grayscale is best suited to older documents with poor legibility or diffuse characters (e.g. carbon copies, Thermofax/Verifax, etc.), handwritten documents, items with low inherent contrast between the text and background, stained or faded materials, and works with halftone illustrations or photographs accompanying the text.
  • Color - Multiple bits per pixel representing color. Color scanning is best suited to materials containing color information, such as an illuminated manuscript or other documents where the color and texture of the paper is an important part of the work.

Scanning in color will produce the largest file sizes (in terms of bytes), grayscale the second largest, and bitonal the smallest. Libraries should choose the mode that best suits the material. If there is no advantage to scanning in grayscale or color, then bitonal mode is acceptable, assuming there is no significant loss of information. Master copies can also be created in color or grayscale and then converted to bitonal for access images.

Creating Images

For each object or page being scanned or photographed, a high-resolution master or archival file should be created. From that master file, lower-resolution derivative files will be created that are better suited to be delivered and viewed online or compiled into a file containing all the pages of a work.

The chart below describes the differences between master images and two types of derivative files: an access image and a thumbnail image.

Master Image
  • Represents as closely as possible the information contained in the original
  • Uncompressed, or lossless compression
  • Unedited
  • Serves as long-term source for derivative files and print reproductions
  • Can serve as surrogate for the original
  • High quality
  • Large file size
  • Stored in TIFF file format

Access Image
  • Used in place of master image for general web access
  • Generally fits within viewing area of average monitor
  • Reasonable file size for fast download time; does not require a fast network connection
  • Acceptable quality for general research
  • Compressed for speed of access
  • Usually stored in JPEG or JPEG2000 file format

Thumbnail Image
  • A very small image usually presented with the bibliographic record
  • Designed to display quickly online; allows user to determine whether they want to view access image
  • Usually stored in GIF or JPEG file formats
  • Not always suitable for images consisting primarily of text, musical scores, etc.; user cannot tell what content is at so small a scale

From the Western States Digital Standards Group, Digital Imaging Working Group, Digital Imaging Best Practices, available on the Lyrasis site (http://www.lyrasis.org/), January 2003, and Technical Guidelines for Digitizing Cultural Heritage Materials: Creation of Raster Image Master Files, http://www.digitizationguidelines.gov/guidelines/FADGI_Still_Image-Tech_Guidelines_2010-08-24.pdf, August 2010.

Master Images

The digital master image represents, as accurately as possible, the visual information in the original object. This image’s primary function is to serve as a long-term archival record, as well as a source for derivative files and printed materials. A high-quality master image eliminates the need to re-digitize, and therefore re-handle, the same potentially fragile physical materials again in the future. A master image should also support the production of a printed page facsimile that is a legible and faithful stand-in for the original when printed at the same size.

Some general guidelines for creating digital master files:

  • Specific guidelines should be developed for the size and resolution of digital master files based on individual collection needs and requirements.
  • Scanned master images should not be edited for any specific output or use, and should be saved as large TIFF files with lossless or no compression.
  • Where possible, scanning guidelines for the creation of digital master files should follow the specifications outlined in the Federal Agencies Digital Guidelines Initiative (FADGI) Still Image Working Group’s Technical Guidelines for Digitizing Cultural Heritage Materials: Creation of Raster Image Master Files.

Derivative Images

Derivative files are used for editing and enhancement, conversion to different formats, and presentation or transmission over networks. In the case of text works that comprise more than one page, derivative images can be compiled into a single file that represents the entire work.


Derivative images can be created using image editing applications such as Adobe Photoshop, GIMP (a freely available open-source image editing program), or Microsoft Office Picture Editor. Some applications, like Adobe Acrobat, can automatically downsize images when compiling a file made up of multiple page images, eliminating the need to create derivative copies by hand.

Access Images

Access images represent the version of the image that users viewing the digital items online will interact with. Access images should be of sufficient size and resolution to allow for detailed study, but not so large that they take too long to load in the browser. Access images may also be edited to improve the viewing experience for the user, through such processes as cropping, straightening, color correction, sharpening, or descreening. These edits can be made using image editing software such as Adobe Photoshop or GIMP.

Thumbnail Images

Thumbnail images are small, low-resolution versions of the content, usually displayed in the search results view of online digital collections, that give the user a preview of the larger image. Some digital asset management systems will automatically generate a thumbnail image for each item loaded into the software.

Monitor Calibration

Monitors used for image editing and color correction should be calibrated according to the following specifications:

  • Set the display to 24-bit color (millions of colors)
  • Set the monitor gamma to 2.2
  • Set the color temperature to 5000 K

From Technical Guidelines for Digitizing Cultural Heritage Materials: Creation of Raster Image Master Files http://www.digitizationguidelines.gov/guidelines/FADGI_Still_Image-Tech_Guidelines_2010-08-24.pdf, August 2010

Technical Metadata

In the interest of preservation and reproduction, it is helpful to capture technical metadata in the creation of the digital image file. Digital cameras and scanners can automatically capture this information and embed it in the object file. NISO Standard Z39.87 (Data Dictionary - Technical Metadata for Digital Still Images) is widely accepted for use in the management of technical metadata. Among the attributes that can be described by the technical metadata are the following:

  • file format
  • file resolution (pixels per inch)
  • dimensions (image dimension or size in inches or centimeters)
  • bit-depth (e.g., 8-bit, 16-bit, 24-bit, etc.)
  • color mode (e.g., RGB, CMYK, or grayscale)
  • scanner or digital camera brand, name, and model number
  • software used to manipulate or compress the image, including the software name and version.

From Best Practices for Technical Metadata: http://www.library.illinois.edu/dcc/bestpractices/chapter_10_technicalmetadata.html

Guidelines for Images

| Image Type | Pixel Array | Resolution | Bit-Depth | Color Mode | Archival File Format |
|---|---|---|---|---|---|
| Master Image (Grayscale) | 4,000 - 6,000 pixels across the long dimension | 300 ppi, or more if necessary to produce an image of 4,000 - 6,000 pixels across the long dimension | 8-bit | Grayscale | TIFF |
| Master Image (Color) | 4,000 - 6,000 pixels across the long dimension | 300 ppi, or more if necessary to produce an image of 4,000 - 6,000 pixels across the long dimension | 24-bit | Color | TIFF |
| Access Image (Grayscale) | 800 - 3,000 pixels across the long dimension | 72 - 300 ppi | 8-bit | Grayscale | JPEG or JPEG2000 |
| Access Image (Color) | 800 - 3,000 pixels across the long dimension | 72 - 300 ppi | 24-bit | Color | JPEG or JPEG2000 |
| Thumbnail Image | 150 - 200 pixels across the long dimension | 72 ppi | 24-bit | Grayscale or Color | GIF or JPEG |
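The master image rows call for 300 ppi, or more when 300 ppi would not yield enough pixels across the long dimension. The sketch below works out the scanning resolution that rule implies for originals of different sizes. The function name `required_ppi` and the sample sizes are assumptions made for the example.

```python
import math


def required_ppi(long_side_in: float, target_long_px: int, minimum_ppi: int = 300) -> int:
    """Return the scanning resolution needed to reach `target_long_px`.

    The result never drops below `minimum_ppi`.
    """
    if long_side_in <= 0:
        raise ValueError("long_side_in must be positive")
    return max(minimum_ppi, math.ceil(target_long_px / long_side_in))


# Target: 4,000 pixels across the long dimension.
print(required_ppi(10, 4000))  # 400
print(required_ppi(5, 4000))   # 800
print(required_ppi(20, 4000))  # 300
```

Small originals need much higher scanning resolutions than large ones to reach the same pixel array, which is why the table states the resolution as a minimum rather than a fixed value.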

Audio Digitization

Digital audio is the reproduction and transmission of sound stored in a digital format. In a digital recording, the sound wave is converted to a digital representation and stored as a series of samples in a digital audio file format. For commercial audio CDs, the standard sample rate is 44.1 kHz with a bit depth of 16 bits. This is based on the fact that the highest frequency that can be recorded is one half of the sample rate, so a 44.1 kHz recording supports a top frequency of about 22 kHz (22.05 kHz). Since most humans cannot hear sounds above 20 kHz, the music industry adopted the 44.1 kHz sampling rate for digital audio recordings knowing that it would capture all human-audible sound.
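The half-the-sample-rate rule described above is simple enough to check directly. The short sketch below applies it to three common sample rates; the function name `nyquist_khz` is chosen for the example.

```python
def nyquist_khz(sample_rate_hz: int) -> float:
    """Highest recordable frequency in kHz: one half of the sample rate."""
    return sample_rate_hz / 2 / 1000


for rate in (44_100, 48_000, 96_000):
    print(f"{rate} Hz sample rate -> up to {nyquist_khz(rate)} kHz")
```

A 96 kHz sample rate therefore captures frequencies up to 48 kHz, well beyond the roughly 20 kHz limit of human hearing.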

Preservation-quality audio and audio digitized from an analog source should use a higher sample rate and bit depth. The current standard for archival digitization of analog recordings is 96 kHz with a 24-bit depth. There are several reasons for this:

  • the accurate capture of noise such as clicks and pops, and of other inaudible information at frequencies beyond what a 44.1 kHz sample rate can capture
  • the desire to communicate inaudible harmonic information that affects the perception of sound
  • the ability to record and provide content that, although not necessarily heard, helps listeners better perceive space, depth, and instrument location in stereo and surround sound recordings
  • to accommodate future user applications

Master audio files should be saved uncompressed and in a widely used file format with the maximum likelihood of continued support.

Commonly used digital audio file formats include:

  • WAV PCM (Waveform Audio File Format using Pulse-code Modulation)
  • BWF (Broadcast Wave Format)
  • AIFF (Audio Interchange File Format)
  • MP3

WAV PCM or WAV BWF are preferred for long-term storage and maximum flexibility. AIFF is acceptable, but because it is an Apple format it is not as broadly used.
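As a rough illustration of what an uncompressed WAV PCM file contains, the sketch below uses Python's standard `wave` module to write a short 96 kHz, 24-bit test tone to an in-memory buffer and then reads the parameters back. The tone, its length, and the mono channel layout are arbitrary choices for the example. The module writes plain WAV PCM; it does not add the extra metadata chunk that distinguishes a BWF file.

```python
import io
import math
import wave

SAMPLE_RATE = 96_000   # samples per second
SAMPLE_WIDTH = 3       # bytes per sample (3 bytes = 24-bit)
CHANNELS = 1           # mono, chosen to keep the example small
DURATION_MS = 100      # length of the test tone
TONE_HZ = 440          # arbitrary test frequency

n_frames = SAMPLE_RATE * DURATION_MS // 1000
peak = 0.5 * (2 ** 23 - 1)  # half of the 24-bit signed maximum

# Build the PCM data: one little-endian signed 24-bit integer per sample.
frames = bytearray()
for n in range(n_frames):
    sample = int(peak * math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE))
    frames += sample.to_bytes(SAMPLE_WIDTH, "little", signed=True)

buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav:
    wav.setnchannels(CHANNELS)
    wav.setsampwidth(SAMPLE_WIDTH)
    wav.setframerate(SAMPLE_RATE)
    wav.writeframes(bytes(frames))

wav_bytes = buffer.getvalue()

# Read the file back to confirm the stored parameters.
with wave.open(io.BytesIO(wav_bytes), "rb") as check:
    params = (check.getframerate(), check.getsampwidth() * 8, check.getnframes())

print(params)          # (96000, 24, 9600)
print(len(frames))     # 28800 bytes of audio for 0.1 seconds
```

At this rate one tenth of a second of mono audio already takes 28,800 bytes, which shows why master files are large and why compressed access copies are made for delivery.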

Access copies of audio files may be saved in compressed formats that allow for quicker transfer or streaming via the Internet. The MP3 format is widely supported, is playable on nearly all handheld devices, and is commonly used for web delivery, which makes it the preferred access format. MP3 files saved at a bit rate of 192 Kbps are recommended for good-quality compressed audio.

Guidelines for Audio Preservation

Source Material                                                                              Sample Rate (Hz)  Bit-Depth  Archival File Format
Audio from commercial CD                                                                     44,100            16-bit     WAV
Human voice only, no instrumental music                                                      48,000            24-bit     WAV
Oral history recording, especially if there is music; natural sound or sounds from nature    96,000            24-bit     WAV
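To give a sense of the storage these preservation settings imply, the following sketch estimates the size of one hour of uncompressed PCM audio. It assumes stereo (two channels), which the guidelines do not specify, takes the CD rate as the 44.1 kHz stated in the Audio Digitization section, and ignores the small file header. The function name is hypothetical.

```python
def uncompressed_audio_bytes(sample_rate_hz: int, bit_depth: int,
                             channels: int, seconds: int) -> int:
    """Size of raw PCM audio: samples per second x bytes per sample x channels x time."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels * seconds


ONE_HOUR = 3600  # seconds

# (label, sample rate in Hz, bit depth); stereo is an assumption for this example.
settings = [
    ("Commercial CD", 44_100, 16),
    ("Voice only", 48_000, 24),
    ("Oral history with music / natural sound", 96_000, 24),
]

for label, rate, depth in settings:
    size_gb = uncompressed_audio_bytes(rate, depth, 2, ONE_HOUR) / 1e9
    print(f"{label}: about {size_gb:.2f} GB per hour")
```

Under these assumptions, an hour of 96 kHz, 24-bit stereo audio needs a little over 2 GB, roughly three times the space of CD-quality audio.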

Guidelines for Audio Presentation

Content                       Bit Rate  File Format
Oral history, without music   128 Kbps  MP3
Most audio                    192 Kbps  MP3
Very high fidelity required   320 Kbps  MP3
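The bit rates above translate directly into file size. The sketch below estimates the size of a one-hour access copy at each rate, assuming a constant bit rate and 1 Kbps = 1,000 bits per second; the function name is hypothetical and real MP3 files will vary slightly because of headers and metadata.

```python
def mp3_megabytes(bit_rate_kbps: int, seconds: int) -> float:
    """Approximate size of a constant-bit-rate file in megabytes (10^6 bytes)."""
    bytes_per_second = bit_rate_kbps * 1000 / 8
    return bytes_per_second * seconds / 1e6


for rate in (128, 192, 320):
    print(f"{rate} Kbps for one hour: about {mp3_megabytes(rate, 3600):.1f} MB")
```

At the recommended 192 Kbps, an hour of audio comes to roughly 86 MB, which is small enough for routine web delivery.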

Video Digitization

All moving image formats (film, analog television, and digital) consist of a sequence of frames (still images) which when displayed at a constant rate and in rapid succession create the illusion of movement. An audio stream is often coupled with the image stream to provide sound with the moving images. In digital video, frames consist of bitmapped digital images, and the audio stream consists of a digital audio file.

There are seven fundamental concepts of digital video technology:

  • the size and shape of the individual images (frames)
  • the resolution at which the images are captured
  • the compression processes used to encode the information (codecs)
  • the speed at which the images follow each other at playback
  • the overall length of the program
  • the accompanying audio
  • the size of the files created


The Size and Shape of the Individual Images (frames)

Image Size or Resolution

The size of the video stream is expressed as the number of horizontal pixels (width) multiplied by the number of vertical pixels (height). The horizontal measurement is sometimes referred to as the number of “samples,” and the vertical measurement is sometimes called the number of “lines.” The larger the image size, the greater the clarity and fidelity of the images.

An image size of 320 x 240 means that the video stream is 320 pixels/samples in width by 240 pixels/lines in height. 320 x 240 resolution is often referred to as “quarter-screen” size, while 640 x 480 is “full screen.” Standard-definition (SD) television uses 640 x 480, while high-definition (HD) formats may be in larger sizes such as 1280 x 720 or 1920 x 1080.

Aspect Ratio

This value represents the width of the video stream divided by its height. Most analog television and video material has a 4:3 aspect ratio, which is used in SD television, while 16:9 is used for HD television and European digital television. Most commercial films appearing in theatres have an aspect ratio of 1.85:1 or 2.39:1.
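The relationship between image size and aspect ratio can be checked by reducing width and height by their greatest common divisor. This is a small illustrative sketch (the function name is hypothetical) using the image sizes mentioned above.

```python
from math import gcd


def aspect_ratio(width: int, height: int) -> tuple[int, int]:
    """Reduce a pixel size to its simplest width:height ratio."""
    divisor = gcd(width, height)
    return width // divisor, height // divisor


for width, height in ((320, 240), (640, 480), (1280, 720), (1920, 1080)):
    ratio_w, ratio_h = aspect_ratio(width, height)
    print(f"{width} x {height} -> {ratio_w}:{ratio_h}")
```

The two SD sizes reduce to 4:3 and the two HD sizes reduce to 16:9.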

The Resolution at Which the Images are Captured

Sampling

For each pixel in an image within a digital video stream, three elements are recorded: a “luma” element, corresponding to the brightness level; and two “chroma” elements, corresponding to the color levels for red and blue. The more precise the measurement of these elements is, the higher the resolution will be.

To reduce the size of digital video files, some chroma subsampling is almost always used. Humans are generally more sensitive to changes in brightness than changes in color, so in principle video formats can be made more efficient by sampling the chroma elements at a lower rate.

Common sampling schemes include:

  • 4:4:4 – Luma and chroma elements are sampled at every pixel
  • 4:2:2 – Luma is sampled at every pixel; chroma at every other pixel (basically, the chroma resolution is halved)
  • 4:2:0 – Luma is sampled at every pixel; chroma at every other pixel, and red and blue values are sampled alternately on every other line
  • 4:1:1 – Luma is sampled at every pixel; chroma at every 4th pixel

4:2:2 is a commonly used format in high-end digital video equipment, while 4:2:0 is used in almost all MPEG-based formats, including DVDs. It should be noted that 4:4:4 is the only true “lossless” sampling scheme.

The type of subsampling used when the material is digitized is directly related to the type of codec used to encode the file.
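To see how much data each sampling scheme saves, the sketch below counts the average number of samples stored per pixel. It uses the conventional reading of the J:a:b notation over a block 4 pixels wide and 2 lines high: 8 luma samples, plus (a + b) samples for each of the two chroma elements. The function name is hypothetical.

```python
def samples_per_pixel(scheme: str) -> float:
    """Average samples stored per pixel for a J:a:b chroma subsampling scheme."""
    j, a, b = (int(part) for part in scheme.split(":"))
    luma_samples = 2 * j          # every pixel in a block J wide and 2 lines high
    chroma_samples = 2 * (a + b)  # two chroma elements, sampled a + b times each
    return (luma_samples + chroma_samples) / (2 * j)


full = samples_per_pixel("4:4:4")
for scheme in ("4:4:4", "4:2:2", "4:2:0", "4:1:1"):
    count = samples_per_pixel(scheme)
    print(f"{scheme}: {count} samples per pixel ({count / full:.0%} of 4:4:4)")
```

By this count, 4:2:2 stores two thirds of the data of 4:4:4, while 4:2:0 and 4:1:1 each store half.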

Video Sample Size (Bit Depth)

This value, represented in bits, indicates how detailed the measurement of the luma and chroma elements is. The higher the number, the more detailed the measurement. The value is expressed as the sum of the bit depth for the 3 sampled elements (1 luma, 2 chroma). Individual elements are usually sampled at a depth of 8 or 10 bits, meaning that most digital video files have a 24 or 30 bit sample size.

Scanning: Progressive vs. Interlaced

“Scanning” in this context refers to the means by which the frame image is captured. There are two main types of scanning used in digital video: interlaced and progressive. Each has advantages and disadvantages.

  • Interlaced: Two fields or “exposures,” each containing one-half of the image information, are captured about 1/60th of a second apart. One field consists of the odd-numbered lines, the other the even-numbered lines. Due to subject movement between the capture of the two fields, some blurring is possible when the two fields are combined in a single frame, which often occurs when transferring analog video to digital. Most existing analog video is in interlaced format.
  • Progressive: The entire image is captured in a single exposure. This allows for increased clarity, but requires more processing power to render. Most born-digital video uses progressive scanning.

During playback, progressive and interlaced video will be presented to the viewer in the same format in which they were scanned.
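The way two interlaced fields combine into one frame can be illustrated with a toy example in which each scan line is just a text label. This is a simplified sketch with hypothetical function names, not a real deinterlacing routine; actual software works on pixel data and usually compensates for motion between fields.

```python
def split_fields(frame: list[str]) -> tuple[list[str], list[str]]:
    """Separate a frame into its odd-numbered and even-numbered lines (counting from 1)."""
    return frame[0::2], frame[1::2]


def weave_fields(odd_lines: list[str], even_lines: list[str]) -> list[str]:
    """Interleave two fields back into a single full frame."""
    frame = []
    for odd_line, even_line in zip(odd_lines, even_lines):
        frame.append(odd_line)
        frame.append(even_line)
    return frame


frame = ["line 1", "line 2", "line 3", "line 4"]
odd_field, even_field = split_fields(frame)
print(odd_field)                            # lines captured in the first exposure
print(even_field)                           # lines captured in the second exposure
print(weave_fields(odd_field, even_field))  # recombined frame
```

Because the two fields are captured at slightly different moments, a moving subject will sit in a different position in each field, which is the source of the blurring described above.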

Compression Processes Used to Encode the Information (Codecs)

Given the need for reducing the size of digital video files for both long-term storage and web dissemination, substantial effort has gone into the creation of algorithms and programs to compress the digital information within the file at the time of creation and de-compress it at the time of playback. These programs are commonly referred to as “codecs” (compressor-decompressor). The codec used during file creation is directly related to the type of chroma subsampling (see “Sampling” above) used when the file is encoded. (Codecs and encoding standards are often referred to interchangeably.)

Codecs may be lossless (no information is discarded during the encoding process), but most involve lossy compression, meaning that some information is discarded at the time of encoding. Depending on the importance of the digital video file being created, the purpose of the collection, and the storage capabilities of your institution, using a lossy codec may be an acceptable practice. (In reality, the vast majority of digital video file formats and collections in existence use lossy compression.)

Lossless codecs:

  • Huffyuv – An open, fast, lossless Win32 video codec
  • JPEG2000 (Lossless) – An image compression standard that can be either lossless or lossy, depending on how it is implemented
  • Lagarith – A more recent codec derived from Huffyuv
  • Uncompressed YCbCr – This captures the full luma (Y) and chroma (CbCr) information for every pixel. (In reality, this is not a codec, since there is no compression.)

Lossy codecs:

  • DV – The codec used by many digital video cameras.
  • H.263 – Used to encode Flash video or other formats intended for low-bandwidth online or mobile dissemination
  • MPEG-2 Part 2 – Also known as H.262. This codec is used for all DVD video.
  • MPEG-4 – This standard has a wide range of applications, from low-end web-disseminated files to high-quality files used for digital television. There are two especially relevant encoding schemes specified by the MPEG-4 standard:
    • MPEG-4 Part 2 – This encoding has two different profiles: Simple Profile, for low-quality files, and Advanced Simple Profile (ASP), which forms the basis for other commonly used codecs such as DivX and Xvid.
    • MPEG-4 Part 10 – Also known as H.264 or MPEG-4 AVC. This is becoming one of the most widely used encoding schemes, due to its ability to create smaller file sizes while retaining image quality. It is used in Blu-ray, HD DVD, and digital television, and is compatible with a wide variety of media players and software platforms.
  • QuickTime H.264 – Apple’s proprietary encoding standard based on H.264.
  • VC-1 – A Microsoft-developed alternative to the H.264 codec, which has since been adopted as a standard by the Society of Motion Picture and Television Engineers (SMPTE). Also part of the Blu-ray and HD DVD standard.
  • WMV – Microsoft’s proprietary Windows Media Video codec
  • Xvid, DivX, FFmpeg, 3ivx – These codecs are all based on MPEG-4 ASP. Xvid is an open standard.  

One important thing to understand is that the codec is not included in the digital video file itself. The user’s playback software must include a codec which, if it is not the same as, must at least be compatible with the codec used to create the file. If the user does not have the proper codec installed on his/her computer, the video may not be viewable.

The Speed at Which the Images Follow Each Other During Playback

Frame Rate & Field Rate

“Frame rate” refers to the number of frames displayed per second during playback. 30 frames/second is the standard for almost all non-film video and television material, whether it is analog or digital. (Film has a rate of 24 frames/second.) “Field rate” refers to the number of fields displayed per second during playback. For progressive-scanned video, the field rate and frame rate are the same. However, for interlaced video, the field rate is actually 2x the frame rate (because it takes two fields to make a complete frame).

Data Rate (Bit Rate)

This value is a measurement of the amount of information delivered over a given period of time. It is usually expressed in kilobytes per second (kB/s or kBps) or megabytes per second (MB/s or MBps). The larger the data rate, the higher the quality of the video stream will be, but higher data rates also require more bandwidth and processing power to render the video stream during playback.

The Overall Length of the Program

Duration

This value represents the length of time of the entire video stream. It is expressed in hours:minutes:seconds. For example, a video with a length of 1 hour, 5 minutes, and 26 seconds would be written as “01:05:26”.
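Duration and data rate together determine how large a video stream will be. The sketch below converts an hours:minutes:seconds value to seconds and multiplies it by a data rate in kilobytes per second. The function names are hypothetical, and the 256 kB/s figure is simply the low end of the presentation range given later in this guide.

```python
def duration_to_seconds(duration: str) -> int:
    """Convert an 'HH:MM:SS' duration string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in duration.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def stream_size_megabytes(duration: str, data_rate_kilobytes_per_s: float) -> float:
    """Approximate stream size: data rate multiplied by running time."""
    return duration_to_seconds(duration) * data_rate_kilobytes_per_s / 1000


print(duration_to_seconds("01:05:26"))                   # running time in seconds
print(f"{stream_size_megabytes('01:05:26', 256):.0f} MB")  # size at 256 kB/s
```

For the example duration above, 01:05:26 is 3,926 seconds, so a 256 kB/s stream comes to roughly 1 GB.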

The Accompanying Audio

Please see the Audio portion of this document for a detailed discussion of digital audio principles and practices.

The Size of the Files Created

Digital video production results in much larger files than those created by the digitization of static images or audio. While server space seems to come ever more cheaply these days, the ability to store massive amounts of data will continue to be an issue for most libraries. Thus, almost all of the underlying technological concepts of digital video need to be understood in the context of the acute need for compression and reducing file sizes.

Estimated file sizes for 1 hour of SD digital video material:

  • Uncompressed: 70-100 GB
  • Lossless compression: 25-50 GB
  • Lossy compression: 10-20 GB
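The uncompressed figure can be approximated from the concepts covered earlier: image size, bits stored per pixel, and frame rate. The sketch below is a back-of-the-envelope estimate for the picture data only, ignoring audio and file-format overhead. The function name is hypothetical, and the bits-per-pixel values are assumptions: 10-bit samples with 4:2:2 subsampling average 20 bits per pixel, while 10-bit 4:4:4 gives 30.

```python
def uncompressed_gigabytes_per_hour(width: int, height: int,
                                    bits_per_pixel: float,
                                    frames_per_second: float) -> float:
    """Picture data only: pixels x bits per pixel x frames per second, for one hour."""
    bits_per_second = width * height * bits_per_pixel * frames_per_second
    bytes_per_hour = bits_per_second / 8 * 3600
    return bytes_per_hour / 1e9


# SD frame at 30 frames/second under two assumed sampling choices.
for label, bits in (("10-bit 4:2:2", 20), ("10-bit 4:4:4", 30)):
    size = uncompressed_gigabytes_per_hour(640, 480, bits, 30)
    print(f"{label}: about {size:.0f} GB per hour")
```

Under these assumptions the 4:2:2 case works out to about 83 GB per hour, within the uncompressed range listed above, while full 4:4:4 sampling comes to about 124 GB.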

Commonly used digital video file formats include:

Delivering Video on the Web

Many issues can come into play when choosing how to deliver digitized video over the web. Concerns such as budget and internal technical support, scope of project, and intended audience can guide choices on download method and file player. Decisions on the formats, presentation, and method of access will be shaped by the goals of the project and the technical capabilities of both your institution and users.

For example, if the sole purpose of a project is archival, then high quality video files can be stored on media at a local level and simply retrieved manually when needed. In this case a high quality, lossless video file may be created, without worrying about file size or the technical needs of the end user. However, if that video needs to be viewed remotely, whether by a small local community or a larger public community, then access over the web can provide a better solution. A smaller, more compressed file can be created to provide quick and easy access to the content of the file over the web.

No matter what file format is chosen for the project, decisions will have to be made on how that video will be distributed over the web (either as a download or through video streaming) and what video player options will be made available.

Methods of delivery:

  • Direct Download - The simplest and quickest method for distributing video over the web is through direct download. In this instance, the video file is placed on a server and access is provided through a simple hyperlink on a website. The file is then downloaded by the user and saved to their computer. Playback will occur when the file has been completely downloaded. Any required media players will need to be installed on the user’s computer.

    One benefit of direct download is reduced set-up time and costs. Existing web server infrastructure can be used for the storing of files, and no special coding is needed to provide access, making it attractive for projects with little technical support. Once a file is downloaded, the user can choose their preferred media player for playback of the file, putting a higher onus on the user but also allowing them more freedom when viewing the files.
  • Progressive Download - One step up from direct download is progressive download. Progressive download also uses a standard web server for storage and access. Unlike direct download, progressive download does not need the entire file to download before it can be played. Once the first portion of the file has been downloaded, or buffered, by the media player, the file will begin to play, providing almost continuous playback. This allows for a user experience that is similar to streaming, without such advanced set-up or support requirements.

    In both cases the entire video file will eventually be downloaded to the local user’s machine, which means the file can be copied and distributed without your control (a problem if there are security or copyright concerns with the material). Also, the user cannot move around the chronological timeline of the video until the entire file has been downloaded, even with a progressive download.

  • Streaming Video - In contrast to direct or progressive downloads, streaming video allows quicker access to video content and does not require files to be downloaded onto the user’s computer. Streaming video begins playing almost immediately, providing much quicker access to content. No files are actually downloaded by the user, helping to prevent unauthorized distribution of content. While a specialized streaming video server will be needed, this added hardware cost is offset by reduced bandwidth needs, increased playback options, and greater information on the use of material.

    Streaming media (or streaming video) servers are specifically designed to be used for transmitting video or audio over the web. These servers can be set up locally or through a hosted solution provider. A streaming server will take into account the format, size, and structure of a video file, as well as the capabilities of the user’s connection and media player, and feed only those portions of the file that are needed for viewing at a given time. This decreases the amount of bandwidth needed to view the file, reduces the load on the server, and maximizes playback quality. Administrative functionality allows for closer examination of the frequency, time, and conditions of video viewing.

    A major benefit of streaming video is greater control over the playback of the file for the user. Because the streaming server is only distributing the portion of the file needed for viewing, the user can move forward and backward along the timeline. Streaming servers can also better handle larger traffic loads, and manage multiple users accessing files simultaneously. Because no file is downloaded to the user’s computer, it cannot easily be saved locally and copied without permission. While software programs that can capture streaming video do exist, the fact that no file exists locally adds a level of security not available with downloaded files.

File Formats and Players for Dissemination

Most standard video file formats can be used for both download and streaming purposes, and file size and quality are also fairly consistent across formats. Therefore, compatibility and consistency of playback across platforms become the major factors in deciding what format to use.

 

Popular File Formats: Player & Operating System Compatibility

File Type    Flash (Win/Mac)    Windows Media Player (Windows)    Windows Media Player (Mac)    QuickTime (Win/Mac)    Real Player (Win/Mac)
.fla    X
.swf    X    X
.wmv    X    X
.avi    X    X
.mov    X
.mpg    X    X    X
.mp4    X
.ram    X

Third party software programs can increase the compatibility of some file formats across platforms, but these programs often require advanced skills by users. It is best to provide an option that will reach the widest possible audience, while retaining the look and feel required by your project.

No matter what the format, video files can be presented to the user through either a direct link or an embedded player. A direct link is the simplest method, leading to a direct download or perhaps a progressive download in a new window, but offers little control over how the user will experience the video. Embedding a file places the link to the file and the player directly onto the web page, which not only allows for immediate playback of the file within the context of the site, page, or metadata, but also gives a certain amount of control on the size of the player, what controls are made available, if the file will automatically start playing, and if it will repeat on a loop. While the code for embedding a file on a page will include information needed to control the playback of the file, the player must still be installed on the user’s computer for playback to work. A combination of these methods can be used to provide both the immediate embedded playback, as well as options to download the file in alternative formats.

For widest support across platforms, Flash is a popular choice. It comes preinstalled on most systems, the player is embedded on the page (so that playback is not dependent on the local machine), and it offers more options for playback and display. File size is slightly larger than that of similar-quality files in other formats. Files must also be converted to Flash from other formats, adding a step to the process. Files are designed for online playback, rather than direct download. Also, some newer smart phones and tablet computers will not play Flash video.

Flash has two basic file formats: the .flv file which is the Flash video file, and the .swf which is the web version that is presented to the end user. The .swf file serves as a container, which includes the information needed to present and play the .flv file. The .swf file is the one that will actually be linked to or embedded on your web page.

The next most compatible format is MPEG (.mpg), which can be viewed in players native to Windows and Macintosh computers, including Windows Media Player, QuickTime, and RealPlayer. The newer MPEG-4 (.mp4) format, however, is not supported on Windows Media Player.


Windows Media files (.wmv) can be played using Windows Media Player, which comes installed on all Windows computers. However, Microsoft has ceased development of the Media Player for Macintosh, so the only player available for that platform is two versions old and not actively supported. Use of this format could therefore effectively exclude a large user base.


QuickTime files (.mov) can be played using QuickTime, which is natively installed for Macintosh computers and available for download on Windows and other systems. RealMedia files (.ram) can be played using RealMedia Player, which is available across platforms.

Guidelines for Video Preservation

  • Uncompressed YCbCr or JPEG2000 lossless encoding (codec)
  • 640 x 480 resolution (assuming 4:3 original aspect ratio)
  • 30 bit sample size
  • Progressive scanning
  • 30 MBps data rate
  • MXF (.mxf) file format

Uncompressed video requires an enormous amount of storage space, but an uncompressed master is crucial to preserving the integrity of the content over the long term.

Guidelines for Born-Digital Video Preservation

The integrity of the file formats submitted should be evaluated and the material should be migrated to a more preservation-friendly format if appropriate. Since the material has already been digitized, there is no benefit to upsampling (increasing the resolution, sample size, or data rate).

When creating archival copies from born-digital video content, maintain the same frame rate, resolution, data rate, and sampling scheme as the original.

Use these recommendations for the best prospects for longevity.

  • Progressive scanning
  • MXF (.mxf) file format

Guidelines for Video Presentation

Compression must be used in order to reduce the file size to make online access feasible. Adobe Flash is recommended for streaming video, while MPEG-4 is recommended for files that will be downloaded by users.

  • MPEG-4 AVC (H.264) encoding (codec)
  • 320 x 240 resolution (assuming 4:3 original aspect ratio)
  • 256-600 kBps data rate
  • Adobe Flash (.flv) or MPEG-4 (.mp4) file format