Standards Based Archiving


File Formats



PDF/A-1:  Documents and Embedded Media

Definition: Adobe's Portable Document Format (PDF) is a document encoding system that maintains the original content, structure, and appearance of a document across many computer platforms and communications networks. As of July 1, 2008, PDF has been recognized as an international standard, ISO 32000-1:2008. PDF/A-1, the archival profile named in the heading above, is published separately as ISO 19005-1:2005.

Reference: Federal Agencies Digitization Guidelines Initiative


TIFF:  Scanned Images

Definition: TIFF is a file format for storing and exchanging bitonal, grayscale, and color images. The term is now used without reference to the original source phrase: Tag or Tagged Image File Format. TIFF combines raster image data with a flexible tagged field structure for metadata, and is recognized as an international standard for archiving images.

Reference: Federal Agencies Digitization Guidelines Initiative
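
The tagged field structure mentioned in the TIFF definition can be illustrated with a short sketch. The Python below reads the 8-byte header of a classic (non-BigTIFF) file and lists the tag IDs in its first image file directory (IFD). The function name `read_first_ifd_tags` and the hand-built sample bytes are illustrative assumptions, not part of any guideline.

```python
import struct

def read_first_ifd_tags(data: bytes) -> list[int]:
    """Return the tag IDs stored in the first IFD of a classic TIFF byte string."""
    # Bytes 0-1 give the byte order: b"II" = little-endian, b"MM" = big-endian.
    order = {b"II": "<", b"MM": ">"}.get(data[:2])
    if order is None:
        raise ValueError("not a TIFF file: bad byte-order mark")
    # Bytes 2-3 hold the magic number 42; bytes 4-7 hold the offset of the first IFD.
    magic, ifd_offset = struct.unpack(order + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF file: magic number is not 42")
    # The IFD starts with a 2-byte entry count, followed by 12-byte entries.
    (count,) = struct.unpack(order + "H", data[ifd_offset:ifd_offset + 2])
    tags = []
    for i in range(count):
        start = ifd_offset + 2 + 12 * i
        # Each entry: tag ID (2 bytes), field type (2), value count (4), value or offset (4).
        tag, _ftype, _n, _value = struct.unpack(order + "HHII", data[start:start + 12])
        tags.append(tag)
    return tags

# Hypothetical two-entry fragment: tag 256 is ImageWidth, tag 257 is ImageLength.
sample = (
    b"II" + struct.pack("<HI", 42, 8)        # header: byte order, magic, IFD offset
    + struct.pack("<H", 2)                   # number of IFD entries
    + struct.pack("<HHII", 256, 3, 1, 100)   # ImageWidth, type SHORT, one value
    + struct.pack("<HHII", 257, 3, 1, 50)    # ImageLength, type SHORT, one value
    + struct.pack("<I", 0)                   # offset of next IFD; 0 means none
)
print(read_first_ifd_tags(sample))  # [256, 257]
```

A production workflow would more likely rely on an established imaging library; the sketch only shows why the format is called "tagged".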



Benchmarks



Equipment:  Native Resolution

Definition: An imaging system's ability to resolve finely spaced detail. The level of spatial detail that can be resolved in an image. The maximum spatial frequency of any utility for an imaging system (limiting resolution).

Reference: Federal Agencies Digitization Guidelines Initiative
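
As a rough worked example of "maximum spatial frequency", the sketch below converts a sampling rate in pixels per inch into the Nyquist frequency in cycles per millimeter. Treating the Nyquist frequency as an upper bound on limiting resolution is an assumption brought in for illustration; the definition above does not prescribe a formula, and a real system's measured limiting resolution is normally lower. The function name is hypothetical.

```python
MM_PER_INCH = 25.4

def nyquist_limit_cycles_per_mm(ppi: float) -> float:
    """Upper bound on resolvable spatial frequency for a given sampling rate."""
    if ppi <= 0:
        raise ValueError("sampling rate must be positive")
    pixels_per_mm = ppi / MM_PER_INCH
    # One full cycle (a light/dark line pair) needs at least two pixels.
    return pixels_per_mm / 2

print(round(nyquist_limit_cycles_per_mm(400), 2))  # 7.87
```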


Capture Fidelity:  Target Standards

Definition: Targets are generally classified into two types: a device target and an object target. The device target is imaged and evaluated in isolation, while the object target is included with the analog object being digitized, preserving a record of known values and characteristics along with that object.

Reference: Federal Agencies Digitization Guidelines Initiative



Preservation



Archival Master Files:  Scanned Images

Definition: A file that represents the best copy produced by a digitizing organization, with best defined as meeting the objectives of a particular project or program. In many cases, the best copies are called preservation master files rather than archival master files. In some cases, best-copy files are defined in qualitative terms, as part of an approach that requires all archival or preservation master files to meet the same specifications, without regard to objectives that vary by category.

Archival master files represent digital content that the organization intends to maintain for the long term without loss of essential features. For analog originals, archival master files are produced by reformatting to high standards. The digital formats for archival master files are selected in terms of sustainability factors. For born digital originals, if the existing format is deemed sustainable for the long term, the files are retained as-is and called archival masters. If the existing format is deemed unsuitable for long-term retention, e.g., it is an obsolescent format, then the content may be transcoded and the new version retained as the archival master.

Reference: Federal Agencies Digitization Guidelines Initiative
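
The born-digital rule in the definition above (retain a sustainable format as-is, otherwise transcode) can be sketched as a small decision function. The format allow-list, the class, and the function names below are hypothetical; an actual program would draw its list from its own review of sustainability factors.

```python
from dataclasses import dataclass

# Hypothetical allow-list, for illustration only.
SUSTAINABLE_FORMATS = frozenset({"tiff", "pdf/a-1"})

@dataclass
class BornDigitalFile:
    name: str
    fmt: str  # format label as recorded by the organization

def archival_master_action(f: BornDigitalFile,
                           sustainable: frozenset = SUSTAINABLE_FORMATS) -> str:
    """Decide how a born-digital file becomes an archival master."""
    if f.fmt.lower() in sustainable:
        # Format deemed sustainable: the file itself is the archival master.
        return "retain-as-is"
    # Format deemed unsuitable for long-term retention: the transcoded copy is the master.
    return "transcode"

print(archival_master_action(BornDigitalFile("report.pdf", "PDF/A-1")))  # retain-as-is
print(archival_master_action(BornDigitalFile("memo.wpd", "WordPerfect")))  # transcode
```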


Technical Metadata:  Digital Research Collections

Definition: Preservation metadata is a term strongly associated with the PREMIS (Preservation Metadata: Implementation Strategies) working group. The group defined a core preservation metadata set, supported by a data dictionary, and identified strategies for encoding, storing, and managing this metadata. Many data elements that are important for preservation are found in other categories, especially those classified as administrative.

Reference: Federal Agencies Digitization Guidelines Initiative



Storage



Repository Files:  Academic Research Collections

Definition: Master files of all types have permanent value and should be managed in an appropriate environment, e.g., one in which read and write executions are minimized and other preservation-oriented data management actions are applied. In contrast, derivative files are frequently accessed by end-users and are typically stored in systems that see repeated read and write executions.


Reference: Federal Agencies Digitization Guidelines Initiative
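
The definition above does not list the "preservation-oriented data management actions" it has in mind. One common example, offered here as an assumption rather than as something the source specifies, is a fixity check: record a checksum when a master file is stored and compare it again later. The function names are hypothetical, and an in-memory stream stands in for a real file.

```python
import hashlib
import io

def sha256_of(stream, chunk_size: int = 1 << 20) -> str:
    """Hash a binary stream in chunks so large master files need not fit in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()

def is_unchanged(stream, recorded_digest: str) -> bool:
    """Compare the stream's current checksum with the one recorded at ingest."""
    return sha256_of(stream) == recorded_digest

master = io.BytesIO(b"example master file bytes")  # stand-in for open(path, "rb")
recorded = sha256_of(master)
master.seek(0)  # rewind before reading the stream a second time
print(is_unchanged(master, recorded))  # True
```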


Repository Metadata:  Academic Research Collections

Definition: Structured information about an analog or digital object, a component of an object, or a coherent collection of objects. Metadata describing digital content may be embedded within a single file, incorporated within the "packaging" that is associated with a group of files (e.g., METS), placed in a related external file (e.g., XMP sidecar file), or held in a system external to the digital file (e.g., a database) to which the digital file or files are linked via a unique key or association.

Reference: Federal Agencies Digitization Guidelines Initiative
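
The last option in the definition above, metadata held in an external database and linked to files through a unique key, might look like the following minimal sketch. The table layout, the key `vt-0001`, the file path, and the field values are all hypothetical.

```python
import sqlite3

def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one file record and two metadata fields."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE files (file_id TEXT PRIMARY KEY, path TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE metadata ("
        " file_id TEXT NOT NULL REFERENCES files(file_id),"
        " field TEXT NOT NULL,"
        " value TEXT NOT NULL)"
    )
    # The unique key 'vt-0001' is what ties the metadata rows to the digital file.
    conn.execute("INSERT INTO files VALUES (?, ?)", ("vt-0001", "masters/vt-0001.tif"))
    conn.executemany(
        "INSERT INTO metadata VALUES (?, ?, ?)",
        [("vt-0001", "title", "Sample scanned page"),
         ("vt-0001", "creator", "Example Digitization Unit")],
    )
    return conn

def metadata_for(conn: sqlite3.Connection, file_id: str) -> dict:
    """Look up every metadata field recorded for one file key."""
    rows = conn.execute(
        "SELECT field, value FROM metadata WHERE file_id = ?", (file_id,)
    )
    return dict(rows.fetchall())

conn = build_demo_db()
print(metadata_for(conn, "vt-0001"))
```

Embedded or sidecar metadata travels with the file, while the database approach keeps the master file untouched; the sketch illustrates only the latter.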





© 2009 Virginia Tech, Information Technology Digital Archiving Initiative