Archiving for Preservation

Sustaining Digital Works

   Over the past twenty-five years, computer technology has been consistent in one way: it continually changes. Operating systems, software applications, typefaces, and processors have all moved on from their original configurations to offer greater capability and speed. Unfortunately, digital file formats seldom move forward intact with these advancements, so works created today carry no guarantee of compatibility with future software and hardware. Unless we begin efforts to preserve our digital works today, they could be lost to future generations.

   Any approach to preserving digital works starts with a definition of what those works are. Digital files are made up of bits and bytes, represented as characters and organized so that they can be interpreted as a functional operation, a visual display, or audible sound. Seen this way, all digital works take on the characteristics of documents. Michael Buckland (1997) has suggested that, because of this, defining a digital document is an elusive task, and that any satisfactory definition will rest more on functionality than on the traditional criteria of form, format, and medium. Preserving the underlying functionality is therefore key to determining how digital files carry forward into the future.

   In 1992, for example, Adobe introduced PDF (Portable Document Format) as a way to transport and share digital print documents across platforms, independent of hardware, font availability, or application software. The only requirement is a common reader that interprets the document information so that design elements, such as images and text, display according to their original intent. While this format helps preserve functionality for PDF documents across platforms, it does not address compatibility with future versions of PDF. That issue was addressed in 2005 with the introduction of PDF/A. As an archival format, PDF/A accounts for the anticipated changes to formatting, typefaces, and software compatibility that often cause digital works to fail, and it has since become a recognized ISO (International Organization for Standardization) standard for digital documents. PDF/A does not, however, account for links within a document that connect information with information. Critics regard this inability to link data to data as a major oversight in the archival standard, and it makes the format unpopular with some.
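
   As a rough illustration of how an archive might screen incoming documents, the sketch below looks for the PDF/A identification entries (pdfaid:part and pdfaid:conformance) that a PDF/A file declares in its XMP metadata. It assumes the metadata packet is stored uncompressed, and it only reports what a file claims about itself; confirming real conformance requires a dedicated validator. The function name is hypothetical.

```python
import re

# XMP can serialize a property as an element (<pdfaid:part>1</pdfaid:part>)
# or as an attribute (pdfaid:part="1"); both forms are matched here.
_PART = re.compile(rb"pdfaid:part(?:>|=[\"'])\s*(\d)")
_CONFORMANCE = re.compile(rb"pdfaid:conformance(?:>|=[\"'])\s*([A-Za-z])")


def declared_pdfa_level(data: bytes):
    """Return the PDF/A level a file declares, e.g. 'PDF/A-1b', or None.

    Heuristic only: this reads the declaration and does not validate it.
    """
    part = _PART.search(data)
    if part is None:
        return None
    level = part.group(1).decode("ascii")
    conformance = _CONFORMANCE.search(data)
    if conformance is not None:
        level += conformance.group(1).decode("ascii").lower()
    return "PDF/A-" + level
```

   In practice the bytes would come from reading the file in binary mode, for example declared_pdfa_level(open(path, "rb").read()). A result of None means only that no declaration was found.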

   The problem that connectivity raises for sustaining file operation lies in where each link originates and in the digital resources it points to. Servers are often replaced or taken off-line, and when that happens the functions that depend on links to a particular server can disappear. In many instances, such as digital libraries and repositories, persistent links are created so that information can be migrated without breaking references to it. But support for persistent links alone is no guarantee that files can be maintained and continue to function over time.
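
   The idea behind a persistent link can be sketched as a small lookup table: the published identifier stays fixed while the location behind it is updated when content moves. This is a minimal sketch, not a description of any particular repository system; the class name, identifier, and URLs are hypothetical.

```python
class PersistentLinkResolver:
    """Maps stable identifiers to the current location of a resource."""

    def __init__(self):
        self._locations = {}

    def register(self, identifier, url):
        """Publish an identifier that readers can cite indefinitely."""
        self._locations[identifier] = url

    def migrate(self, identifier, new_url):
        """Point an existing identifier at the resource's new home."""
        if identifier not in self._locations:
            raise KeyError(f"unknown identifier: {identifier}")
        self._locations[identifier] = new_url

    def resolve(self, identifier):
        """Return the current location, or None if it was never registered."""
        return self._locations.get(identifier)


resolver = PersistentLinkResolver()
resolver.register("collection/0042", "https://old-server.example.org/files/0042.pdf")

# The old server is retired; the citation does not change, only the target.
resolver.migrate("collection/0042", "https://repository.example.org/items/0042")
```

   Resolving the identifier shows only where the file now lives. Whether the file at that location can still be opened and read is a separate question, which is why persistent links are necessary but not sufficient.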

   While most preservation efforts do provide a permanent location that maintains the integrity of the original format, they often overlook the need to migrate the original files to current standards, or simply lack the resources to address preservation holistically. This is why true preservation has to extend beyond making copies of files and storing those copies in multiple places. File preservation must be paired with storage solutions and must incorporate fallback strategies in case of failure, whether of the digital works themselves or of the systems that preserve them.
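
   One common way to pair multiple copies with a fallback strategy is a fixity check: record a checksum when a work is deposited, compare every stored copy against it periodically, and restore any damaged copy from one that still matches. The sketch below assumes SHA-256 as the checksum and local file paths for the copies; the function names are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_copies(expected_digest, copies):
    """Split copies into those that match the recorded digest and those that do not."""
    intact, damaged = [], []
    for copy in map(Path, copies):
        ok = copy.is_file() and sha256_of(copy) == expected_digest
        (intact if ok else damaged).append(copy)
    return intact, damaged


def repair_copies(expected_digest, copies):
    """Fallback: overwrite damaged or missing copies from the first intact one."""
    intact, damaged = audit_copies(expected_digest, copies)
    if not intact:
        raise RuntimeError("no intact copy left to restore from")
    for copy in damaged:
        copy.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(intact[0], copy)
    return damaged
```

   A check like this guards only against loss or corruption of the stored bits. It does nothing about format obsolescence, which still has to be handled by migration to current standards.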

   The challenge of sustaining digital works will not be solved overnight, and the time to address preservation is during the creation process. Recognizing the role standards play is an important first step toward retaining some level of function for the digital works we create today. We must also begin to plan for the role standards will play in the future. Any effort to address preservation must incorporate multiple solutions for both archival storage and migration to newer technologies, and preservation must be treated as a priority for important digital works. Otherwise, any attempt to address the functional aspects and data characteristics of digital works will run into myriad conflicts over how the files were produced and over the dependencies inherent in the diversity of operating systems and software applications in use.

   Without preservation, the fact remains that files you create today may no longer be usable in the future.

Discovery Commons

Working in partnership with University Libraries, the research repository provides a secure environment for viewing and exploring online resources, along with coordinated support for digitizing, image development, and repository site development. Repository sites allow researchers to find information readily and to create and preserve quick collections of pertinent data.

The research repository offers preservation support by requiring the use of ISO-standardized file formats for images, documents, and audio files. Consideration of other types of data is based on the planned usage of the data and on projected access by future researchers. Please contact us if you have questions about preservation or about the services we provide in support of Discovery Commons.

– Virginia Tech, Discovery Commons Initiative