AudioMD Metadata Schema
Kathryn M. Ballard
Emporia State University


Part I: AudioMD Metadata Schema Explained

The Library of Congress Motion Picture, Broadcasting, and Recorded Sound Division developed audioMD in 2002 as part of its Audio-Visual Prototyping Project. The National Audio-Visual Conservation Center was under construction at the time, and the library was seeking ways to document and prepare materials for their impending relocation. Audio materials are unique, varied, and difficult to document accurately with traditional metadata schemas. AudioMD was designed to mitigate some of these challenges and to provide a schema for collecting relevant information from audio sources.

AudioMD can be used to create standalone metadata records, but it is primarily used as an extension schema within the Metadata Encoding and Transmission Standard (METS) and Preservation Metadata: Implementation Strategies (PREMIS). AudioMD allows for the detailing of structural, administrative, and descriptive elements. Because it works well with several other Extensible Markup Language (XML) schemas, audioMD provides a good degree of interoperability. The wide range of audio materials in existence requires that it have a significant level of flexibility as well. However, audioMD was developed specifically to describe the characteristics of audio materials, and because of this narrow focus it lacks the extensibility of some other schemas.

AudioMD's schema is designed specifically to capture the technical aspects of audio materials. Every element of audioMD contains a portion of the technical information necessary to make sense of an audio object. Although it has been in existence for over a decade, audioMD is considered an interim measure and will continue to be used until a final schema is adopted.

The audioMD schema consists of four top-level elements, each composed of lower-level elements. The first top-level element is file data, which consists of information relevant to digital sources. The file data element provides information such as data rate, format location, format name, file size, format version, bits per sample, and sample frequency. The second top-level element is physical description, which is primarily concerned with the physical attributes and locations of materials. The physical description element provides information about the shape, dimensions, condition, physical format, and storage location of an audio object. This is where the majority of information about all analog resources, and about some digital resources, is located. Audio information, the third top-level element, describes the recording characteristics of audio sources. Its lower-level elements contain information that is shared by recorded audio sources regardless of format: the duration of a recording, the number of audio channels it contains, the arrangement of those channels, and the sound processing of the channels (e.g., mono, stereo, surround sound). The fourth top-level element is calibration information, which provides information about the types and locations of calibration data. Calibration information makes it possible for preservationists to ensure that an audio source resembles as closely as possible the sound of the event it recorded.
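The four-part structure described above can be sketched as a small data model. This is only an illustrative Python sketch: the class and field names are hypothetical labels derived from the prose in this section, not the element names defined in the audioMD schema itself, and the sample value "open-reel tape" is invented for the example.

```python
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class FileData:
    """Information relevant to digital sources."""
    data_rate: Optional[str] = None
    format_location: Optional[str] = None
    format_name: Optional[str] = None
    file_size: Optional[str] = None
    format_version: Optional[str] = None
    bits_per_sample: Optional[str] = None
    sample_frequency: Optional[str] = None


@dataclass
class PhysicalDescription:
    """Physical attributes and location of an audio object."""
    shape: Optional[str] = None
    dimensions: Optional[str] = None
    condition: Optional[str] = None
    physical_format: Optional[str] = None
    storage_location: Optional[str] = None


@dataclass
class AudioInformation:
    """Recording characteristics shared by all recorded audio."""
    duration: Optional[str] = None
    num_channels: Optional[str] = None
    channel_arrangement: Optional[str] = None
    sound_field: Optional[str] = None


@dataclass
class CalibrationInformation:
    """Type and location of calibration data."""
    calibration_type: Optional[str] = None
    calibration_location: Optional[str] = None


@dataclass
class AudioRecord:
    """One record holding the four top-level groups (hypothetical names)."""
    file_data: FileData = field(default_factory=FileData)
    physical_description: PhysicalDescription = field(default_factory=PhysicalDescription)
    audio_information: AudioInformation = field(default_factory=AudioInformation)
    calibration_information: CalibrationInformation = field(default_factory=CalibrationInformation)


def populated_sections(record: AudioRecord) -> List[str]:
    """Return the names of the top-level groups that hold at least one value."""
    return [
        name
        for name, values in asdict(record).items()
        if any(value is not None for value in values.values())
    ]


# An analog item described only by its physical attributes (invented sample value).
tape = AudioRecord(
    physical_description=PhysicalDescription(physical_format="open-reel tape")
)
print(populated_sections(tape))  # ['physical_description']
```

An analog resource would typically populate only the physical description and audio information groups, while a born-digital file would rely mostly on file data, which is consistent with the division of labor described above.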

Part II: AudioMD Metadata Schema Example

An example of the types of metadata that an audioMD record would contain can be seen below in the technical information for a digital audio file of Gene Autry's 1942 recording of the song "Deep in the Heart of Texas."

Element Name                               Song Information
audioBlockSize (file size of the audio)    5.7 MB
dataRate (expressed in kbps)               256 kbps
formatName                                 QuickTime/MPEG-4/Motion JPEG 2000
samplingFrequency                          44.1 kHz
numChannels                                2
codecName                                  AAC (Advanced Audio Coding)
codecQuality                               Lossy
duration                                   00:02.43
bitsPerSample                              16
soundField                                 Stereo
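The values in this example could be serialized as XML along the following lines. This is a minimal sketch using Python's standard library, not a schema-valid audioMD record: the leaf element names come from the example above, but the root name `audioMD`, the wrapper names `fileData` and `audioInfo`, the assignment of the codec elements to the file data group, and the omission of any XML namespace are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Values taken from the example above, grouped under the file data element.
# Placing the codec elements here is an assumption.
FILE_DATA = {
    "audioBlockSize": "5.7 MB",
    "dataRate": "256 kbps",
    "formatName": "QuickTime/MPEG-4/Motion JPEG 2000",
    "samplingFrequency": "44.1 kHz",
    "codecName": "AAC (Advanced Audio Coding)",
    "codecQuality": "Lossy",
    "bitsPerSample": "16",
}

# Values that describe the recording itself, grouped under audio information.
AUDIO_INFO = {
    "numChannels": "2",
    "duration": "00:02.43",
    "soundField": "Stereo",
}


def build_record() -> ET.Element:
    """Build an XML tree with one wrapper element per top-level group."""
    root = ET.Element("audioMD")  # hypothetical root name, no namespace
    for wrapper_name, values in (("fileData", FILE_DATA), ("audioInfo", AUDIO_INFO)):
        wrapper = ET.SubElement(root, wrapper_name)  # hypothetical wrapper names
        for element_name, text in values.items():
            ET.SubElement(wrapper, element_name).text = text
    return root


record = build_record()
print(ET.tostring(record, encoding="unicode"))
```

A production record would need to be checked against the current audioMD schema documentation, since the exact nesting, namespace, and value formats are defined there rather than in this sketch.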

There were several complications in providing a link to a data dictionary for audioMD. The dictionary available from the Library of Congress was last updated in 2003 and bears little resemblance to audioMD in its present form. A data dictionary for the Metadata-Capture database contained newer definitions but did not reflect the recent changes in element names. Perhaps the most effective approach is to use the current schema documentation and then search the Library of Congress website for terms that remain unclear.

