Instantiations, Components, And Essence Tracks

15 May 2009

As posted to PBCore Resources

PBCore instantiation records work well for documenting renditions of an asset that are composed of a single tape or file, but when an instantiation requires multiple tapes, reels or files, what should the protocol be? How can PBCore be used to efficiently document a rendition of an asset that is composed of multiple objects, each with its own set of technical metadata? Disclaimer: this post is based on my own personal experience in using PBCore 1.2.1 and the conclusions I have drawn from it.

Within a PBCore asset record every element may be applied multiple times (one asset may have as many titles, contributors, and instantiations as one desires); however, from the perspective of the instantiation (which is a single rendition of an asset) much of the descriptive information may only occur once. For instance, an instantiation may only have one formatDigital (i.e. one mime_type), one formatGenerations, and one formatFileSize. The PBCore instantiation element appears to be designed both to document a single item or a single file and to document “all the details on how the asset is actualized” (quote from the PBCore 1.2.1 XSD). However, in some cases multiple files or multiple items are necessary in order to document how an instantiation actualizes the asset. Here are three situational examples:

– an asset describing a musical album may have an instantiation that is one CD, while the digitized version of that CD comprises 10 digital files, each representing a track. The 10 digital items together represent the same asset as the single-item CD,

– an asset describing a film exists in a collection as two instantiations: a three-reel 35mm film print and a single Digibeta (this is similar to the example that Mary Miller describes at http://www.pbcoreresources.org/article/dealing_with_multi_part_instantiations/),

– an asset documenting a television episode contains two instantiations: one being a single Digibeta tape and another consisting of two elementary stream files (an .m2v video stream and a .wav audio stream).

All three of these examples refer to audiovisual material where the number of components needed to represent an asset changes over the reformatting process. In some types of reformatting the number goes from more to fewer (like example 2, the film transfer) and in some cases from fewer to more (like example 1, the digitization of a CD).

If PBCore instantiations are understood to only represent single-item instantiations then the individual digitized tracks of a CD or the individual reels of a film print would need to be documented in their own asset records, where one asset represents the CD and then 10 other assets represent the individual digitized tracks. This is obviously less efficient than treating the set of digitized tracks as one instantiation and the CD as another instantiation of the same asset. Another option could be to zip or tar the 10 tracks into one file, but this requirement for effective PBCore description has its own disadvantages. Alternatively a directory that contains the 10 file-based tracks could be defined by the instantiation.

Best practices for documenting multi-object instantiations are not clear. With the m2v and wav elementary streams, the two files need to work together to represent the asset, but they have their own unique values for ‘formatDigital’, ‘formatFileSize’, ‘formatDataRate’ and possibly their own ‘formatLocation’. All of these values may only occur once per instantiation. For the m2v and wav elementary streams to be defined as a single instantiation some options are:

– the two files could be moved into a directory or folder, which would serve the role of an audiovisual wrapper. In this case the formatDigital would be ‘application/x-not-regular-file’ (referring to the directory), the formatFileSize could be the directory size, etc.

– or the data from the individual files could be shoehorned into the instantiation fields meant for individual files, thus formatDigital would be “video/mpeg audio/x-wav” and formatFileSize could be the sum of the two file sizes.

– or the m2v and wav files could be either zipped or tarred into a single file or multiplexed into an audiovisual wrapper, so that the collection is then represented by a single file (the analog equivalent would be splicing together film reels in order that the metadata more cleanly fits into an instantiation record).
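To make the second option above concrete, here is a minimal Python sketch of what “shoehorning” two files into one instantiation’s fields might look like. The `FileInfo` class, the `shoehorn` function and the file sizes are hypothetical and only for illustration; they are not part of PBCore.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class FileInfo:
    """Per-file technical metadata (hypothetical helper, not a PBCore element)."""
    mime_type: str
    size_bytes: int


def shoehorn(files: List[FileInfo]) -> Dict[str, Union[str, int]]:
    """Collapse several files into the single-valued instantiation fields."""
    return {
        # MIME types are joined into one space-separated string
        "formatDigital": " ".join(f.mime_type for f in files),
        # file sizes are summed into one number
        "formatFileSize": sum(f.size_bytes for f in files),
    }


# Illustrative sizes only; real values would come from the files themselves.
streams = [
    FileInfo("video/mpeg", 1_500_000_000),
    FileInfo("audio/x-wav", 330_000_000),
]
combined = shoehorn(streams)
print(combined)
```

Note that once the values are combined, the record no longer says which size or MIME type belongs to which file, which is the loss of precision this option implies.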

None of these options is ideal for describing a complex object: the resulting technical documentation may become less precise, the implementation of instantiation may become less standardized, or the metadata process may burden collection management. This is the same sort of challenge that occurred in pre-1.2 versions of PBCore, where discrete track-level metadata values had to be concatenated and labeled in single fields like formatDataRate = “Total 1930 kilobits/sec; Video 1700 kilobits/sec; Audio 230 kilobits/sec”. This procedure was documented by pbcore.org at http://www.pbcore.org/PBCore/formatDataRate.html, which stated that “the pbcoreInstantiation container should not be repeated in order to express a video data rate and an associated audio data rate. The two combined are part of a single instantiation for an asset”.
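As a rough illustration of why concatenated fields are awkward, any system that wants the individual rates back has to parse the string. The sketch below assumes the “Label number unit” pattern of the formatDataRate example above, separated by semicolons; the `parse_data_rate` function is hypothetical, and real records may not follow the pattern this consistently.

```python
import re
from typing import Dict, Tuple


def parse_data_rate(value: str) -> Dict[str, Tuple[int, str]]:
    """Split a concatenated data-rate string into {label: (number, unit)}."""
    rates = {}
    for part in value.split(";"):
        # Expect "<label> <integer> <unit>", e.g. "Video 1700 kilobits/sec"
        match = re.match(r"\s*(\w+)\s+(\d+)\s+(.+?)\s*$", part)
        if match:
            label, number, unit = match.groups()
            rates[label] = (int(number), unit)
    return rates


legacy = "Total 1930 kilobits/sec; Video 1700 kilobits/sec; Audio 230 kilobits/sec"
rates = parse_data_rate(legacy)
print(rates)
```

Parts that do not match the assumed pattern are silently skipped here, which is exactly the kind of fragility that discrete track-level fields avoid.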

I have two suggestions regarding this potential challenge. The first would be to document best practices that use PBCore 1.2.1 as is to describe these complex objects in a way that fits the various examples above. The second suggestion would involve a modification to PBCore: integrating an additional element between instantiation and essenceTrack, perhaps called ‘component’. Typically PBCore would document single-component instantiations; however, in cases where a single instantiation is made up of multiple tapes, reels or files, the instantiation would have as many component records as needed, each with its own technical metadata.

In this arrangement some of the values currently attached to instantiation would move to the component level. Whereas PBCore 1.2.1 is

instantiation { {formatIdentifier, formatIdentifierSource } dateCreated, dateIssued, formatPhysical, formatDigital, formatLocation, formatMediaType, formatGenerations, formatFileSize, formatTimeStart, formatDurations, formatColors, formatTracks, formatChannelConfiguration, language, alternativeModes, {essenceTrack see below } {dateAvailableStart, dateAvailableEnd } { annotation } }

essenceTrack {essenceTrackType, essenceTrackIdentifier, essenceTrackIdentifierSource, essenceTrackStandard, essenceTrackEncoding, essenceTrackDataRate, essenceTrackTimeStart, essenceTrackDuration, essenceTrackBitDepth, essenceTrackSamplingRate, essenceTrackFrameSize, essenceTrackAspectRatio, essenceTrackFrameRate, essenceTrackLanguage, essenceTrackAnnotation }

the incorporation of a component level of data could look like

instantiation { assemblyMode, formatMediaType, formatGenerations, formatFileSize, formatColors, formatChannelConfiguration, language, alternativeModes, {dateAvailableStart, dateAvailableEnd } { annotation } }

component { {componentIdentifier, componentIdentifierSource } dateCreated, dateIssued, componentPhysical, componentDigital, componentLocation, componentTimeStart, componentDuration, componentTracks, {essenceTrack see below } }

essenceTrack {essenceTrackType, essenceTrackIdentifier, essenceTrackIdentifierSource, essenceTrackStandard, essenceTrackEncoding, essenceTrackDataRate, essenceTrackTimeStart, essenceTrackDuration, essenceTrackBitDepth, essenceTrackSamplingRate, essenceTrackFrameSize, essenceTrackAspectRatio, essenceTrackFrameRate, essenceTrackLanguage, essenceTrackAnnotation }
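To show how the three levels would nest, here is a minimal Python sketch of the proposed arrangement applied to the m2v and wav example. Only a few of the fields listed above are included, the attribute names are snake_case renderings of the proposed element names, and the identifiers and values are hypothetical; this is a sketch of the idea, not a PBCore schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class EssenceTrack:
    essence_track_type: str                        # essenceTrackType
    essence_track_encoding: Optional[str] = None   # essenceTrackEncoding


@dataclass
class Component:
    component_identifier: str                      # proposed componentIdentifier
    component_digital: Optional[str] = None        # proposed componentDigital
    component_location: Optional[str] = None       # proposed componentLocation
    essence_tracks: List[EssenceTrack] = field(default_factory=list)


@dataclass
class Instantiation:
    assembly_mode: str                             # proposed assemblyMode
    format_media_type: str                         # formatMediaType
    components: List[Component] = field(default_factory=list)


# One instantiation made of two elementary stream files (illustrative values).
episode = Instantiation(
    assembly_mode="multiplexion",
    format_media_type="Moving Image",
    components=[
        Component("episode.m2v", component_digital="video/mpeg",
                  essence_tracks=[EssenceTrack("video", "MPEG-2")]),
        Component("episode.wav", component_digital="audio/x-wav",
                  essence_tracks=[EssenceTrack("audio", "PCM")]),
    ],
)

for component in episode.components:
    print(component.component_identifier, component.component_digital)
```

Each file keeps its own MIME type and tracks, while values shared by the whole rendition stay on the instantiation.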

In this draft I added a field called ‘assemblyMode’. Something like assemblyMode would be needed to document how the components are related to each other. In the case of the digitized CD, the components would be assembled through concatenation and played back-to-back, so assemblyMode could equal “concatenation”. With the m2v and wav elementary streams the assemblyMode would be “multiplexion”, since the components need to be multiplexed for playback. In the case of “concatenation” the total duration of the instantiation would equal the sum of the durations of the components, whereas if the assemblyMode is “multiplexion” then the instantiation’s duration is roughly equal to the duration of each component, so the value is relevant to how other pieces of metadata are determined.
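The effect of assemblyMode on duration could be sketched as follows. The function name and the sample durations (in seconds) are hypothetical, and taking the longest component as the duration of a multiplexed instantiation is my own assumption for “roughly equal”.

```python
from typing import List


def instantiation_duration(assembly_mode: str, component_durations: List[float]) -> float:
    """Derive an instantiation's duration (seconds) from its components."""
    if assembly_mode == "concatenation":
        # Components play back-to-back, so their durations add up.
        return sum(component_durations)
    if assembly_mode == "multiplexion":
        # Components play simultaneously; assume the longest one sets the length.
        return max(component_durations)
    raise ValueError(f"unknown assemblyMode: {assembly_mode!r}")


# Three digitized CD tracks played in sequence (illustrative durations).
cd_total = instantiation_duration("concatenation", [180.0, 240.0, 200.0])

# A video stream and an audio stream of nearly equal length.
episode_total = instantiation_duration("multiplexion", [1320.0, 1319.5])

print(cd_total, episode_total)
```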

Since the instantiation should contain “all the details on how the asset is actualized” (as stated by the PBCore 1.2.1 XSD), adding an additional element level to accommodate multi-tape or multi-object instantiations would help achieve this goal with cleaner and more descriptive data. I’m interested to hear if this is an issue other PBCore users are thinking about and if there are any easier solutions that I’m missing.

David Rice

AudioVisual Preservation Solutions