bibnum.bnf.fr
containerMD: implementation guidelines and examples

Recommended implementations

containerMD has been designed for the three possible uses listed below. Whenever it is used in conjunction with a more general metadata scheme, recommended best practice is to declare generic information in that scheme and to use containerMD only for fields specific to container formats.

Use in conjunction with PREMIS

Recommended best practice is to use PREMIS semantic units for all information that is not specific to container files, and containerMD for all container-specific fields. In practice, this means using a ‹premis:object› of type "file" to express core information about the container file, and using containerMD inside a ‹premis:objectCharacteristicsExtension› to express container-specific fields. These specific fields are ‹entriesInformation› (non-verbose mode) and the extension containers. We intend to add a mapping table between containerMD fields and PREMIS semantic units very soon. Here is an example of PREMIS used for an ARC file:
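The nesting described above can be sketched in a few lines of Python. This is not the BnF example, only a minimal illustration of where containerMD sits inside PREMIS. The namespace URIs, the function name `build_premis_object` and the placement of ‹entriesInformation› directly under ‹containerMD› are assumptions made for illustration and should be checked against the schema versions you actually use. Mandatory PREMIS units such as the object identifier are left out for brevity.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs; verify against the PREMIS and containerMD schemas in use.
PREMIS_NS = "info:lc/xmlns/premis-v2"
CMD_NS = "http://bibnum.bnf.fr/ns/containerMD-v1"
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

# Register readable prefixes so the serialized XML uses premis:/containerMD:/xsi:.
ET.register_namespace("premis", PREMIS_NS)
ET.register_namespace("containerMD", CMD_NS)
ET.register_namespace("xsi", XSI_NS)


def build_premis_object() -> ET.Element:
    """Build a premis:object of type "file" that wraps a containerMD section.

    Hypothetical helper: it only shows the nesting, not a valid record.
    """
    obj = ET.Element(f"{{{PREMIS_NS}}}object", {f"{{{XSI_NS}}}type": "file"})
    characteristics = ET.SubElement(obj, f"{{{PREMIS_NS}}}objectCharacteristics")
    extension = ET.SubElement(
        characteristics, f"{{{PREMIS_NS}}}objectCharacteristicsExtension"
    )
    # Container-specific fields go here; everything generic stays in PREMIS.
    container_md = ET.SubElement(extension, f"{{{CMD_NS}}}containerMD")
    # Simplified placement of the non-verbose summary element (assumption).
    ET.SubElement(container_md, f"{{{CMD_NS}}}entriesInformation")
    return obj


if __name__ == "__main__":
    print(ET.tostring(build_premis_object(), encoding="unicode"))
```

Running the script prints a single ‹premis:object› element whose ‹premis:objectCharacteristicsExtension› holds the ‹containerMD:containerMD› section, which is the split between generic and container-specific metadata that the recommendation asks for.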
Use in conjunction with METS

Recommended best practice is to use containerMD as an extension schema in a ‹mets:techMD› element. It is also recommended to use containerMD in conjunction with ‹premis:object›, according to the best practices mentioned above. This can be implemented in either of two ways: as parallel ‹mets:techMD› sections referred to from the same ‹mets:file›, or as a single ‹mets:techMD› containing a ‹premis:object› element, which itself contains a ‹containerMD› section in a ‹premis:objectCharacteristicsExtension› element.

For verbose descriptions, that is, descriptions of the container and of every content file, you only need containerMD for its container-specific extensions. METS and PREMIS can manage the whole-part relationship, and PREMIS can manage all core metadata about each file (container or contained). For non-verbose descriptions, containerMD ‹entriesInformation› can also be used for its aggregation mechanisms.

Here is an example of a METS file describing a gzipped ARC file. Most of the choices comply with the "Using PREMIS with METS" recommendations, with the following additional choices:
Use as a standalone file

This approach is not recommended unless containerMD is used as the output format of a tool. This is the only implementation where it is relevant to store provenance information inside containerMD (‹assessmentInformation› element).
containerMD for ARC files

ARC files contain ARC records, which in turn contain files in their payload, generally harvested from the web. Users may want to describe the harvested files, the ARC records, or both. For the harvested files, they can use either ‹entriesInformation› or ‹entry› elements. For the ARC records, they can use either ‹ARCEntries› or ‹ARCEntry›, placed in an ‹entriesExtension› or ‹entryExtension› element. In both cases the choice depends on whether the content of the container file is described in non-verbose or verbose mode. For example, if an ARC file contains an HTML file, the generic ‹entry› element will express information about the HTML file itself; by contrast, the ‹ARCEntry› extension will express information about the ARC record, e.g. the protocol and response code for the harvested HTML file.

containerMD for WARC files

The use of containerMD for WARC files follows the same rules as for ARC files, with the following additional principles:
Also note that the verbose mode was not investigated for WARC files; the ‹WARCEntry› element is therefore not completely defined. Please feel free to contact us if you would like to elaborate this element further for your own needs.

containerMD for disk image files

At Harvard Libraries' request, the containerMD metadata schema was extended in 2020 to address the need to describe a disk image file, and in particular the media from which it was created (media type, manufacturer, serial number and capacity) and its file system. All the required information is defined in a ‹diskImageContainer› element, which contains ‹diskImageType› (values: "logical", "physical" or "unspecified"), ‹sourceMedia› and ‹fileSystems› elements.

Last updated: September 6th, 2020