In the previous sections, examples of data structures or metadata element sets have been introduced. "The choice of terms or words (data values) and the selection, organization, and formatting of those words (data content) are two other types of standards that must be used in conjunction with an agreed-upon data structure" (CCO Introduction, 2005). This part provides resources related to data values and data content.
6.1 Using controlled vocabularies for named entities, time and space, and subjects
Almost all metadata standards require or recommend the use of controlled vocabularies for some elements.
Examples from Dublin Core 1.1:
Element Name: Subject Label: Subject and Keywords Definition: A topic of the content of the resource. Comment: Typically, Subject will be expressed as keywords, key phrases or classification codes that describe a topic of the resource. Recommended best practice is to select a value from a controlled vocabulary or formal classification scheme.
Element Name: Type Label: Resource Type Definition: The nature or genre of the content of the resource. Comment: Type includes terms describing general categories, functions, genres, or aggregation levels for content. Recommended best practice is to select a value from a controlled vocabulary (for example, the DCMI Type Vocabulary [ DCT1 ]). To describe the physical or digital manifestation of the resource, use the FORMAT element.
Element Name: date Label: Date Definition: A point or period of time associated with an event in the lifecycle of the resource. Comment: Date may be used to express temporal information at any level of granularity. Recommended best practice is to use an encoding scheme, such as the W3CDTF profile of ISO 8601.
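The Date recommendation above can be checked mechanically. The following is a minimal sketch, assuming the six granularities defined in the W3C Date and Time Formats note (year, year-month, full date, and three date-time forms with a mandatory time zone designator). The function name is_w3cdtf is hypothetical, and the pattern checks syntax only: it does not verify that a day exists in the given month.

```python
import re

# Syntax-only pattern for W3CDTF values. A time part, when present,
# must carry a time zone designator (Z or +hh:mm / -hh:mm).
_W3CDTF = re.compile(
    r"\d{4}"                                   # YYYY
    r"(-(0[1-9]|1[0-2])"                       # -MM
    r"(-(0[1-9]|[12]\d|3[01])"                 # -DD
    r"(T([01]\d|2[0-3]):[0-5]\d"               # Thh:mm
    r"(:[0-5]\d(\.\d+)?)?"                     # optional :ss and fraction
    r"(Z|[+-]([01]\d|2[0-3]):[0-5]\d)"         # time zone designator
    r")?)?)?"
)


def is_w3cdtf(value: str) -> bool:
    """Return True if value is syntactically a W3CDTF date or date-time."""
    return _W3CDTF.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["1997", "1997-07-16", "1997-07-16T19:20+01:00", "16/07/1997"]:
        print(candidate, is_w3cdtf(candidate))
```

A check like this is best run when a record is created, so that free-text dates such as "16/07/1997" are caught before they reach a shared repository.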
Examples from VRA Core 4.0:
Description: A defined style, historical period, group, school, dynasty, movement, etc. whose characteristics are represented in the Work or Image. Cultural and regional terms may be combined with style and period terms for display purposes.
Data Values (controlled): recommend AAT.
Description: Terms or phrases that describe, identify, or interpret the Work or Image and what it depicts or expresses. These may include generic terms that describe the work and the elements that it comprises, terms that identify particular people, geographic places, narrative and iconographic themes, or terms that refer to broader concepts or interpretations. Use of a Subject Authority, from which these data values may be derived, is recommended.
Data Values: recommend AAT, TGN, LCTGM, ICONCLASS, LCSH, LCNAF, Sears Subject Headings.
Note: LCTGM = Thesaurus for Graphic Materials (by LC); LCNAF = LC Name Authority File. ICONCLASS is a classification designed for art and iconography.
Definition: Identifies the specific type of WORK, COLLECTION, or IMAGE being described in the record.
Data Values for WORK AND COLLECTION type (controlled vocabulary): recommend AAT. Recommended data values for IMAGE WORK type (AAT terms): black-and-white transparency, color transparency (for slides or positive transparencies), black-and-white negative, color negative (for negative transparencies), photographic print (for photographic prints), or digital image.
The following controlled vocabularies are usually recommended by the metadata standards or best practice guides. For a more complete list, see another source.
DCMI Type Vocabulary
A general, cross-domain list of approved terms that may be used as values for the Resource Type element to identify the genre of a resource.
[MIME] Internet Media Types
May be used as values for the Format element.
RFC 4646 Tags for Identifying Languages
ISO 3166 - Codes for the representation of names of countries.
ISO 639 Codes for the representation of names of languages
Provides two sets of language codes for the representation of names of languages.
- Part 1: Alpha-2 code includes identifiers for major languages of the world for which specialized terminologies have been developed. (http://xml.coverpages.org/iso639a.html)
- Part 2: Alpha-3 code (http://www.loc.gov/standards/iso639-2/php/English_list.php)
W3C Date and Time Formats (W3C-DTF)
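As an illustration of how a short controlled list such as the DCMI Type Vocabulary might be enforced when records are created, the sketch below maps a cataloger's input to the canonical term or rejects it. The twelve terms are those of the DCMI Type Vocabulary; the function name normalize_dcmi_type and the matching rule (ignore case and spaces) are assumptions made for this example, not part of any DCMI specification.

```python
from typing import Optional

# The twelve terms of the DCMI Type Vocabulary.
DCMI_TYPES = frozenset({
    "Collection", "Dataset", "Event", "Image", "InteractiveResource",
    "MovingImage", "PhysicalObject", "Service", "Software", "Sound",
    "StillImage", "Text",
})

# Lookup table from a case- and space-insensitive key to the canonical term.
_BY_KEY = {term.lower(): term for term in DCMI_TYPES}


def normalize_dcmi_type(value: str) -> Optional[str]:
    """Return the canonical DCMI Type term for value, or None if it is not in the list."""
    key = "".join(value.split()).lower()
    return _BY_KEY.get(key)


if __name__ == "__main__":
    print(normalize_dcmi_type("still image"))
    print(normalize_dcmi_type("Photograph"))
```

Returning None for an unknown value, rather than guessing a near match, keeps the decision with the cataloger, which is usually what a controlled vocabulary is meant to ensure.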
Note: Only a small number of thesauri and classification schemes are listed below. They are frequently mentioned in metadata standards. A more complete list is available online.
Library of Congress Subject Headings (LCSH)
FAST (Faceted Application of Subject Terminology) Authority File
An adaptation of the Library of Congress Subject Headings (LCSH) with a simplified syntax. The headings have been built into FAST authority records and are accessible through the OCLC FAST Test Databases Web site.
Medical Subject Headings (MeSH)
MeSH consists of sets of terms naming descriptors in a hierarchical structure that permits searching at various levels of specificity. There are 22,568 descriptors in MeSH. In addition to these headings, there are more than 139,000 headings called Supplementary Concept Records (formerly Supplementary Chemical Records) within a separate thesaurus.
Art and Architecture Thesaurus (AAT)
The AAT is a structured vocabulary of more than 133,000 terms, descriptions, bibliographic citations, and other information relating to fine art, architecture, decorative arts, archival materials, and material culture.
Linked Data SPARQL Endpoint: http://vocab.getty.edu/queries#Finding_Subjects
Library of Congress Thesauri
Thesaurus for the Global Legal Information Network (GLIN)
Now used for The Global Legal Information Network's multi-national database of legislation, this thesaurus has been under continuous development since 1950.
Legislative Indexing Vocabulary (LIV)
The thesaurus was developed by the Congressional Research Service for use with legislative and public policy material.
Thesaurus for Graphic Materials
The Thesaurus for Graphic Materials is a tool for indexing visual materials by subject and by genre/format. The thesaurus includes more than 7,000 subject terms and 650 genre/format terms to index types of photographs, prints, design drawings, ephemera, and other pictures.
Dewey Decimal Classification (DDC)
Website about DDC http://www.oclc.org/dewey/default.htm
Library of Congress Classification
Available as Linked Data: http://id.loc.gov/authorities/classification.html
Universal Decimal Classification (UDC)
Website about UDC http://www.udcc.org/index.php/site/page?view=about
UDC Summary http://www.udcc.org/udcsummary/php/index.php
The ACM Computing Classification System [2012 Version], Association for Computing Machinery
VIAF (The Virtual International Authority File)
The VIAF combines multiple name authority files into a single OCLC-hosted name authority service. It draws on contributions from 34 agencies in 29 countries (as of July 2014).
The Union List of Artist Names (ULAN)
The ULAN is a structured vocabulary containing more than 225,000 names and biographical and bibliographic information about artists and architects, including a wealth of variant names, pseudonyms, and language variants.
Linked Data SPARQL Endpoint: http://vocab.getty.edu/queries#ULAN-Specific_Queries
The Getty Thesaurus of Geographic Names (TGN)
The TGN is a structured, world-coverage vocabulary of 1.3 million names, including vernacular and historical names, coordinates, place types, and descriptive notes, focusing on places important for the study of art and architecture.
Linked Data SPARQL Endpoint: http://vocab.getty.edu/queries#TGN-Specific_Queries
LC Name Authority File = Anglo-American Authority File (AAAF)
Includes several million name authority records for personal, corporate, meeting, and geographic names.
Linked Data version: http://id.loc.gov/authorities/names.html
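Several of the vocabularies above (AAT, ULAN, TGN) are published as Linked Data with SPARQL access. The sketch below only builds a query and a request URL; it does not contact any server. It rests on assumptions that should be checked against the Getty documentation before use: that the service accepts standard SPARQL protocol GET requests at http://vocab.getty.edu/sparql, that the scheme URIs are as listed in SCHEMES, and that concepts carry skos:inScheme and skos:prefLabel. The function names are hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint and scheme URIs; verify against the Getty documentation.
GETTY_SPARQL_ENDPOINT = "http://vocab.getty.edu/sparql"
SCHEMES = {
    "AAT": "http://vocab.getty.edu/aat/",
    "ULAN": "http://vocab.getty.edu/ulan/",
    "TGN": "http://vocab.getty.edu/tgn/",
}


def build_label_query(vocabulary: str, label: str, limit: int = 10) -> str:
    """Build a SPARQL query for concepts whose preferred label equals label (case-insensitive)."""
    scheme = SCHEMES[vocabulary]
    # Escape backslashes and double quotes so the label is a safe string literal.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"').lower()
    return (
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
        "SELECT ?concept ?label WHERE {\n"
        f"  ?concept skos:inScheme <{scheme}> ;\n"
        "           skos:prefLabel ?label .\n"
        f'  FILTER (lcase(str(?label)) = "{escaped}")\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_request_url(query: str, endpoint: str = GETTY_SPARQL_ENDPOINT) -> str:
    """Encode the query as a SPARQL protocol GET request URL."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    query = build_label_query("AAT", "watercolors")
    print(query)
    print(build_request_url(query))
```

An exact-match filter over all preferred labels may be slow on a large vocabulary; the example queries linked from each entry above are the better starting point for production use.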
The best practice guides prepared by various communities and projects usually provide detailed guidelines regarding how to assign values when creating metadata records. The following are examples of standards for data content to be followed in particular communities.
Cataloging Cultural Objects (CCO): A Guide to Describing Cultural Works and Their Images
Provides guidelines for selecting, ordering, and formatting data used to populate elements in a catalogue record, in order to advance the increasing move toward shared cataloguing and contribute to improved documentation and access to cultural heritage information.
Guidelines for Encoding Bibliographic Citation Information in Dublin Core Metadata
It deals primarily with bibliographic citations for a resource within its own metadata, but some guidelines for describing references to other resources are also indicated.
Describing Archives: A Content Standard (DACS) Society of American Archivists (SAA) http://www.archivists.org/governance/standards/dacs.asp
An output-neutral set of rules for describing archives, personal papers, and manuscript collections that can be applied to all material types.
DLESE Best Practices
Lists the metadata field definitions, cataloging best practices, and vocabulary explanations for the metadata fields in the DLESE Cataloging System.
LODE-BD Recommendations 2.0
A report on how to select appropriate encoding strategies for producing Linked Open Data (LOD)-enabled bibliographic data.
Guidelines released by AIMS of the Food and Agriculture Organization (FAO) of the United Nations.
RDA: Resource Description and Access
RDA Toolkit: http://www.rdatoolkit.org/
A comprehensive set of guidelines and instructions on resource description and access covering all types of content and media.
Many metadata standards include best practice guides in their specifications; see Part 4 for the list of standards.