
Appendix B. Value Encoding Schemes and Content Standards

[Note: links are validated and updated frequently. Red star* = updated* or added**. Updated 2025-02-02]

1. Standardized vocabularies

DCMI Type Vocabulary

* https://www.dublincore.org/specifications/dublin-core/dcmi-terms/#section-7
A general, cross-domain list of Dublin Core Metadata Initiative (DCMI) approved terms that may be used as values for the resource type element to identify the genre of a resource.

MIME Internet Media Types

Originally called MIME (Multipurpose Internet Mail Extensions) types. https://www.iana.org/assignments/media-types/
Specifies identifiers for media types for file formats on the Internet.  
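As a rough illustration of how media type identifiers are used in practice, the following sketch maps a file name to a media type with Python's standard `mimetypes` module. The helper names `media_type_for` and `split_media_type` are hypothetical, and the fallback value is a common convention rather than something prescribed by this appendix.

```python
import mimetypes


def media_type_for(filename: str) -> str:
    """Guess a media type from the file extension (hypothetical helper)."""
    guessed, _encoding = mimetypes.guess_type(filename)
    # Fall back to a generic binary type when the extension is unknown.
    return guessed or "application/octet-stream"


def split_media_type(media_type: str) -> tuple:
    """Split 'type/subtype' into its two parts."""
    top_level, _, subtype = media_type.partition("/")
    return top_level, subtype


if __name__ == "__main__":
    print(media_type_for("report.pdf"))
    print(split_media_type("image/png"))
```

The guessed value depends on the extension only, so it is a hint about the format, not a verification of the file's content.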

**RFC 5646 Tags for Identifying Languages

This document describes the structure, content, construction, and semantics of language tags for use in cases where it is desirable to indicate the language used in an information object. It also describes how to register values for use in language tags and the creation of user-defined extensions for private interchange.
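To make the tag structure more concrete, here is a minimal sketch that splits a simple language tag into language, script, and region subtags. It is deliberately simplified: it ignores extended language subtags, variants, extensions, and private-use subtags, and it does not check subtags against the registry. The function name `parse_language_tag` is hypothetical.

```python
import re

# Simplified shape: language[-script][-region]. This covers only a subset of
# the full RFC 5646 grammar and is an assumption made for illustration.
_SIMPLE_TAG = re.compile(
    r"(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?"
)


def parse_language_tag(tag: str):
    """Return the subtags of a simple language tag, or None if it does not fit."""
    match = _SIMPLE_TAG.fullmatch(tag)
    if match is None:
        return None
    parts = match.groupdict()
    # Apply the conventional casing: language lower, script title, region upper.
    return {
        "language": parts["language"].lower(),
        "script": parts["script"].title() if parts["script"] else None,
        "region": parts["region"].upper() if parts["region"] else None,
    }


if __name__ == "__main__":
    print(parse_language_tag("zh-hant-tw"))
    print(parse_language_tag("es-419"))
```

Tags are case-insensitive, so the casing applied here is a readability convention only.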

ISO 3166 - English country names and code elements

Provides standard numeric, 2-letter, and 3-letter alphabetic codes for countries and areas of special sovereignty. This standard family includes Part 1: Country Codes and Part 2: Country Subdivision Code.
* https://xml.coverpages.org/country3166.html
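The three code forms can be pictured as alternative keys for the same entry. The sketch below uses a three-row sample table, not the full standard, and the names `Country`, `SAMPLE`, and `find_country` are hypothetical. Numeric codes are kept as strings so that leading zeros survive.

```python
from typing import NamedTuple, Optional


class Country(NamedTuple):
    name: str
    alpha2: str
    alpha3: str
    numeric: str  # stored as text to preserve leading zeros, e.g. "076"


# A tiny illustrative sample; a real application would load the full code list.
SAMPLE = [
    Country("France", "FR", "FRA", "250"),
    Country("Japan", "JP", "JPN", "392"),
    Country("Brazil", "BR", "BRA", "076"),
]


def find_country(code: str) -> Optional[Country]:
    """Look up a sample entry by its 2-letter, 3-letter, or numeric code."""
    key = code.strip().upper()
    for country in SAMPLE:
        if key in (country.alpha2, country.alpha3, country.numeric):
            return country
    return None


if __name__ == "__main__":
    print(find_country("jp"))
    print(find_country("076"))
```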

ISO 639 Codes for the representation of names of languages

Provides two sets of language codes for the representation of names of languages.
Part 1: Alpha-2 code (https://xml.coverpages.org/iso639a.html) includes identifiers for major languages of the world for which specialized terminologies have been developed.
Part 2: Alpha-3 code (https://www.loc.gov/standards/iso639-2/php/English_list.php) contains identifiers for all of the languages represented in Part 1 and includes additional languages that have significant bodies of literature. It also provides identifiers for groups of languages, such as language families. Taken together, these indirectly cover most languages of the world.

Language Metadata Table (LMT)

The LMT is IETF BCP 47 compliant, but provides a fixed set of language codes to facilitate interoperability.
* LMT V6.0 (January 4, 2025) includes *364 language codes and display values, with more being researched.

W3C Date and Time Formats (W3C-DTF)

Provides encoding rules for dates and times. As a profile of ISO 8601 (Data elements and interchange formats -- Information interchange -- Representation of dates and times), it defines a restricted range of formats and expresses the year as four digits in all cases.

*Revised in 2019. ISO 8601:2019. Date and time — Representations for information interchange

Information: https://www.iso.org/standard/70907.html
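The restricted range of formats in W3C-DTF can be sketched as a validator. The pattern below reflects the six granularities described in the W3C note (year; year and month; complete date; date plus hours and minutes; plus seconds; plus a decimal fraction of a second), with a time zone designator required whenever a time is present. The function name `is_w3cdtf` is hypothetical, and this is an illustrative check rather than a reference implementation.

```python
import re
from datetime import date

# Nested optional groups mirror the increasing granularity of the profile.
_W3CDTF = re.compile(
    r"(?P<year>\d{4})"
    r"(?:-(?P<month>\d{2})"
    r"(?:-(?P<day>\d{2})"
    r"(?:T(?P<hour>\d{2}):(?P<minute>\d{2})"
    r"(?::(?P<second>\d{2})(?:\.\d+)?)?"
    r"(?P<tzd>Z|[+-]\d{2}:\d{2})"
    r")?)?)?"
)


def is_w3cdtf(value: str) -> bool:
    """Return True if `value` has a W3C-DTF shape and plausible field ranges."""
    match = _W3CDTF.fullmatch(value)
    if match is None:
        return False
    year = int(match["year"])
    month = int(match["month"] or 1)
    day = int(match["day"] or 1)
    try:
        date(year, month, day)  # rejects month 13, February 30, and so on
    except ValueError:
        return False
    if match["hour"] is not None:
        if int(match["hour"]) > 23 or int(match["minute"]) > 59:
            return False
        if match["second"] is not None and int(match["second"]) > 59:
            return False
    return True


if __name__ == "__main__":
    for sample in ("1997", "1997-07-16", "1997-07-16T19:20+01:00", "97-07-16"):
        print(sample, is_w3cdtf(sample))
```

Because the range check relies on Python's `date`, the year 0000 is rejected here, which is a limitation of this sketch and not a statement about the profile.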

2. Subject Headings Lists and Thesauri

Library of Congress Subject Headings (LCSH)

A comprehensive list of subject headings in print. Subject authority headings can be accessed through Library of Congress Authorities at https://authorities.loc.gov/

*Available as Linked Data: https://id.loc.gov

FAST (Faceted Application of Subject Terminology) Authority File

An adaptation of the Library of Congress Subject Headings (LCSH) with a simplified syntax. The headings have been built into FAST authority records and are accessible through the OCLC FAST Test Databases Web site at *https://fast.oclc.org/searchfast/

Art and Architecture Thesaurus (AAT)

* https://www.getty.edu/research/tools/vocabularies/aat/index.html
Los Angeles: J. Paul Getty Trust, Vocabulary Program, 2000-.
A controlled vocabulary for fine art, architecture, decorative arts, archival materials, and material culture, intended for indexing, cataloging, and searching, and as a research tool. Its facets are conceptually organized in a scheme that proceeds from abstract concepts to concrete, physical artifacts.

*Linked Data SPARQL Endpoint: https://vocab.getty.edu/queries#Finding_Subjects

Medical Subject Headings (MeSH)

A comprehensive controlled vocabulary produced by the National Library of Medicine (NLM) and used for indexing, cataloging, and searching for biomedical and health-related information and documents. MeSH descriptors are organized in 16 categories (sometimes referred to as "trees").

Thesaurus for Graphic Materials (TGM)

* https://www.loc.gov/pictures/collection/tgm/

Developed by the Library of Congress for indexing visual materials.
*Available as Linked Data: https://id.loc.gov

Information Standards in Heritage (FISH) *Terminologies

* https://heritage-standards.org.uk/fish-vocabularies/
* was at: http://thesaurus.english-heritage.org.uk/
A list of *20+ thesauri

A set of thesauri developed by the National Monuments Record Centre, English Heritage project (officially known as Historic Buildings and Monuments Commission for England). The thesauri include Monument Types, Archaeological Objects, Building Materials, Defence of Britain, Components, Maritime Place Names, Maritime Craft Type, Maritime Cargo, Evidence Thesaurus, Archaeological Sciences, Thred Thesaurus, and Historic Aircraft Type.
Developed by English Heritage, a Non-Departmental Public Body established by the National Heritage Act 1983.

Link to the Forum on Information Standards in Heritage (FISH), and the INSCRIPTION terminology standard.

INSPEC Thesaurus

A thesaurus of the INSPEC database for scientific and technical literature in physics, electrical engineering, electronics, communications, control engineering, computers and computing, and information technology. Produced by the Institution of Engineering and Technology (IET).
Updates:* *

* * EIGE’s Gender Equality Glossary and Thesaurus

A specialized terminology tool focusing on the area of gender equality

* * European Language Social Science Thesaurus (ELSST)

The European Language Social Science Thesaurus (ELSST) is a broad-based, multilingual thesaurus for the social sciences. It is currently available in more than 10 languages.

* * Homosaurus

The Homosaurus is an international linked data vocabulary of Lesbian, Gay, Bisexual, Transgender, and Queer (LGBTQ) terms. This vocabulary is intended to function as a companion to broad subject term vocabularies, such as the LCSH.



AGROVOC

AGROVOC is a multilingual controlled vocabulary designed to cover concepts and terminology in the areas of interest of the Food and Agriculture Organization of the United Nations (FAO). It helps data to be classified homogeneously, facilitating interoperability and reuse.
AGROVOC Thesaurus online: https://agrovoc.fao.org/browse/agrovoc/en/
* * SPARQL endpoint (with templates): https://agrovoc.fao.org/sparql
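For readers unfamiliar with SPARQL endpoints, the sketch below shows how a label search could be turned into a request URL for the AGROVOC endpoint listed above, using the `query` parameter of the SPARQL 1.1 Protocol. No request is actually sent. The helper names are hypothetical, and the query assumes that preferred labels are available as plain `skos:prefLabel` literals, which should be checked against the endpoint's own templates.

```python
from urllib.parse import urlencode

# Endpoint URL as given in the AGROVOC entry above.
AGROVOC_ENDPOINT = "https://agrovoc.fao.org/sparql"


def label_search_query(term: str, lang: str = "en", limit: int = 10) -> str:
    """Build a query for concepts whose preferred label contains `term`.

    Assumption: the vocabulary exposes skos:prefLabel literals with language tags.
    """
    # Escape backslashes and double quotes so the term is a safe string literal.
    safe = term.lower().replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
        "SELECT ?concept ?label WHERE {\n"
        "  ?concept skos:prefLabel ?label .\n"
        f'  FILTER(LANG(?label) = "{lang}" && CONTAINS(LCASE(STR(?label)), "{safe}"))\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_query_url(endpoint: str, query: str) -> str:
    """Encode the query as a GET request URL (SPARQL 1.1 Protocol style)."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    print(build_query_url(AGROVOC_ENDPOINT, label_search_query("maize")))
```

The same two helpers would apply to any other SPARQL endpoint by passing a different endpoint URL, provided its data follows a comparable SKOS model.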

* UNESCO Thesaurus

The UNESCO Thesaurus is a controlled and structured list of terms used in subject analysis and retrieval of documents and publications in the fields of education, culture, natural sciences, social and human sciences, communication and information. Continuously enriched and updated, its multidisciplinary terminology reflects the evolution of UNESCO's programmes and activities.

UNESCO Thesaurus online: https://vocabularies.unesco.org/browser/thesaurus/en/
SPARQL endpoint (with query examples): https://vocabularies.unesco.org/sparql-form/

3. Classification schemes


Dewey Decimal Classification (DDC)

**Info: https://www.oclc.org/en/dewey.html
*DDC 23 Summaries (Download pdf from https://www.oclc.org/content/dam/oclc/dewey/ddc23-summaries.pdf) *****DDC 23 New Features

Library of Congress Classification

*Info and Outline: https://www.loc.gov/catdir/cpso/lcco/
**Available as Linked Data: https://id.loc.gov/authorities/classification.html
URL: http://id.loc.gov/authorities/classification

Universal Decimal Classification

*Info: https://udcc.org/
*UDC Summary (UDCS) https://www.udcsummary.info
**The UDC Summary (udcS) provides a selection of around 2,400 classes from the whole scheme which comprises more than 69,000 entries.


The ACM Computing Classification System (CCS) [*2012 Version]

** https://dl.acm.org/ccs
A subject classification system for computer science devised by the Association for Computing Machinery (ACM).
Intro: * https://www.acm.org/publications/class-2012

National Library of Medicine (NLM) Classification

*was at: http://wwwcf.nlm.nih.gov/class/
A library classification system covering the fields of medicine and preclinical basic sciences. It uses schedules QS-QZ and W-WZ, which are permanently excluded from the Library of Congress Classification schedules, and is intended to be used alongside the LC schedules. The Index to the NLM Classification consists primarily of Medical Subject Headings (MeSH) concepts used in cataloging.
**Outline: https://classification.nlm.nih.gov/outline
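Because the NLM schedules occupy letter ranges that the LC schedules leave unused, the two can be told apart by the leading class letters alone. The sketch below reads "QS-QZ and W-WZ" literally as letter ranges. The function name `is_nlm_class` is hypothetical, and the check says nothing about whether a given class number is actually defined in the schedules.

```python
import re


def is_nlm_class(call_number: str) -> bool:
    """Return True if the leading class letters fall in QS-QZ or W-WZ."""
    # Take one or two leading letters that are not followed by another letter.
    match = re.match(r"\s*([A-Za-z]{1,2})(?![A-Za-z])", call_number)
    if match is None:
        return False
    letters = match.group(1).upper()
    if letters[0] == "W":
        return True  # W and WA through WZ
    # Preclinical sciences: QS through QZ only; other Q classes belong to LC.
    return len(letters) == 2 and letters[0] == "Q" and "S" <= letters[1] <= "Z"


if __name__ == "__main__":
    for sample in ("WG 120", "QS 4", "QA 76.9", "W 1"):
        print(sample, is_nlm_class(sample))
```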


A classification system for describing and classifying the subject of images represented in various media such as paintings, drawings and photographs.

International Classification of Diseases (ICD)

Full title: International Statistical Classification of Diseases and Related Health Problems (ICD), WHO

  • allows the systematic recording, analysis, interpretation and comparison of mortality and morbidity data collected in different countries or regions and at different times;
  • ensures semantic interoperability and reusability of recorded data for the different use cases beyond mere health statistics, including decision support, resource allocation, reimbursement, guidelines and more.

ICD-10 Browser (Latest version https://icd.who.int/browse10/2019/en)
ICD-11 Homepage (https://icd.who.int/en)
Refer to the WHO Family of International Classifications (FIC)

The United Nations Standard Products and Services Code® (UNSPSC®)

Managed by GS1 US™ for the UN Development Programme (UNDP), UNSPSC is an open, global, multi-sector standard for efficient, accurate classification of products and services, encompassing a five-level hierarchical classification codeset.

Users may browse and download the current version of the code (PDF) at no cost.

4. Name authority files

* VIAF (The Virtual International Authority File)

*The VIAF® (Virtual International Authority File) combines multiple name authority files into a single OCLC-hosted name authority service.

The Union List of Artist Names (ULAN)

* https://www.getty.edu/research/tools/vocabularies/ulan/index.html
Los Angeles: J. Paul Getty Trust, Vocabulary Program, 2000-.
*A structured vocabulary of artist names and biographical information.

*Linked Data SPARQL Endpoint: http://vocab.getty.edu/queries#ULAN-Specific_Queries

The Getty Thesaurus of Geographic Names (TGN)

Los Angeles: J. Paul Getty Trust, Vocabulary Program, 1988-.
*A structured vocabulary of geographic names for indexing art and architecture.

*Linked Data SPARQL Endpoint: http://vocab.getty.edu/queries#TGN-Specific_Queries

LC Name Authority file (LCNAF)

Includes several million name authority records for personal, corporate, meeting, and geographic names.
*Available as Linked Data: https://id.loc.gov

Alexandria Digital Library (ADL) Gazetteer

A product of the Alexandria Digital Library (ADL), a distributed digital library with collections of georeferenced materials developed at the University of California, Santa Barbara.

5. Content Standards and Best Practice Guides

Cataloging Cultural Objects: A Guide to Describing Cultural Works and Their Images (CCO)

Authored by Baca, M., P. Harpring, E. Lanzi, L. McRae, and A. Whiteside on behalf of the Visual Resources Association
** Homepage, with links to downloadable PDF
* Access the complete PDF of Cataloging Cultural Objects : https://www.vraweb.org/s/CatalogingCulturalObjectsFullv2.pdf

Specifies a set of core elements, comprising the most important descriptive information necessary to create a record for a work and an image. It provides guidelines for selecting, ordering, and formatting data used in cataloging to improve documentation and access to cultural heritage information.

Describing Archives: A Content Standard (DACS)

Society of American Archivists (SAA)
Info: *https://www2.archivists.org/groups/technical-subcommittee-on-describing-archives-a-content-standard-dacs/dacs
*Most current edition: https://github.com/saa-ts-dacs/dacs
*was at: http://www.archivists.org/governance/standards/dacs.asp

A content standard of SAA. DACS outlines the elements that must be included at different levels of description and describes how those elements should be implemented for describing archives, personal papers, manuscript collections, and other types of materials.

Anglo-American Cataloguing Rules. 1988 2nd Rev. ed. (AACR2)

Info: *https://en.wikipedia.org/wiki/Anglo-American_Cataloguing_Rules
*was at: http://www.aacr2.org/about.html
The bibliographic standard designed for use in the construction of catalogues and other lists in general libraries of all sizes. The rules cover the description of, and the provision of access points for, all library materials commonly collected.

RDA: Resource Description and Access

RDA Toolkit: http://www.rdatoolkit.org/
RDA is built on the foundations established by AACR. It provides a comprehensive set of guidelines and instructions on resource description and access covering all types of content and media.

Best Practices for OAI Data Provider Implementations and Shareable Metadata

A joint initiative between the Digital Library Federation and the National Science Digital Library

Best Practices for Shareable Metadata


Part of the Best Practices for OAI Data Provider Implementations and Shareable Metadata, it is a joint initiative between the Digital Library Federation and the National Science Digital Library.

DLESE Metadata Best Practices

The Digital Library for Earth System Education (DLESE) Collections Accessioning Taskforce
*was at: http://www.dlese.org/Metadata/collections/metadata-best-practices.htm
Provides guidelines and checklists to help in the generation of metadata records that are effective and efficient for library use. The document describes metadata quality guidelines, cataloging best practices, and individual record checks.

*Data on the Web Best Practices

W3C Recommendation 31 January 2017

Section 8.2. Metadata https://www.w3.org/TR/dwbp/#metadata

This document provides Best Practices related to the publication and usage of data on the Web designed to help support a self-sustaining ecosystem.

*W3C Accessibility Standards

W3C Accessibility Standards Overview || All W3C Standards and Drafts Updated 2024-12

**6. Artificial Intelligence (AI) Standards and Guidelines

ISO: Standards by ISO/IEC JTC 1/SC 42 Artificial intelligence

[a list of standards/or projects]

EU: EU Artificial intelligence Act

[Webpage]; [The AI Act Explorer]


Guidance for Generative AI in Education and Research, 2023

U.S. The White House:

Blueprint for an AI Bill of Rights, 2022-

  • This is a set of five principles and associated practices to help guide the design, use, and deployment of automated systems to protect the rights of the American public in the age of AI.


AI & the Web: Understanding and managing the impact of Machine Learning models on the Web, 2024-

MPEG (Moving Picture Expert Group):

MPEG Standards & Exploration, a full overview of all standards developed by MPEG

MPAI (Moving Picture, Audio and Data Coding by Artificial Intelligence) Artificial Intelligence Framework (MPAI-AIF) Version 2.0

NIST (National Institute of Standards and Technology) Standards.gov.

AI Standards Development Activities with Federal Involvement, 2024-.

Artificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile, 2024-07-25