Metadata is structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource. Metadata is often called data about data or information about information. It ensures that the context in which your data was created, analyzed, and stored is clear and detailed, and therefore that your work is reproducible.
National Information Standards Organization, 2004

Subject-Specific Guidance

Clinical Data

Researchers are urged to use standardized terminology wherever possible

Clinical metadata may include elements that pose a risk of patient identification. It is the institution's responsibility to provide appropriate privacy training for handling this information. Explore the NIAID GSCID/BRC Clinical Metadata Standard.

Consider consulting and using the following codes, identifiers, and terminologies in your data, as applicable:

Common Data Elements and Standards

  • CDEs provide a way to standardize data collection so that related data can be pooled and analyzed across multiple studies or to investigate relationships between data in unrelated datasets
  • NIH Common Data Elements (CDE) Repository
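
As a rough illustration of why shared element definitions make pooling possible, the sketch below renames two studies' local field names to common element names before combining the records. The field names, mappings, and sample values are hypothetical and are not taken from the NIH CDE Repository.

```python
# Hypothetical mappings from each study's local field names to shared
# element names. Real CDEs are defined in the NIH CDE Repository.
STUDY_A_MAP = {"sex": "sex_at_birth", "age_yrs": "age_years"}
STUDY_B_MAP = {"gender_birth": "sex_at_birth", "age": "age_years"}


def to_common_elements(record, field_map):
    """Rename mapped fields to their common element names; drop unmapped fields."""
    return {field_map[key]: value for key, value in record.items() if key in field_map}


# Made-up records for illustration only.
study_a = [{"sex": "F", "age_yrs": 34}]
study_b = [{"gender_birth": "M", "age": 51}]

# Once both studies use the same element names, their records can be pooled.
pooled = (
    [to_common_elements(r, STUDY_A_MAP) for r in study_a]
    + [to_common_elements(r, STUDY_B_MAP) for r in study_b]
)
```

Collecting data against the common elements from the start avoids this after-the-fact mapping step entirely.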
     

Patient Identifiers

  • Social Security Number
  • Taxpayer Identification Number
  • National Provider Identifier (NPI)
     

Diagnosis and procedure codes

Drug and device codes

Experimental Biomedical Research

Metadata about scientific experiments are essential for finding, retrieving, and reusing data

There are several current projects at LMA (many NIH-funded) that focus on developing recommended metadata standards for describing experimental biomedical research data. In addition to the guidelines mentioned here, biomedical researchers are encouraged to consult FAIRSharing.org, an educational resource and portal to metadata standards, databases, and data policies for a variety of disciplines.

Please Note: When describing experimental biomedical research data, it is important to identify not only the canonical reagents but also the actual batches of those reagents that were used to create your data.

  • A canonical reagent is the ideal of the reagent, and its definition and description are true for all examples of that reagent
  • Batches are the physical lots (daughters) of the canonical reagent, and there is often slight variation between batches
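
One way to capture the canonical reagent and batch distinction in a metadata record is sketched below. The class names, field names, and identifier formats are assumptions made for illustration and do not follow any particular standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CanonicalReagent:
    """The idealized reagent: its description holds for every batch."""
    reagent_id: str  # hypothetical identifier scheme
    name: str
    description: str


@dataclass(frozen=True)
class ReagentBatch:
    """A physical lot of a canonical reagent; lots may vary slightly."""
    canonical: CanonicalReagent
    lot_number: str
    supplier: str


def describe_batch(batch):
    """Flatten a batch into a record that names both the reagent and the lot."""
    return {
        "reagent_id": batch.canonical.reagent_id,
        "reagent_name": batch.canonical.name,
        "lot_number": batch.lot_number,
        "supplier": batch.supplier,
    }


# Made-up values for illustration only.
reagent = CanonicalReagent("R-0001", "Example antibody", "Illustrative entry")
record_a = describe_batch(ReagentBatch(reagent, "LOT-A", "Example Supplier"))
record_b = describe_batch(ReagentBatch(reagent, "LOT-B", "Example Supplier"))
```

Both records point to the same canonical reagent but carry different lot numbers, so a later reader can tell which physical batch produced which data.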
     

NIH LINCS

“The LINCS project is based on the premise that disrupting any one of the many steps of a given biological process will cause related changes in the molecular and cellular characteristics, behavior, and/or function of the cell – the observable composite of which is known as the cellular phenotype. Observing how and when a cell’s phenotype is altered by specific stressors can provide clues about the underlying mechanisms involved in perturbation and, ultimately, disease.”

  • LINCS metadata standards
    • These metadata standards were developed to describe LINCS reagents, assays, and experiments. They provide guidance for required, required if applicable, and optional elements.
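
The three levels of obligation (required, required if applicable, and optional) can be checked mechanically. The sketch below is a minimal illustration; the element names and the applicability rule are hypothetical and are not drawn from the LINCS standards themselves.

```python
# Hypothetical element names; the LINCS standards define their own.
REQUIRED = {"reagent_name", "assay_type"}
# Each conditionally required element maps to a rule saying when it applies.
REQUIRED_IF_APPLICABLE = {
    "lot_number": lambda rec: rec.get("uses_physical_reagent") is True,
}
OPTIONAL = {"notes"}  # may be present or absent without any report


def check_record(record):
    """Return a list of problems; an empty list means no element is missing."""
    problems = []
    for name in sorted(REQUIRED):
        if not record.get(name):
            problems.append(f"missing required element: {name}")
    for name in sorted(REQUIRED_IF_APPLICABLE):
        applies = REQUIRED_IF_APPLICABLE[name]
        if applies(record) and not record.get(name):
            problems.append(f"missing element required if applicable: {name}")
    return problems


complete = {
    "reagent_name": "Example compound",
    "assay_type": "Example assay",
    "uses_physical_reagent": True,
    "lot_number": "LOT-A",
}
incomplete = {"reagent_name": "Example compound", "uses_physical_reagent": True}

complete_problems = check_record(complete)
incomplete_problems = check_record(incomplete)
```

Here `complete_problems` is empty, while `incomplete_problems` reports the missing assay type and the missing lot number.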
       

Illuminating the Druggable Genome (IDG) Consortium

“The goal of the Illuminating the Druggable Genome (IDG) program is to identify and provide information on proteins that are currently not well studied within commonly drug-targeted protein families.”

Proteomics Data

Metadata is crucial to interpret and reanalyze deposited datasets

The following resources provide metadata recommendations for proteomics, interactomics, and metabolomics research:

Additional Medical Metadata Standards

Controlled Vocabularies

Lists of terms predefined by a community or research group

Medical research and biomedical professional communities may employ controlled vocabulary standards such as:

Ontologies

A type of controlled vocabulary that defines components and describes the relationships among them

Most ontologies are used for interoperability among databases, and some are expressed in the Web Ontology Language (OWL) or the Resource Description Framework (RDF). Here are some examples of ontologies used in biomedical research:

Technical Standards

Established norm or requirement for a repeatable technical task

Technical standards establish norms or requirements for a repeatable technical task by setting uniform criteria, methods, processes, and practices. ISO standards are internationally agreed upon by experts and describe the best way of doing something.