“Metadata is structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource. Metadata is often called data about data or information about information.” (National Information Standards Organization, 2004) Metadata ensures that the context for how your data was created, analyzed, and stored is clear and detailed, and therefore reproducible.

Subject-Specific Guidance

  • Clinical Data

    Researchers are urged to use standardized terminology wherever possible

    Clinical metadata may include elements that pose a risk of patient identification. It is the institution's responsibility to provide appropriate privacy training for handling this information. Explore the NIAID GSCID/BRC Clinical Metadata Standard.

    Consider consulting and using the following codes, identifiers, and terminologies in your data, as applicable: 

    Common Data Elements and Standards

    • CDEs provide a way to standardize data collection so that related data can be pooled and analyzed across multiple studies or to investigate relationships between data in unrelated datasets
    • NIH Common Data Elements (CDE) Repository
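
    As a rough illustration of how shared data elements make pooling possible, the sketch below maps two studies' local column names onto common element names before combining the records. The study names, column names, element names, and values are all hypothetical and are not taken from the NIH CDE Repository.

```python
from typing import Dict, List

# Hypothetical mapping from each study's local column names to shared
# element names. Real element names would come from a CDE repository.
STUDY_TO_CDE: Dict[str, Dict[str, str]] = {
    "study_a": {"sbp": "systolic_bp_mmhg", "pt_age": "age_years"},
    "study_b": {"SystolicBP": "systolic_bp_mmhg", "Age": "age_years"},
}


def harmonize(study: str, row: Dict[str, float]) -> Dict[str, float]:
    """Rename a row's columns to the shared element names.

    Columns without a mapping are dropped, since they cannot be pooled.
    """
    mapping = STUDY_TO_CDE[study]
    return {mapping[col]: value for col, value in row.items() if col in mapping}


# Made-up records from two studies that recorded the same concepts
# under different local names.
pooled: List[Dict[str, float]] = [
    harmonize("study_a", {"sbp": 128, "pt_age": 54}),
    harmonize("study_b", {"SystolicBP": 135, "Age": 61, "site": 3}),
]

# Once harmonized, the records can be analyzed together.
mean_sbp = sum(r["systolic_bp_mmhg"] for r in pooled) / len(pooled)
print(mean_sbp)
```

    If both studies had collected these values under the same common data elements from the start, the mapping step would not be needed.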

    Patient Identifiers

    • Social Security Number
    • Taxpayer Identification Number
    • National Provider Identifier (NPI)

    Diagnosis and procedure codes

    Drug and device codes

  • Experimental Biomedical Research

    Metadata about scientific experiments are essential for finding, retrieving, and reusing data

    Several current projects at LMA, many of them NIH-funded, focus on developing recommended metadata standards for describing experimental biomedical research data. In addition to the guidelines listed in this section, biomedical researchers are encouraged to consult FAIRsharing.org, an educational resource and portal to metadata standards, databases, and data policies for a variety of disciplines.

    Please Note: When describing experimental biomedical research data, it is important to identify not only the canonical reagents but also the actual batches of those reagents that were used to create your data.

    • A canonical reagent is the ideal of the reagent, and its definition and description are true for all examples of that reagent
    • Batches are the physical lots (daughters) of the canonical reagent, and there is often slight variation between batches
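
    One way to picture the distinction between a canonical reagent and its batches is as two linked record types, sketched below. This is only an illustration: the class names, field names, identifiers, and values are hypothetical and do not come from any published metadata standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CanonicalReagent:
    """The idealized reagent; its description holds for every batch."""
    reagent_id: str   # hypothetical identifier scheme
    name: str
    description: str


@dataclass(frozen=True)
class ReagentBatch:
    """A physical lot of a canonical reagent that was actually used."""
    canonical: CanonicalReagent
    vendor: str
    catalog_number: str
    lot_number: str
    purity_percent: float  # example of a property that can vary by batch


def describe_usage(batch: ReagentBatch) -> str:
    """Build a one-line description naming both the reagent and the batch."""
    return (
        f"{batch.canonical.name} ({batch.canonical.reagent_id}); "
        f"{batch.vendor} cat. {batch.catalog_number}, lot {batch.lot_number}"
    )


# Two batches share one canonical description but differ in lot details.
compound = CanonicalReagent(
    "EX-0001", "Example Kinase Inhibitor", "Hypothetical small molecule"
)
lot_a = ReagentBatch(compound, "Example Vendor", "C-100", "LOT-A17", 99.1)
lot_b = ReagentBatch(compound, "Example Vendor", "C-100", "LOT-B02", 97.8)

print(describe_usage(lot_a))
```

    Recording the batch alongside the canonical reagent lets someone reusing the data check whether differences in results could trace back to lot-to-lot variation.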


    Library of Integrated Network-Based Cellular Signatures (LINCS)

    • “The LINCS project is based on the premise that disrupting any one of the many steps of a given biological process will cause related changes in the molecular and cellular characteristics, behavior, and/or function of the cell – the observable composite of which is known as the cellular phenotype. Observing how and when a cell’s phenotype is altered by specific stressors can provide clues about the underlying mechanisms involved in perturbation and, ultimately, disease.”
    • LINCS metadata standards
      • These metadata standards were developed to describe LINCS reagents, assays, and experiments. They provide guidance for required, required if applicable, and optional elements.
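
    The three tiers of elements (required, required if applicable, and optional) can be checked mechanically. The sketch below is a minimal example under stated assumptions: the element names and the rule for when a lot number applies are invented for illustration and are not the actual LINCS element definitions.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]

# Hypothetical element names; consult the published standard for real ones.
REQUIRED: List[str] = ["reagent_name", "reagent_id", "source"]
OPTIONAL: List[str] = ["comments"]

# Each conditionally required element is paired with a rule that decides
# whether it applies to a given record (an assumed rule, for illustration).
REQUIRED_IF_APPLICABLE: Dict[str, Callable[[Record], bool]] = {
    "lot_number": lambda rec: rec.get("source") == "commercial",
}


def validate(record: Record) -> List[str]:
    """Return a list of problems; an empty list means the record passes."""
    problems: List[str] = []
    for name in REQUIRED:
        if not record.get(name):
            problems.append(f"missing required element: {name}")
    for name, applies in REQUIRED_IF_APPLICABLE.items():
        if applies(record) and not record.get(name):
            problems.append(f"missing element required in this context: {name}")
    known = set(REQUIRED) | set(OPTIONAL) | set(REQUIRED_IF_APPLICABLE)
    for name in record:
        if name not in known:
            problems.append(f"unrecognized element: {name}")
    return problems


# A commercial reagent without a lot number fails the conditional rule.
print(validate({"reagent_name": "X", "reagent_id": "EX-1", "source": "commercial"}))
```

    Optional elements never produce a problem when absent; they are listed only so that the check can flag names it does not recognize.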

    Illuminating the Druggable Genome (IDG) Consortium

    • “The goal of the Illuminating the Druggable Genome (IDG) program is to identify and provide information on proteins that are currently not well studied within commonly drug-targeted protein families.” 
    • IDG metadata standards

  • Proteomics Data

    Metadata is crucial for interpreting and reanalyzing deposited datasets

    The following resources provide metadata recommendations for proteomics, interactomics, and metabolomics research: 

    • HUPO Proteomics Standards Initiative
      • “The HUPO Proteomics Standards Initiative defines community standards for data representation in proteomics and interactomics to facilitate data comparison, exchange and verification.”
    • Dai C, Füllgrabe A, Pfeuffer J, Solovyeva EM, Deng J, Moreno P, Kamatchinathan S, Kundu DJ, George N, Fexova S, Grüning B, et al. “A proteomics sample metadata representation for multiomics integration and big data analysis.” Nature Communications. 2021. 12(1): 1-8. https://doi.org/10.1038/s41467-021-26111-3
    • Snyder M, Mias G, Stanberry L, Kolker E. “Metadata Checklist for the Integrated Personal OMICS Study: Proteomics and Metabolomics Experiments.” OMICS. 2014. 18(1): 81-85. https://doi.org/10.1089/omi.2013.0148

Additional Medical Metadata Standards