README files are a common way to document the contents and structure of a folder or dataset so that a researcher can locate the information they need. Data documentation can be maintained in a variety of forms. Explore additional Documentation & Metadata practices.

What is a README File?

Provide a clear and concise description of all relevant details about data collection, processing, and analysis in a README file. This will help others interpret and reanalyze your dataset.

README files are created for a variety of reasons:

  • to document changes to files or file names within a folder
  • to explain file naming conventions, practices, etc. "in general" for future reference
  • to specifically accompany files/data being deposited in a repository

It is best practice to create a README file for each dataset, whether or not it is being deposited in a repository, because the document might become necessary later.

  • A good data practice is to store a readme.txt with each distinct dataset that explains your file naming convention along with any abbreviations or codes you have used.
  • Write your README file as plain text, optionally enhanced with Markdown (a lightweight markup language that adds headings, lists, and links). Avoid proprietary formats, such as Microsoft Word, whenever possible. PDF is acceptable when formatting is important.
  • If you deposit your final datasets in a data repository, the repository may ask you to provide a README file with additional details about your datasets, such as methodological information or sharing/access information. Creating a README file at the beginning of your research process, and updating it consistently throughout your research, will help you to compile a final README file when your data is ready for deposit.

In some cases a minimal README file is acceptable, and we provide an example of a basic dataset README below.

In other cases you might need a more comprehensive one, and we recommend referencing the Cornell University Research Data Management Service Group's README template.

README Resources

Template: HMS Basic Dataset README

This README is intended for capturing information about data collected during day-to-day work in the lab.

Download a Word document or text file version of this template.

When organizing data for a publication, submitting to a data repository, or for archiving, more detailed README files should be produced.

Title or simple description of the dataset

Key contacts

  • Person responsible for collecting the data
  • Other collaborators who helped create the dataset (optional)
  • Principal Investigator (optional)

Lab notebook reference

Provide reference info for lab notebook entries that describe the work carried out to produce this dataset. For example: include notebook name, relevant dates and pages, if appropriate.

Description of folder/file contents

Brief description of folder contents that will allow readers to quickly understand the data stored in the folder. 

For example: information about file organization within the folder, file naming conventions, replicates, or the different analyses being performed.

More detailed description of data (optional)

The recommendations for the basic README template above represent the minimum recommended annotation for data in HMS systems.

For some labs or some projects/experiments, it might be important to include additional descriptions such as:

  • Project/experiment description, including the goals of the experiment or analysis related to this dataset.
  • Column headings for tabular data if the meaning of the column heading is not apparent in the dataset. Clarify units of measurement, if needed. These elements may also be included in a data dictionary.
  • File formats, if there are multiple.
  • Versioning information if these datasets relate to other datasets.
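The template sections above can be turned into a starter file at the beginning of a project. The following is a minimal sketch, not an official HMS tool: the function names `build_readme_skeleton` and `write_readme` are hypothetical, and the default file name `readme.txt` is an assumption based on the naming used earlier in this guide.

```python
from pathlib import Path

# Section headings copied from the HMS Basic Dataset README template.
SECTIONS = [
    "Title or simple description of the dataset",
    "Key contacts",
    "Lab notebook reference",
    "Description of folder/file contents",
    "More detailed description of data (optional)",
]


def build_readme_skeleton(sections=SECTIONS):
    """Return plain-text README content with an empty block under each heading."""
    lines = []
    for heading in sections:
        lines.append(heading + ":")
        lines.append("")  # blank space for the author to fill in
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


def write_readme(folder, filename="readme.txt"):
    """Write the skeleton into `folder`, leaving any existing README untouched."""
    path = Path(folder) / filename
    if path.exists():
        return path  # never overwrite existing documentation
    path.write_text(build_readme_skeleton(), encoding="utf-8")
    return path
```

Calling `write_readme("path/to/dataset_folder")` (a placeholder path) would create the empty skeleton once. Later calls return the existing file unchanged, so notes added during the project are not lost.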

Example: HMS Basic Dataset README

A filled-out example of the HMS Basic Dataset README for capturing information about data collected during a discrete experiment.

Title: Raw Images for daf2-age1 Experiment (YYYYMMDD)

Key contacts:

  • Judy collected images for daf2-age1 experiment A
  • The HMS Core for Imaging Technology & Education provided equipment and training
  • Smith Lab

Lab notebook reference: Judy’s Notebook, YYYYMMDD, pages 02-05

Description of folder/file contents: Images for daf2-age1 Experiment. The base file name for each image is composed of the name of the experiment, the ID number of the instrument used, the date and time that the image was captured, and the unique identifier of the image.

  • Example: daf2-age1_14052_20150412T0515_005.tif
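A documented naming convention can also be checked by a script. The sketch below splits a file name into the four parts described above. It assumes the parts are separated by underscores, the timestamp has the form YYYYMMDDTHHMM, and the extension is `.tif`, all inferred from the single example. The names `PATTERN` and `parse_image_name` are hypothetical.

```python
import re
from datetime import datetime

# experiment _ instrument ID _ date and time _ image identifier . tif
PATTERN = re.compile(
    r"^(?P<experiment>[^_]+)"
    r"_(?P<instrument_id>\d+)"
    r"_(?P<captured>\d{8}T\d{4})"
    r"_(?P<image_id>\d+)\.tif$"
)


def parse_image_name(filename):
    """Split a file name into its documented parts, or raise ValueError."""
    match = PATTERN.match(filename)
    if match is None:
        raise ValueError(f"File name does not follow the convention: {filename}")
    fields = match.groupdict()
    # Convert the timestamp text into a datetime for sorting and filtering.
    fields["captured"] = datetime.strptime(fields["captured"], "%Y%m%dT%H%M")
    return fields


parts = parse_image_name("daf2-age1_14052_20150412T0515_005.tif")
```

Here `parts["experiment"]` is `"daf2-age1"`, `parts["instrument_id"]` is `"14052"`, and `parts["image_id"]` is `"005"`. A file name that does not match raises `ValueError`, which makes misnamed files easy to spot.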

More detailed description of data: The experiment looked at tissue on a pre-prepared slide imaged with confocal fluorescence microscopy (Nikon C1 inverted microscope). Image shape is (16, 512, 512, 3). That is 512x512 pixels in X-Y, 16 image slices in Z, and 3 color channels (emission wavelengths 450 nm, 515 nm, and 605 nm, respectively). Real-space voxel size is 1.24 microns in X-Y and 1.25 microns in Z. Data type is unsigned 16-bit integers. Images are .tif files.
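Recording shape, voxel size, and data type lets a reader work out the physical extent and approximate size of each image without opening it. The following sketch does that arithmetic with the values from the example. The axis order (Z, Y, X, channel) is an assumption, and the function names are hypothetical. The byte count covers uncompressed pixel data only, so an actual .tif file may differ because of headers, metadata, or compression.

```python
from math import prod

# Values taken from the example README; axis order (Z, Y, X, channel) is assumed.
SHAPE = (16, 512, 512, 3)
BYTES_PER_SAMPLE = 2   # unsigned 16-bit integers
VOXEL_XY_UM = 1.24     # microns per pixel in X-Y
VOXEL_Z_UM = 1.25      # microns per slice in Z


def pixel_data_bytes(shape, bytes_per_sample):
    """Uncompressed size of the pixel data in bytes."""
    return prod(shape) * bytes_per_sample


def physical_extent_um(shape, voxel_xy_um, voxel_z_um):
    """Return the imaged volume as (Z, Y, X) extents in microns."""
    z, y, x = shape[:3]
    return (z * voxel_z_um, y * voxel_xy_um, x * voxel_xy_um)


size_bytes = pixel_data_bytes(SHAPE, BYTES_PER_SAMPLE)
extent_um = physical_extent_um(SHAPE, VOXEL_XY_UM, VOXEL_Z_UM)
```

For the example image, the pixel data alone comes to 25,165,824 bytes (24 MiB), and the imaged volume is about 634.88 by 634.88 microns in X-Y and 20 microns in Z.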

Additional Guidance

More examples and templates to help you write your READMEs!
Data Discussion Fall 2024
Watch this video to get started with README files.