README Files

README files are a common way to document the contents and structure of a folder or dataset so that a researcher can locate the information they need. Data documentation can be maintained in a variety of forms. Explore additional Documentation & Metadata practices.

Provide a clear and concise description of all relevant details about data collection, processing, and analysis in a README file. This will help others interpret and reanalyze your dataset.

README files are created for a variety of reasons:

- to document changes to files or file names within a folder
- to explain file naming conventions and practices for future reference
- to accompany files or data being deposited in a repository

It is best practice to create a README file for each dataset regardless of whether it is being deposited in a repository because the document might become necessary later.

A good data practice is to store a readme.txt with each distinct dataset that explains your file naming convention along with any abbreviations or codes you have used.
Write your README file as a plain text file, and avoid proprietary formats, such as Microsoft Word, whenever possible. However, PDF is acceptable when formatting is important.
If you deposit your final datasets in a data repository, the repository may ask you to provide a README file with additional details about your datasets, such as methodological information or sharing/access information. Creating a README file at the beginning of your research process, and updating it consistently throughout your research, will help you to compile a final README file when your data is ready for deposit.

README Resources

README Template

Title of dataset

Name/institution/contact information for:

Principal Investigator (or person responsible for collecting the data)
Data manager or custodian

File name structure

Structure: Provide the template you are using for your filenames
Attributes: Describe the attributes used to name the files
Codes: Provide a complete list of any codes/abbreviations used
Provide examples of the above items

File formats

Provide a list of all file formats present in this dataset. If you need to convert or migrate your data files from one format to another, be aware of the risk of data loss or corruption and take appropriate steps to avoid or minimize it.
File format examples:
- Databases: XML, CSV
- Geospatial: SHP, DBF, GeoTIFF, NetCDF
- Moving Images: MOV, MPEG, AVI, MXF
- Audio: WAVE, AIFF, MP3, MXF
- Numbers/statistics: ASCII, DTA, POR, SAS, SAV
- Images: TIFF, JPEG 2000, PDF, PNG, GIF, BMP
- Text: PDF/A, HTML, ASCII, XML, UTF-8
- Graphs: JSON, YAML, XML
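
One way to compile the list of file formats is to count file extensions in the dataset folder. The following is a minimal sketch, not part of the template itself; the function names count_file_formats and format_inventory are hypothetical, and it assumes that the file extension reliably indicates the format.

```python
from collections import Counter
from pathlib import Path


def count_file_formats(dataset_dir):
    """Return a Counter mapping lowercase file extensions to file counts."""
    counts = Counter()
    for path in Path(dataset_dir).rglob("*"):
        if path.is_file():
            # Files without an extension are grouped under a placeholder label.
            ext = path.suffix.lower() or "(no extension)"
            counts[ext] += 1
    return counts


def format_inventory(counts):
    """Render the counts as plain-text lines that can be pasted into a README."""
    return "\n".join(f"- {ext}: {n} file(s)" for ext, n in sorted(counts.items()))
```

Calling format_inventory(count_file_formats("path/to/dataset")) returns one line per extension; the path shown here is a placeholder.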

Column headings for tabular data

For tabular data, list and define column headings:
- Units of measurement
- Data formats, such as YYYY/MM/DD
- Calculations

Versioning

Establish a procedure for documenting changes in files. One option is to create a changelog in this README file that lists every step that changes the output files.
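
The changelog option can be kept consistent with a small helper. The sketch below is one possible approach under stated assumptions: the function name append_changelog_entry, the "Changelog:" section label, and the entry layout are all hypothetical, and the changelog is assumed to be the last section of the README.

```python
from datetime import date
from pathlib import Path


def append_changelog_entry(readme_path, description, author, when=None):
    """Append one dated changelog line to the README and return that line."""
    when = when or date.today()
    entry = f"{when.isoformat()}  {author}: {description}"
    readme = Path(readme_path)
    text = readme.read_text(encoding="utf-8") if readme.exists() else ""
    if "Changelog:" not in text:
        # Start the changelog section the first time an entry is recorded.
        if text and not text.endswith("\n"):
            text += "\n"
        text += "\nChangelog:\n"
    elif not text.endswith("\n"):
        text += "\n"
    readme.write_text(text + entry + "\n", encoding="utf-8")
    return entry
```

Each call adds a single line, so the README keeps a chronological record of every step that changed the output files.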

Example README File

Dataset Title: Raw Images for Experiment A, Smith Lab

Principal Investigator: John Smith, PI, 555-555-5555, jsmith@hms.harvard.edu

File Naming Convention: ExperimentName_InstrumentID_CaptureDateTime_ImageID.tif
The base file name is composed of the name of the experiment, the ID number of the instrument used, the date and time that the image was captured, and the unique identifier of the image.

Attributes: Also see the Codes section for a list of instruments and their ID numbers

ExperimentName = Name of the experiment
InstrumentID = Five-digit code assigned to the lab instrument
CaptureDateTime = Date and time at which the image was captured, in YYYYMMDDThhmm format
ImageID = Three-digit unique identifier for the image, such as 001, 002, 003

Codes:

[List of instruments and IDs]

Examples:

daf2-age1_14052_20150412T0515_005.tif

File formats: TIFF (.tif)
Versioning: All changes to this dataset will be documented in a changelog in this README file
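
A documented naming convention like the one in the example above can also be checked mechanically. The following sketch parses a file name into its attributes. It is an illustration only: the name parse_image_filename is hypothetical, the YYYYMMDDThhmm date-time layout is inferred from the example file name, and the rule that ExperimentName contains no underscores is an assumption.

```python
import re
from datetime import datetime

FILENAME_PATTERN = re.compile(
    r"^(?P<experiment>[^_]+)"      # ExperimentName (assumed to contain no underscores)
    r"_(?P<instrument>\d{5})"      # InstrumentID: five digits
    r"_(?P<captured>\d{8}T\d{4})"  # CaptureDateTime: YYYYMMDDThhmm
    r"_(?P<image>\d{3})"           # ImageID: three digits
    r"\.tif$"
)


def parse_image_filename(name):
    """Split a file name into its attributes, or raise ValueError if it does not match."""
    match = FILENAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"File name does not follow the convention: {name}")
    fields = match.groupdict()
    # strptime raises ValueError for impossible dates such as month 13.
    fields["captured"] = datetime.strptime(fields["captured"], "%Y%m%dT%H%M")
    return fields


print(parse_image_filename("daf2-age1_14052_20150412T0515_005.tif"))
```

Running a check like this over a folder before deposit would flag files that drifted from the convention described in the README.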

Additional Guidance

Cornell University Research Data Management Service Group's excellent README template
Kristin Briney's Write a Project-Level README.txt exercise in The Research Data Management Workbook
Harvard Biomedical Data Management's README File Checklist
Harvard Biomedical Data Management's Metadata Worksheet
