An NIEHS-funded administrative supplement to an existing project (R01 ES027825) led by Dr. Maitreyi Mazumdar (Boston Children’s Hospital, Harvard Medical School) evaluated several approaches for applying FAIR principles. This data management-focused administrative supplement included a new collaboration with Julie Goldman, Research Data Services Librarian at Harvard Medical School’s Countway Library. This collaboration identified several steps researchers can consider when developing data management plans for their studies.
The National Institutes of Health Strategic Plan for Data Science has a guiding principle that all research data should adhere to FAIR Data Principles. This acronym stands for Findable (F), Accessible (A), Interoperable (I), and Reusable (R). The FAIR concept was originally proposed in a 2016 Scientific Data article. The combination of these four components will help create datasets and research findings that are widely available and usable to enhance future research studies.
Each of these terms sounds useful as a concept, such as making data Accessible, but what does that really mean?
How can these concepts be applied to a research study?
The study team led by Dr. Maitreyi Mazumdar set out to find answers to these questions and shared some of their findings at the recent Society for Epidemiologic Research 2020 Conference (Poster: Maximizing research value and use through enhanced data documentation).
When thinking about FAIR Principles, a key concept to consider is that the end goal is to share data and study documentation in a way that reaches as many people as possible. This will most likely include making some data and study descriptions available online. While not all data and study documentation can be shared, potentially due to privacy concerns, protected health information, file size, or other reasons, you may be able to share some information about the study and provide a method for people to request more thorough descriptions or potentially access to the study data itself. Even if you can’t post the dataset itself, you could write up a description that covers the who, what, when, where, and why of the study. Providing these answers on a publicly accessible website will help others learn about the study, and you may find this information useful to refer back to in the future as well.
There are several ways that you can share your study online. Some journals provide ways to publish publicly accessible supplemental files. You may have your own research group website where you can post information about the study. Your institution may have a digital repository that should be used, or there may be other options to consider, such as Dataverse, Zenodo, Open Science Framework, GitHub, Dryad, and others.
If you’re interested in applying FAIR principles to one of your studies, consider the following steps:
- Pick your study: A general appreciation for FAIR principles is helpful, but you need to pick a specific study to apply them.
- Describe the study: Write a short description of the study that answers the who, what, when, where, and why of the study. Be sure to note the funding source too, if applicable. When writing the description, be sure to use commonly accepted terms and phrases that may match how other studies refer to the same information. One way to do this is to select an ontology. If your study is in the medical sciences, BioPortal may be helpful. For example, the study team led by Dr. Mazumdar at Boston Children’s Hospital (BCH) used Medical Subject Headings (MeSH) to identify key terms and collected additional gene annotations from dbSNP.
- Inventory the study information available: You need to create a short catalog of what you have related to the study, such as datasets, imaging files, analysis code, clinical report forms, and so forth. For example, the BCH study team merged multiple datasets into a single REDCap database that combined several samplings from the same study population in Bangladesh. The team also developed detailed README files to annotate the study.
- Select what you can share: Review the inventory you just made and identify which items are suitable for sharing.
- Select what you want to share: Not everything may be ready to share. You may have image files that would require substantial editing to remove identifiable information. Or a dataset may be several TB in size, which is not easy to share on a website. By identifying the items you want to share, you can help determine where these items can be shared later on. For example, the BCH team identified a list of variables that would need to be removed for a fully de-identified dataset.
- Receive approval to share: Be sure to check with the study team members and any associated institutions to make sure you can share. One place to start could be a sponsored programs office. The project’s funder may have additional guidance on what can be shared and any approvals that are needed.
- Select a place to share information: As noted above, there are multiple locations available for sharing study information. Select an approach that works best for you and your institution. For example, the BCH team discussed Dataverse and OSF as potential online data sharing platforms that could host the study metadata.
- Upload the information: This step will take some dedicated time to ensure all study information is fully added to the place you selected. Hopefully, this step will move along quickly because you have already taken the time to write the information beforehand.
- Check the access permissions: After you’ve uploaded all the study information, please verify that the information is publicly accessible. If you’d like to keep some portions in a non-publicly accessible, but still online, format, please verify that this is the case. For example, the BCH team reviewed existing data sharing resources from NIEHS and drafted a data use agreement that could be used for future data use access requests.
- Copy down the link and share! Congratulations, you’ve written and posted study information in a way that others can view and access! Make sure you note the link that you’ve used, save any login information in case you need to update the study information in the future, and be sure to share the link with others!
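As a rough illustration of the describe, inventory, and select steps above, the following Python sketch shows one way a study description and a file inventory could be recorded and combined into a shareable record. This is only a sketch under stated assumptions, not the BCH team's actual workflow: the class names, field names, file names, and variable names are all hypothetical placeholders. Removing a list of variables is also only one part of de-identification and does not by itself guarantee that a dataset is fully de-identified.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class StudyDescription:
    """Who, what, when, where, and why of a study (hypothetical layout)."""
    who: str
    what: str
    when: str
    where: str
    why: str
    funding: str = ""
    keywords: list = field(default_factory=list)  # e.g. terms from an ontology


@dataclass
class InventoryItem:
    """One entry in the study inventory (hypothetical layout)."""
    name: str
    kind: str             # e.g. "dataset", "documentation", "imaging"
    can_share: bool       # allowed to share (privacy, approvals)
    ready_to_share: bool  # prepared for sharing (de-identified, manageable size)


def shareable(items):
    """Return names of items that are both allowed and ready to share."""
    return [item.name for item in items if item.can_share and item.ready_to_share]


def drop_variables(records, variables_to_remove):
    """Return copies of the records without the listed variables."""
    blocked = set(variables_to_remove)
    return [{k: v for k, v in row.items() if k not in blocked} for row in records]


def build_public_record(description, items):
    """Combine the description and the shareable inventory into a JSON string."""
    record = asdict(description)
    record["shared_items"] = shareable(items)
    return json.dumps(record, indent=2)


# Placeholder values only; replace with details from your own study.
description = StudyDescription(
    who="Example research group",
    what="Example cohort study",
    when="Example study period",
    where="Example region",
    why="Example research question",
    funding="Example funder and grant number",
    keywords=["example keyword"],
)

inventory = [
    InventoryItem("data_dictionary.csv", "documentation", True, True),
    InventoryItem("README.txt", "documentation", True, True),
    InventoryItem("participant_images", "imaging", False, False),
    InventoryItem("full_dataset.csv", "dataset", True, False),
]

# Hypothetical records and a hypothetical list of variables to remove.
records = [
    {"participant_name": "placeholder", "birth_date": "placeholder", "exposure_level": 1.2},
]
deidentified = drop_variables(records, ["participant_name", "birth_date"])

print(build_public_record(description, inventory))
print(deidentified)
```

The JSON text printed here could be pasted into a README file or a repository description field. Keeping the "can share" and "ready to share" flags separate mirrors the two selection steps in the list above.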
Following these steps will help you apply FAIR Principles to your research studies! Not every study can be shared in exactly the same way, but keeping the overall process in mind can help improve adherence to FAIR Principles.
Written by John Obrycki, PhD, Research Operations and BiOS Freezer Core Manager, Harvard T.H. Chan School of Public Health, Department of Nutrition and former Staff Scientist, Boston Children's Hospital, Department of Neurology