Data sharing is essential for expedited translation of research results into knowledge, products and procedures to improve human health.
In the last decade, it has become increasingly common for researchers to make their data available to others when they complete a study. This is usually referred to as data sharing or data publishing. Data sharing is growing mostly due to recent data policies from journals and funders. So, why share?
- Be compliant with research funding organizations that require data management plans and data accessibility
- Be compliant with journals that require submission of supporting data files to accompany manuscripts
- Find your own data years after you finish a project
- Enable others to replicate your work
- Enable others to conduct new analyses using your data
- Receive credit for your work: data citation is becoming standard across publishers, and standardized data repositories generate a data citation when you deposit your data
- Benefit from emerging incentives: some groups are developing and advocating for tools that track data sharing and formally credit investigators who share data
How can I maximize my data's reuse?
- Share data and code in open trusted repositories
- Use persistent links from publication to data and code
- Cite data and code as standard practice
- Document data, code, workflows, and computational environment
- Use an open license for your code and data
- Make use of a data provenance tool
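One lightweight way to document data and the computational environment is to deposit a small manifest alongside the files. The sketch below is an illustration only, not a replacement for a provenance tool or a repository's own metadata. The function names (`sha256_of`, `build_manifest`) and the manifest fields are assumptions made for this example. It uses only the Python standard library.

```python
import hashlib
import json
import platform
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict:
    """Describe every file under data_dir plus the environment used.

    The field names here are hypothetical; adapt them to whatever
    metadata your chosen repository expects.
    """
    files = [
        {
            "path": p.relative_to(data_dir).as_posix(),
            "bytes": p.stat().st_size,
            "sha256": sha256_of(p),
        }
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    ]
    return {
        "python_version": platform.python_version(),
        "platform": platform.platform(),
        "files": files,
    }
```

Calling `json.dumps(build_manifest(Path("my_dataset")), indent=2)` (where `my_dataset` is a placeholder directory name) yields text that can be saved next to the data. The checksums let a later user confirm that the files they downloaded match what was deposited.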
What is reproducibility and why does it matter?
Reproducibility and Replication (National Science Foundation) (see Tips for Reproducibility)
- Reproducibility: the ability of a researcher to duplicate the results of a prior study using the same materials and procedures used by the original investigator
- Replication: the same procedures are followed but new data are collected
Empirical, Computational, Statistical Reproducibility (Stodden, 2014)
- Empirical: data and collection details are made freely available
- Computational: code, software, hardware, and implementation details are provided
- Statistical: details on the choice of statistical tests and model parameters are provided
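As a rough illustration of statistical and computational reproducibility, an analysis can return the settings it used together with its result, so both can be shared. The sketch below is a minimal example under assumptions: the bootstrap method, the function name `run_analysis`, and the output fields are chosen for illustration and do not come from Stodden (2014).

```python
import random
import statistics


def run_analysis(values, seed, n_resamples):
    """Bootstrap the mean of values and report the settings used.

    A fixed seed makes the resampling repeatable, and returning the
    method name and parameters keeps them attached to the estimate.
    """
    rng = random.Random(seed)  # private generator; global state untouched
    means = []
    for _ in range(n_resamples):
        # Resample with replacement, same size as the original data
        sample = [rng.choice(values) for _ in values]
        means.append(statistics.fmean(sample))
    return {
        "method": "bootstrap mean",
        "seed": seed,
        "n_resamples": n_resamples,
        "estimate": statistics.fmean(means),
        "std_error": statistics.stdev(means),
    }
```

Running the function twice with the same data, seed, and number of resamples gives identical output, which is what someone rerunning your shared code needs in order to check a reported figure.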
What You Need to Know
Many journals require that published articles be accompanied by the underlying research data. Data sharing policies are often found in the instructions for authors. We can help you interpret your journal’s data sharing policy, and if your journal doesn’t specify where and how you should share your data, we can help you find a data repository.
Repositories can help you:
- manage your data
- cite your data by supplying a persistent identifier
- facilitate discovery of your data
- preserve your data for the long term
- HMS regulations: Harvard Longwood Area researchers conducting human subject research should consult the Office of Human Research Administration (OHRA) and The Institutional Review Board Operations (IRB) in planning for data management and sharing
- Health research regulations: Researchers need to adhere to privacy law regarding personal health information. See the Health Insurance Portability and Accountability Act of 1996 (HIPAA)
- Informed consent: Researchers should include a provision for data sharing in their consent forms.
- Maintaining confidentiality: Data made publicly available should not contain information that could put participants' confidentiality at risk.
Sharing data that you have produced or collected yourself:
- Data is not copyrightable. Particular expressions of data, such as a table in a book, can be copyrightable.
- Promote sharing and unlimited use of your data by making it available under an Open Data Commons or Creative Commons license.
Sharing data that you have collected from other sources:
- Licensed data can have restrictions in the way it can be used or shared downstream.
Data Use Agreements:
A Data Use Agreement (DUA) should be used whenever data are transmitted or received and there is a need to control the use, transfer, storage, and/or disclosure of the data. For example, a DUA would be required when transferring human subjects data, even if it is de-identified, to ensure compliance with the signed consent forms and to confirm that appropriate information security measures are in place at the receiving institution. The Office for Sponsored Programs and Office of Research Administration are the authorized DUA signatories for Harvard.
NIH Data Sharing Policy & Public Access Policy:
The Final NIH Statement on Sharing Research Data was published in the NIH Guide on February 26, 2003. It extends NIH policy on sharing research resources and reaffirms NIH support for the concept of data sharing. The policy took effect with the October 1, 2003 receipt date for applications or proposals to NIH. See more under Open Access.