Building Proficiency in Data Management Skills through the Curriculum Fellow Program

Person teaching
Image credit: Harvard Medical School Curriculum Fellows Program
The RDM News Blog will occasionally spotlight data management advocates in our community: members working with data and supporting data management practices in various ways. This month we highlight Chris Magnano from the Center for Computational Biomedicine at Harvard Medical School. As a Curriculum Fellow, Chris creates and manages educational materials that help researchers build their data management skills and develop easy data management strategies.

Q: What is your role at Harvard University?

A: I am a Curriculum Fellow in the Center for Computational Biomedicine. The Harvard Medical School Curriculum Fellows Program (HMS CFP) trains early-career scientists to become science education leaders and practitioners.

Q: What is your research focus and what type(s) of research data do you work with?

A: I currently work as a Curriculum Fellow with the Center for Computational Biomedicine. While this means I can work with biological datasets, everything from imaging data to biological networks, I mainly manage educational resources. This includes the educational materials themselves and the data we collect around those materials to evaluate their effectiveness and the need for them at HMS.

Q: What are the major data management challenges (or successes) you see as a researcher or with those you support?

A: My role is to support researchers in learning data management practices. I think one major challenge is that, after an introductory or training workshop on a data management tool or skill, researchers can feel that they are not yet proficient enough to use it in real research. We need to find better ways to support researchers in moving from beginner to proficient in their data management skills.

Q: What are the costs and consequences of the gaps in data management you see? What are one or two things you could do to help mitigate them?

A: At many institutions, the way educational materials are stored and shared with other educators is completely informal and ad hoc, so many lessons, courses, and workshops end up getting lost. This results in a lot of time being spent reinventing the wheel, and we lose previous versions of materials that could help us understand how to teach more effectively. This is especially true for the types of workshops and trainings I help design. We can find examples of better practices, such as the Harvard Chan Bioinformatics Core, which retains its educational materials on GitHub, but in many environments these practices are not standard.

Q: What is your advice for someone just getting started with data management? Do you have a "data management mantra"?

A: We have ethical and legal obligations which we need to fulfill, but beyond meeting those I think it's important to acknowledge the human factor when considering data management. A sufficient data management plan that you know you will follow through on is better than a perfect plan you won't have time for. In the same vein, usability is an important factor to consider in data management. Even if we know something is important, if it's too annoying to do, there's a good chance we'll eventually stop doing it.

Q: How do you support and promote data management in your work?

A: When designing educational materials, I always try to incorporate proper data management practices. I think it's important to model these practices and create an environment where good data management is the expectation from the get-go. Taking the time to use version control or create reproducible environments, even for small pieces of code or example datasets, helps demonstrate their importance.

Contributed by Chris Magnano, Curriculum Fellow, Center for Computational Biomedicine

If you are interested in being featured in a future blog post, please respond to our easy-to-fill-out form.