Welcome to Issue 8 of the RDM Weekly Newsletter!
The content of this newsletter is divided into 3 categories:
☑️ What’s New in RDM?
These are resources that have come out within the last year or so
☑️ Oldies but Goodies
These are resources that came out over a year ago but continue to be excellent ones to refer to as needed
☑️ Just for Fun
A data management meme or other funny data management content
What’s New in RDM?
Resources from the past year
1. Webinar - Navigating Open Research: A guide for Early Career Researchers
On Thursday, September 4th at 12pm Ireland/UK time, the UCC Library will be discussing the launch of Navigating Open Research, a new national guide designed to support Early Career Researchers in Ireland. Developed through a collaborative sprint process, this open-access guide offers practical, step-by-step support for embedding Open Research practices throughout the research lifecycle. In this webinar, you’ll learn about the co-creation process, explore the guide’s key features, and discover how it can be used across the research landscape to support engagement with open research.
2. Retrospective Clinical Data Harmonisation Reporting Using R and Quarto
There has been an increase in projects that pool data from multiple sources. This is because combining data is an economical way to increase the statistical power of an analysis of a rare outcome that could not be addressed using data from a single project. Prior to statistical or machine learning analysis, a data steward must be able to sort through these heterogeneous inputs and document this process in a coherent way for different stakeholders. Despite its importance in the big data environment, there are limited resources on how to document this process in a structured, efficient and robust way. This UseR 2025 presentation from Jeremy Selva provides an overview of how they created clinical data harmonisation reports using R packages and a Quarto book project. A recording of Jeremy’s similar talk at R Medicine can be found here.
3. Six Questions to Ask Before Jumping Into a Spreadsheet
Ask a bunch of scientists whether they use spreadsheets in their work and you’re bound to touch a nerve. Many have sworn off spreadsheets, others swear by them and some swear profusely when they’re forced to use them. Spreadsheets are broadly accessible, but can cause headaches for the unwary. This article provides six questions to ask yourself the next time you think about using a spreadsheet for research, to help you use it more effectively.
4. Digging Deeper into Common Data Elements at NIH by Using Generative AI
In this NIH blog post, the authors discuss the need to create minimal sets of consistent and computable Common Data Elements (CDEs) that support data integration across studies and repositories to reduce headaches associated with harmonizing data. They then detail how the Office of Data Science Strategy used GenAI to conduct a landscape analysis of NIH’s existing efforts to promote the adoption and usage of CDEs across the entire research ecosystem. Last, they summarize their findings.
5. Recommendations on Data Versioning
We often say that “A is a version of B” but do not explain what we mean by “version”. The Research Data Alliance Data Versioning Working Group collected over forty use cases of versioning practices for data and software and published a set of principles distilled from the group's analysis of them. This document aims to translate the Principles into actionable recommendations for data versioning.
6. Can I Have Your Data? Recommendations and Practical Tips for Sharing Neuroimaging Data Upon a Direct Personal Request
This paper acknowledges that while more repositories have been developed that allow researchers to securely share neuroimaging data, a significant portion of data sharing in the field still takes place through direct communication between researchers. The authors note the challenges faced when sharing data through personal requests, and aim to help researchers navigate these challenges by exploring core considerations throughout the data-sharing process. The article draws on practical insights from an ongoing case study led by one of the authors (MN), which involved extensive sharing of neuroimaging datasets from multiple research groups and institutions through direct personal requests. I think many of the insights from this paper are also applicable to researchers in other fields who may be sharing data through personal requests.
Oldies but Goodies
Older resources that are still helpful
1. Developing a Modern Data Workflow for Regularly Updated Data
Over the past decade, biology has undergone a data revolution in how researchers collect data and the amount of data being collected. An emerging challenge that has received limited attention in biology is managing, working with, and providing access to data under continual active collection. Regularly updated data present unique challenges in quality assurance and control, data publication, archiving, and reproducibility. This paper provides an overview of a workflow for a long-term ecological study that addresses many of the challenges associated with managing this type of data. The workflow leverages existing tools to 1) perform quality assurance and control; 2) import, restructure, version, and archive data; 3) rapidly publish new data in ways that ensure appropriate credit to all contributors; and 4) automate most steps in the data pipeline to reduce the time and effort required by researchers.
2. Making the Right Moves: A Practical Guide to Scientific Management for Postdocs and New Faculty
While this guide from the Howard Hughes Medical Institute and Burroughs Wellcome Fund is almost 20 years old at this point, much of the material is still relevant for academics today, particularly those in medical science fields. This book provides helpful guidance for early career academics, especially for those looking to start their own labs. The chapters on project management and data management provide helpful tips on how to create workflows that lead to better data.
3. Best Practice for R :: Cheat Sheet
This opinionated R cheat sheet, developed by Jacob Scott, is intended to be a quick primer for developers new to R who have an interest in doing things well. A similar version of this cheat sheet was originally created specifically for use in the UK Department for Education. This is a slightly more generalized version.
4. Easing Into Open Science: A Guide for Graduate Students and Their Advisors
This article provides a roadmap to help graduate students and their advisors engage in open science practices. The authors suggest eight open science practices that novice graduate students could begin adopting today. The topics covered include journal clubs, project workflow, preprints, reproducible code, data sharing, transparent writing, preregistration, and registered reports. To address concerns about not knowing how to engage in open science practices, the authors provide a difficulty rating for each behavior (easy, medium, difficult), present the behaviors in order of suggested adoption, and follow the format of what, why, how, and worries.
Just for Fun
Thank you for checking out the RDM Weekly Newsletter! If you enjoy this content, please like, comment, or share this post! You can also support this work through Buy Me A Coffee.