RDM Weekly - Issue 025
A weekly roundup of Research Data Management resources.
Welcome to Issue 25 of the RDM Weekly Newsletter!
The content of this newsletter is divided into 3 categories:
✅ What’s New in RDM?
These are resources that have come out within the last year or so
✅ Oldies but Goodies
These are resources that came out over a year ago but continue to be excellent ones to refer to as needed
✅ Just for Fun
A data management meme or other funny data management content
What’s New in RDM?
Resources from the past year
1. A Brief Glossary of Terms about Repeatability: Replicability, Robustness, and Reproducibility
This short paper, from authors at the Center for Open Science and the University of Virginia, provides a brief glossary drawing on emerging norms for how reproducibility-related terms are defined. The authors suggest improved precision and a new term to improve clarity and avoid dual use. Specifically, they affirm the emerging standard definitions for reproducibility, robustness, and replicability, introduce the term repeatability, and clarify the use of credibility and trustworthiness in this context.
2. Make Your Open Data More Useable
Making open data more useful for yourself and others may be easier than you think. This short blog post provides some quick wins you can employ with every open dataset you publish.
3. A Modern Guide to SQL JOINs
In this in-progress SQL joins tutorial, the author structures the material with the goal of clarifying your mental model. The article includes several helpful examples and links to a SQL database playground for practice.
4. Data Management in Large-Scale Studies: A Handbook for Researchers
Large-scale educational research plays a critical role in understanding the multifaceted and systemic nature of education. It is instrumental in monitoring trends, identifying diverse patterns, and informing policy and practice through evidence-based insights. Such studies require rigorous data collection and management systems to uphold data quality, ensure traceability, and maintain the integrity of both quantitative and qualitative datasets. This handbook, from authors at Aga Khan University, offers practical, step-by-step guidance for managing data across all stages of the research process, from planning to collection and analysis. These steps have been developed and field-tested while carrying out six different large-scale studies.
5. Guidance for Research Tools
The AskResearch team at McMaster University has launched a new series of digital research guides designed to help the University community navigate essential tools, platforms and practices that support today’s digital research environment. Developed in response to recurring questions and complex issues faced by researchers, the guides focus on key digital research topics, including privacy and security in online surveys, limiting bot responses, selecting electronic lab notebooks, and using transcription software effectively. While these guides were developed for the McMaster community, they have lots of content that would be beneficial to researchers anywhere.
6. PIDs 101 - Webinar
This webinar from SPARC provides a broad introduction to persistent identifiers (PIDs), including community definitions, commonly used persistent identifiers, and examples of how PIDs are used to develop more open systems of research. The speaker discusses the benefits that can be realized by using persistent identifiers throughout the research lifecycle and how PIDs can contribute to developing new pathways for scientific communication.
7. SPORR CTS Pilot Grants Program at Stanford
The Stanford Clinical & Translational Science (CTS) Pilot Grants Program supports projects that advance the science of translation, emphasizing collaborative, transdisciplinary work and generalizable translational outcomes. Awards are up to $20,000 for 5 months. The 2026 call prioritizes Research Rigor and Reproducibility projects. Eligible activities include establishing or enhancing resources such as lab manuals and data pipelines, as well as projects that create or improve rigor and reproducibility services or education. Although this funding is only available for those affiliated with Stanford, it is an exciting model I hope to see at other institutions. Thanks to John Borghi for sharing this.
Oldies but Goodies
Older resources that are still helpful
1. Lab Manuals for Efficient and High Quality Science in a Happy and Safe Work Environment
The culture and norms of research laboratories (including the norms around data management work) can greatly influence the quality of scientific outputs produced by the lab and the well-being of its members. In most labs, the culture that governs how individual researchers behave in their work is often implicit and situational. In contrast, some labs develop their own lab manual, a written document that outlines their standard functioning. Such a lab manual can ensure that lab members have a shared understanding of who they are and how they do things. However, developing a lab manual undoubtedly requires extra effort. To decrease researchers’ burden of creating their own lab manual, the authors of this preprint provide a lab manual template that outlines the different topics that could be included in a manual. Moreover, they introduce an easy-to-use web app (blueprint) that further simplifies the process of creating a lab manual using their template. This paper aims to serve as a guide for those who would like to partake in this endeavor.
2. Cleaning Medical Data with R - Recording
In this 2023 R Medicine workshop, participants learned how to import messy data from an Excel spreadsheet and developed the R skills to turn this mess into tidy data ready for analysis. Participants learned (1) how to import Excel files with (common, messy) data problems; (2) how to address and clean common messy data problems in each variable; and (3) how to address and clean data with more complex meta-problems, like pivoting to long format for data analysis, dealing with multi-column headers, color-coded data (gah!), and un-pivoting pivot tables into tidy data. Slides and other materials from the workshop can be found at this site.
3. Data Management Handbook for Human Subjects Research
This 2023 guide from Alena Filip at San Jose State University is one of the best resources I’ve come across for understanding things like confidentiality, data privacy, data security, data sensitivity, data ownership and more. This document is an educational resource intended to help researchers who want to conduct research with human participants construct an effective data management plan as part of their research proposals. It contains a helpful glossary, useful diagrams and tables, and even data management templates.
4. Registry of Research Data Repositories
This site is a global registry of research data repositories from all academic disciplines. It allows anyone to search over 3000 repositories from around the world to identify a suitable repository for their data that meets their requirements.
Just for Fun
Last but not least, a huge thank you to Christian Lindke for nominating me/RDM Weekly for a Sunshine Blogger Award! If you don’t know what that award is (which I did not until receiving this nomination), it is peer recognition for bloggers who have positively impacted you. You can read more about this award by checking out Christian’s write-up in a recent issue of his Geekerati Newsletter which includes discussion of films, TV shows, role playing games, and Science Fiction and Fantasy literature.
And as always, if you enjoy RDM Weekly, please like, comment, or share this post! You can also support this work through Buy Me A Coffee.



