RDM Weekly - Issue 047
A weekly roundup of Research Data Management resources.
Welcome to Issue 47 of the RDM Weekly Newsletter!
The content of this newsletter is divided into 4 categories:
✅ What’s New in RDM?
These are resources that have come out within the last year or so
✅ Oldies but Goodies
These are resources that came out over a year ago but continue to be excellent ones to refer to as needed
✅ Research Data Management Job Opportunities
Research data management related job opportunities that I have come across in the past week
✅ Just for Fun
A data management meme or other funny data management content
What’s New in RDM?
Resources from the past year
1. How to Get Started Using AI with R
There are so many ways to use AI with R now that it’s genuinely hard to know where to start. Should you just use ChatGPT in a browser tab? Install something into your code editor? And if so, which one? In this video, David Keyes walks through the main options, with the pros and cons of each.
2. 2026 UC Open Research Day Materials
This repository contains materials from the University of Cincinnati’s Open Research Day. Presentation slide decks include topics such as “Beyond Data Available Upon Request: Closing the Gap in Open Data”, “Open by Design: Scaling and Sustaining Open at CMU”, “Science Communication: Sharing Your Research with the Public”, and “Hitting the Open Road: A Roadmap for Community Centered Dissemination Planning”, as well as slides for several lightning talks.
3. Lab Handbook Template
Research labs routinely face challenges related to onboarding, role clarity, expectations, and continuity. This template, from the University of Illinois Urbana-Champaign, provides a starting place for consistently communicating expectations to all lab members. Drawing on an extensive review of existing lab handbooks and related literature, the template emphasizes policies and expectations rather than task-specific procedures and adopts a modular, “à la carte” structure that can be adapted to diverse lab contexts.
4. Community-Driven Guidance: Help Shape Trustworthy Data Repository Characteristics for Canada
The Digital Research Alliance of Canada’s Data Repositories Expert Group (DREG) has struck a working group to develop guidance to help researchers select data repositories in fulfillment of the Tri-Agency Research Data Management Policy data deposit requirement, anticipated to take effect in early 2027. The working group is conducting a literature review to identify characteristics across technical, operational, and organizational dimensions — including persistent identifiers, preservation, metadata, cybersecurity, data handling ethics, and Indigenous community engagement — with particular attention to Canadian priorities such as data sovereignty, bilingualism, and Tri-Agency alignment. Findings will inform a guidance document and complementary decision-making resources. On Wednesday, June 24 from 1:00–2:30 PM ET, they are hosting a 90-minute community consultation webinar to share preliminary results from the literature review and gather attendee input on the characteristics identified.
5. Test-Driven Data Analysis
This book is for anyone working with data who is interested in producing more robust and reliable results. The goal of this book is to reduce the frequency and severity of errors in all forms of data processing and analysis, from the simplest query to the most complex machine-learning data pipeline. Test-driven data analysis (TDDA) is both a methodology for improving data quality and data pipelines, and a set of software tools—the Python tdda library—for helping to implement that methodology. The library is written in Python and presents a Python API, so is most relevant to people using Python, but it also provides a command-line interface. The author will be making the entire book available for free, releasing a chapter a week over 16 weeks. Chapters 1 and 2 are available now.
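To give a feel for the kind of check this methodology encourages, here is a minimal sketch in plain Python: infer simple constraints from a known-good dataset, then verify a new batch against them. This is not the tdda library's API. The function names (discover_constraints, verify) and the sample records are hypothetical and exist only for illustration.

```python
# Hypothetical helpers illustrating constraint discovery and verification.
# This is NOT the tdda library's API; it only sketches the general idea.

def discover_constraints(rows):
    """Infer simple per-field constraints from known-good rows."""
    constraints = {}
    fields = {key for row in rows for key in row}
    for field in sorted(fields):
        values = [row.get(field) for row in rows]
        present = [v for v in values if v is not None]
        types = {type(v).__name__ for v in present}
        rule = {
            # Record a type only when every non-null value shares it.
            "type": types.pop() if len(types) == 1 else None,
            "allow_null": len(present) < len(values),
        }
        numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in present
        )
        if present and numeric:
            rule["min"] = min(present)
            rule["max"] = max(present)
        constraints[field] = rule
    return constraints


def verify(rows, constraints):
    """Return a list of failure messages; an empty list means all rows pass."""
    failures = []
    for i, row in enumerate(rows):
        for field, rule in constraints.items():
            value = row.get(field)
            if value is None:
                if not rule["allow_null"]:
                    failures.append(f"row {i}: {field} is missing")
                continue
            actual = type(value).__name__
            if rule["type"] and actual != rule["type"]:
                failures.append(
                    f"row {i}: {field} has type {actual}, expected {rule['type']}"
                )
                continue
            if "min" in rule and not (rule["min"] <= value <= rule["max"]):
                failures.append(
                    f"row {i}: {field}={value} outside [{rule['min']}, {rule['max']}]"
                )
    return failures


# Made-up sample data: a trusted reference set and a new batch to check.
reference = [
    {"age": 34, "site": "A"},
    {"age": 51, "site": "B"},
]
new_batch = [
    {"age": 40, "site": "A"},
    {"age": 240, "site": None},
]

constraints = discover_constraints(reference)
problems = verify(new_batch, constraints)
for problem in problems:
    print(problem)
```

Constraints inferred this way are only as good as the reference data, so in practice they would likely be reviewed and loosened or tightened by hand before being used as tests.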
6. How to Cite Data Resources in Wikipedia
A data citation is a short, formatted description that provides the information necessary to retrieve a published dataset. Consistent and reliable citations are an important part of making research open and reusable. In this blog post, the DRI Labs team provides guidance on how metadata describing Repository datasets can be used to generate a well-formatted Wikipedia citation.
7. Skills for the Curation of Sensitive Data
This report, from Health Data Research UK, provides an evidence-based snapshot of current skills gaps, training provision, and barriers in sensitive data curation. It draws on surveys, focus groups, and interviews with curators, TRE/SDE/Safe Haven operators, governance specialists, and training providers across multiple sectors and data types. The accompanying recommendations for key stakeholders set out a roadmap to improve skills and strengthen recognition in sensitive data curation.
Oldies but Goodies
Older resources that are still helpful
1. On the Reuse of Scientific Data
While science policy promotes data sharing and open data, these are not ends in themselves. Arguments for data sharing are to reproduce research, to make public assets available to the public, to leverage investments in research, and to advance research and innovation. To achieve these expected benefits of data sharing, data must actually be reused by others. Data sharing practices, especially motivations and incentives, have received far more study than has data reuse, perhaps because of the array of contested concepts on which reuse rests and the disparate contexts in which it occurs. In this essay, the authors explicate concepts of data, sharing, and open data as a means to examine data reuse. They explore distinctions between use and reuse of data. Lastly, they propose six research questions on data reuse worthy of pursuit by the community.
2. Open with Care: Indigenous Researchers and Communities are Reshaping How Western Science Thinks About Data Ownership
This 2024 Science article follows UC Berkeley Ph.D. student Leslie “Leke” Hutchins, a Native Hawaiian researcher who redacted species names and location data from his arthropod study after determining the information was culturally sensitive to the Indigenous farmers on whose land it was collected. The article explores the broader tension between academia's push for open data and the growing Indigenous data sovereignty movement, guided by the CARE principles, which hold that Indigenous communities should have authority over how data about their lands, cultures, and bodies are collected and used. Institutions like NIH, NSF, and major publishers are beginning to adapt their policies in response, though Indigenous principles remain largely optional while open data mandates are increasingly required.
3. Creating Unique Participant Study Identifiers
Assigning unique participant identifiers helps protect participant privacy and confidentiality. Research teams use a variety of approaches to generate these identifiers, each with its own tradeoffs. This post reviews common methods and concludes with best practice recommendations.
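As a rough illustration of one such approach, the sketch below generates random, non-sequential identifiers that contain no personal information. It is not taken from the post and is not necessarily the method it recommends. The function name, the "P" prefix, and the five-digit length are assumptions chosen for the example.

```python
import secrets


def generate_participant_ids(count, prefix="P", digits=5, existing=()):
    """Return `count` unique random IDs (e.g. 'P48213') not found in `existing`.

    Hypothetical helper: the prefix and digit count are illustrative defaults.
    The numeric part never starts with 0, so spreadsheet tools that strip
    leading zeros cannot alter it.
    """
    low, high = 10 ** (digits - 1), 10 ** digits
    used = set(existing)
    # Conservative capacity check so the loop below always terminates.
    if count > (high - low) - len(used):
        raise ValueError("not enough unused IDs for the requested count")
    new_ids = []
    while len(new_ids) < count:
        candidate = f"{prefix}{low + secrets.randbelow(high - low)}"
        if candidate not in used:
            used.add(candidate)
            new_ids.append(candidate)
    return new_ids


# 'P10482' stands in for an ID that was already assigned earlier in the study.
ids = generate_participant_ids(3, existing=["P10482"])
print(ids)
```

Random IDs like these only protect privacy if the file linking each ID to a participant's name is stored separately from the study data, with restricted access.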
4. Cornell Data Services Glossary
This glossary provides definitions for common research data management terms. This is not meant to be a comprehensive list.
Research Data Management Job Opportunities
These are data management job opportunities that I have seen posted in the last week. I have no affiliation with these organizations.
Just for Fun
Sponsor
This newsletter is supported in part by the Eunice Kennedy Shriver National Institute of Child Health & Human Development of the National Institutes of Health under Award Number R25HD114368. The content is solely the responsibility of the author and does not necessarily represent the official views of the National Institutes of Health. Read more about the NIH Data Management for Data Sharing Workshop Project.
Thank you for reading! If you enjoy this content, please like, comment, or share this post! You can also support this work through Buy Me A Coffee.



