RDM Weekly - Issue 020
A weekly roundup of Research Data Management resources.
Welcome to Issue 20 of the RDM Weekly Newsletter!
The content of this newsletter is divided into 3 categories:
✅ What’s New in RDM?
These are resources that have come out within the last year or so
✅ Oldies but Goodies
These are resources that came out over a year ago but remain excellent references to return to as needed
✅ Just for Fun
A data management meme or other funny data management content
What’s New in RDM?
Resources from the past year
1. Automate File Management in R With the {fs} Package
In this guest R for the Rest of Us blog post (and brief recording), Jadey Ryan talks us through how the {fs} package can help us organize directory structure chaos. Her guide will show you how to use {fs} to automate tedious tasks like creating, renaming, moving, and deleting files and folders with just a few lines of code. More recently, Jadey also gave a workshop on the {fs} package for R-Ladies St. Louis, and both the recording and the materials from that workshop are available.
2. Reproducible Computing in Bioinformatics: Lessons from My Latest Talk
In this article, Tommy Tang turns his recent talk “Good Enough Practices for Reproducible Computing” into a blog post, sharing why reproducible computing matters, why it’s tricky, and some simple tips to make it happen. The post is straightforward and explains any jargon it uses.
3. Data Stewardship Decoded: Mapping Its Diverse Manifestations and Emerging Relevance at a Time of AI
Data stewardship has become a critical component of modern data governance, especially with the growing use of artificial intelligence (AI). Despite its increasing importance, the concept of data stewardship remains ambiguous and varies in its application. This paper explores four distinct manifestations of data stewardship to clarify its emerging position in the data governance landscape. These manifestations include a) data stewardship as a set of competencies and skills, b) a function or role within organizations, c) an intermediary organization facilitating collaborations, and d) a set of guiding principles. The paper subsequently outlines the core competencies required for effective data stewardship, explains the distinction between data stewards and Chief Data Officers (CDOs), and details the intermediary role of stewards in bridging gaps between data holders and external stakeholders. It also explores key principles aligned with the FAIR framework (Findable, Accessible, Interoperable, Reusable) and introduces the emerging principle of AI readiness to ensure data meets the ethical and technical requirements of AI systems. The paper concludes by identifying challenges and opportunities for advancing data stewardship, including the need for standardized definitions, capacity building efforts, and the creation of a professional association for data stewardship.
4. In Defence of Scientific Use Files
The Trusted Research Environment (TRE) has been the great success story of data access this century. By providing highly secure yet flexible access, the TRE has enabled research use of the most sensitive data. However, it is the Scientific Use File (SUF) that remains the workhorse of academic research. SUFs are files made available under licence to authorised users, to hold and analyse on their own organisational machines. In recent years there have been concerns about the future of the SUF. A methodological review of confidentiality protection suggested that new technologies and methods can reverse engineer any de-identification techniques. This paper re-examines this balance between risk and utility, focusing particularly on unrecorded benefits. The authors note that the public is generally much more supportive of making data available and less risk-averse than data professionals. However, there is a significant gap in evidence: most of the information about public attitudes comes from TRE research, so public engagement on SUF research is still needed.
5. How to Start Your Own Code Club
This guide was born out of the SORTEE 2025 unconference session “How to Start Your Own Code Club”. During the event, participants came together to share their experiences, questions, challenges, and creative ideas about building communities of practice around coding. Many contributors were people already running a Code Club, others were planning to start one, and some simply joined out of curiosity to learn from others. The discussion materials were collected in a shared online document, later used as the foundation for this collaboratively written guide. What follows is a synthesis of collective contributions: participant statements, experiences, well-tested strategies, and answers to common challenges. The result is a guide created by the community, for the community. This guide aims to document that collective experience and present practical advice for anyone interested in cultivating a learning community around coding.
6. Bare Necessities of Data Management
There are so many data management practices that can help you better organize your project, yet a team’s ability to “do it all” is really limited by factors such as funding, timing, team size, and expertise. Therefore, it is important for teams to consider what practices are feasible as well as which ones will give them the largest return on investment. This blog post reviews a list of core practices that many teams can implement early on in a project that will lead to better data outcomes.
Oldies but Goodies
Older resources that are still helpful
1. Current State of Data Stewardship Tools in Life Science
In today’s data-centric landscape, effective data stewardship is critical for facilitating scientific research and innovation. This article provides an overview of essential tools and frameworks for modern data stewardship practices. Over 300 tools were analyzed in this study, assessing their utility, relevance to data stewardship, and applicability within the life sciences domain.
2. Guide to Social Science Data Preparation and Archiving
This 6th Edition guide from ICPSR is one of my favorite guides out there. It provides best practices throughout the data life cycle in a clear and concise manner. The guide begins by reviewing the importance of data sharing and introduces the data life cycle. It then reviews best practices for developing data management plans, data collection, data analysis, and archiving data.
3. Working with Dates and Times in R Using the lubridate Package
With daylight saving time ending in the U.S. last weekend, I thought it might be a good time to review working with dates and times. This 2017 article, from the University of Virginia Library, reviews how to use the {lubridate} package in R to format and work with dates and times.
4. Research Data Management Lifecycle Checklist
This checklist from Longwood Research Data Management serves as a reference to keep track of the elements that make up good research data management throughout the data lifecycle. Using the section prompts, you can identify gaps and communicate elements of data management to members of your research team.
Just for Fun
Thank you for checking out the RDM Weekly Newsletter! If you enjoy this content, please like, comment, or share this post! You can also support this work through Buy Me A Coffee.



