Welcome to Issue 12 of the RDM Weekly Newsletter!
If you are new to RDM Weekly, the content of this newsletter is divided into 3 categories:
☑️ What’s New in RDM?
These are resources that have come out within the last year or so
☑️ Oldies but Goodies
These are resources that came out over a year ago but continue to be excellent ones to refer to as needed
☑️ Just for Fun
A data management meme or other funny data management content
What’s New in RDM?
Resources from the past year
1. A Guide to Developing Harmonized Research Workflows in a Team Science Context
Large, interdisciplinary team science initiatives are increasingly leveraged to uncover novel insights into complex scientific problems. Such projects typically aim to produce large, harmonized datasets that can be analyzed to yield breakthrough discoveries using cutting-edge scientific methods. Successfully harmonizing and integrating datasets generated by different technologies and research groups is a considerable task, which requires an extensive supportive framework that is built by all members involved. Such a data harmonization framework includes a shared language to communicate across teams and disciplines, harmonized methods and protocols, (meta)data standards and common data elements, and the appropriate infrastructure to support the framework’s development and integration. In this perspective, the authors build on their collective experiences as part of the REstoring JOINt health and function to reduce pain (RE-JOIN) Consortium to provide guidance for developing research-centered data collection and analysis pipelines that enable downstream integrated analyses within and across diverse teams.
2. SHU Open Research Podcast
This podcast, from the Library Research Support team at Sheffield Hallam University, provides in-depth discussions around open research. The podcast currently has almost 20 episodes. In the most recent episode, from September 1st, guests discuss the theme of last week’s OpenFest 2025: Research and Equity, Equality, Diversity, and Inclusion. They discuss the importance of more robust and reliable data, REF and institutional strategies to strengthen our trust in science driven by varied perspectives, experiences, and skills.
3. Navigating Open Research - A Guide for Early Career Researchers
Access to comprehensive information about Open Research (OR) practices is often fragmented, inconsistent or misaligned across disciplines or national contexts. In 2024, the CONUL Research Group launched a project to create a resource that is reflective of the Irish research landscape. This guide is designed to help you at each stage of your research journey. From preparing your research project and discovering relevant resources to research data management and reproducibility, writing and publishing, sharing and publishing data, licensing your work and communicating your research, every chapter provides you with practical tips that can be implemented immediately.
4. There Must Be an Error Here! Experimental Evidence on Coding Errors’ Biases
Quantitative research relies heavily on coding, and coding errors are relatively common even in published research. In this paper, the authors examine whether individuals are more or less likely to check their code depending on the results they obtain. The authors test this hypothesis in a randomized experiment embedded in the recruitment process for research positions at a large international economic organization. In a coding task designed to assess candidates’ programming abilities, they randomize whether participants obtain an expected or unexpected result if they commit a simple coding error. The authors find that individuals are 20% more likely to detect coding errors when they lead to unexpected results. This asymmetry in error detection depending on the results they generate suggests that coding errors may lead to biased findings in scientific research. More discussion on this issue can be found in this article.
5. Shaping Responsible AI: Four New Recommendations from the AIDV Working Group
Recognising the urgent need for community-driven guidance, the Research Data Alliance AIDV Working Group has developed four newly endorsed recommendations. Each targets a key challenge in this evolving landscape, from informed consent to legal structures for federated analysis. Collectively, these recommendations aim to support researchers, policymakers, ethics committees, data stewards, and technologists in implementing responsible, transparent, and inclusive AI practices. This blog post presents the four AIDV WG Recommendations, the challenges they are designed to tackle, and the impact they will have on the research data community.
6. The Beginner’s Guide to Web Scraping in Python: From Zero to Web Data Hero
In this blog post, Steven Paul Sanderson provides background on what web scraping is and why you might need it. He then lays out the web scraping ecosystem for Python and walks through how to set up your environment and write your first script. Finally, he covers how to tackle common challenges and discusses best practices and ethical considerations.
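For readers curious what a "first script" might look like, here is a minimal sketch. It is not taken from the post: it uses only Python's standard-library `html.parser` and parses an inline HTML string rather than a live site, so no network access or third-party package is assumed. The names `LinkCollector`, `extract_links`, and `SAMPLE_HTML` are hypothetical and chosen for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (link text, href) pairs from anchor tags that have an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> tag currently open, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an anchor with an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def extract_links(html):
    """Return a list of (text, href) tuples found in the HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


# Hypothetical sample page, standing in for a downloaded document.
SAMPLE_HTML = """
<html><body>
  <h1>Newsletter archive</h1>
  <ul>
    <li><a href="/issues/11">Issue 11</a></li>
    <li><a href="/issues/12">Issue 12</a></li>
  </ul>
</body></html>
"""

if __name__ == "__main__":
    for text, href in extract_links(SAMPLE_HTML):
        print(f"{text}: {href}")
```

In practice the HTML would come from an HTTP request, and the site's terms of use and robots.txt should be checked before fetching anything.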
Oldies but Goodies
Older resources that are still helpful
1. Ten Simple Rules for Digital Data Storage
While much has been written about both the virtues of data sharing and the best practices to do so, data storage has received comparatively less attention. Proper storage is a prerequisite to sharing, and indeed inadequate storage contributes to the phenomenon of data decay or to “data entropy,” in which data, whether publicly shared or not, becomes less accessible through time. Best practices for data storage often begin and end with this statement: “Deposit your data in a community standard repository.” However, data storage policies are highly variable between repositories. A data management plan utilizing best practices across all stages of the data life cycle will facilitate transition from local storage to repository. Similarly, having such a plan can facilitate transition from repository to repository if funding runs out or requirements change. This article describes ten simple rules for digital data storage that grew out of a long discussion among instructors for the Software and Data Carpentry initiatives.
2. PSPP for Beginners
This website is a tutorial to help people get started with using PSPP for data management and statistical analyses. PSPP is a free replacement for the proprietary program SPSS, and appears very similar to it with a few exceptions. The tutorial provides guidance beginning with downloading the software all the way to wrangling data and inferential statistics.
3. Qualitative Data Repository
The Qualitative Data Repository (QDR) curates, stores, preserves, publishes, and enables the download of digital data generated through qualitative and multi-method research in the social sciences. The repository develops and disseminates guidance for managing, sharing, citing, and reusing qualitative data, and contributes to the generation of common standards for doing so. QDR’s overarching goals are to make sharing qualitative data customary in the social sciences, to broaden access to social science data, and to strengthen qualitative and multi-method research. QDR is hosted by the Center for Qualitative and Multi-Method Inquiry, a unit of the Maxwell School of Citizenship and Public Affairs at Syracuse University.
4. Data Validation in R
Although data cleaning is a frequent topic of conversation (and commiseration) in the world of data science, data validation is discussed less often. In this talk from the New York Open Statistical Programming Meetup, Caterina Constantinescu reviews some guiding principles, best practices, and overall criteria to assess and ensure data validity. The talk also covers several R packages aimed at this precise topic, including {validate}, {assertr}, and {pointblank}, as well as other related packages and functions, with examples provided. Slides can be found here.
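To make the idea of rule-based validation concrete, here is a small sketch in plain Python. It does not reproduce the API of {validate}, {assertr}, or {pointblank}; it only illustrates the general pattern of declaring named rules and reporting which rows break them. The function `validate_rows`, the rule names, and the sample records are all hypothetical.

```python
def validate_rows(rows, rules):
    """Apply each named rule to each row; return (row_index, rule_name) failures."""
    failures = []
    for index, row in enumerate(rows):
        for name, check in rules.items():
            if not check(row):
                failures.append((index, name))
    return failures


# Hypothetical rules: each is a function that returns True when the row is valid.
RULES = {
    "id_present": lambda row: bool(row.get("id")),
    "age_in_range": lambda row: row.get("age") is not None and 0 <= row["age"] <= 120,
    "score_is_number": lambda row: isinstance(row.get("score"), (int, float)),
}

# Hypothetical sample data with a few deliberate problems.
ROWS = [
    {"id": "p01", "age": 34, "score": 12.5},
    {"id": "", "age": 29, "score": 9},          # missing id
    {"id": "p03", "age": 250, "score": "n/a"},  # impossible age, non-numeric score
]

if __name__ == "__main__":
    for index, rule in validate_rows(ROWS, RULES):
        print(f"row {index} failed rule '{rule}'")
```

Keeping the rules in a named dictionary means the checks double as documentation of what "valid" means for the dataset.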
Just for Fun
Thank you for checking out the RDM Weekly Newsletter! If you enjoy this content, please like, comment, or share this post! You can also support this work through Buy Me A Coffee.