RDM Weekly - Issue 049
A weekly roundup of Research Data Management resources.
Welcome to Issue 49 of the RDM Weekly Newsletter!
The content of this newsletter is divided into 4 categories:
✅ What’s New in RDM?
These are resources that have come out within the last year or so
✅ Oldies but Goodies
These are resources that came out over a year ago but continue to be excellent ones to refer to as needed
✅ Research Data Management Job Opportunities
Research data management related job opportunities that I have come across in the past week
✅ Just for Fun
A data management meme or other funny data management content
What’s New in RDM?
Resources from the past year
1. Data Reuse is the Sincerest Form of Flattery
This article argues that existing post-publication assessment methods like commentary articles and article to article citations capture "excitingness" rather than the robustness of methods or validity of results. It proposes that "article to dataset to article" citations are a far more "trust dense" signal, since reusing someone's data requires deeply validating their work and staking your own reputation on it. Finally, the article notes that while detecting data reuse has historically been nearly impossible, new reasoning LLMs can now reconstruct these data citations, making this a viable quality metric at scale.
2. Sonraí, DANS & CaSDaR Masterclass - Advocacy Skills for Data Stewards
Join Sonraí, DANS, and CaSDaR on Monday, June 22nd for a unique masterclass focused on effectively influencing stakeholders and advocating for data stewardship. This practical 2-hour webinar will introduce participants to key techniques for communicating ideas more persuasively with senior stakeholders and non-expert audiences. Participants will learn how to structure a clear business case, use plain English to improve understanding, and adapt their communication style to build stakeholder understanding and buy-in. The session will provide practical tips that can be applied immediately when presenting recommendations, proposals and initiatives. Participants will explore these techniques in the context of communicating specialist or technical information to wider stakeholder groups.
3. Improving Acknowledgments Sections to Better Credit Research Contributors
Formal recognition of research contributions is critical for career advancement and the allocation of research funding. However, some contributions are mentioned only in the acknowledgments section, which is not indexed by scholarly databases, resulting in little recognition for those involved. The authors contextualize this shortfall in terms of contributorship, the movement to recognize specific research contributions rather than rely solely on authorship. Broadening the range of recognized individuals is currently advanced largely through reducing authorship restrictions and unbundling manuscripts into smaller elements, such as datasets and protocols, that receive their own attributions. Here they focus on a complementary path, enhancing the contents and metadata of acknowledgments sections. Capitalizing on existing infrastructure and standards, the authors propose: 1) when acknowledging individuals, authors include ORCIDs (subject to the acknowledgees’ approval) and provide CRediT information where applicable; 2) publishers solicit identities of acknowledgees in a similar way to how they do so for authors in their submission portals; and 3) publishers include metadata of acknowledgees in JATS-XML files. Implementing these steps should encourage scholarly databases to index non-author contributors. The ensuing increase in visibility for research contributors, such as technicians and library professionals, should result in greater recognition of non-author roles.
4. Hugging Face Dataset Lineage Explorer
As Daniel van Strien discusses in his LinkedIn post, datasets on Hugging Face (HF) are often cleaned versions, translations, filtered subsets, or format conversions of something that already existed. You wouldn't see this from most of the metadata. The HF Hub has a source_datasets field in the dataset card YAML, but almost nobody fills it in. To improve data lineage, Daniel built an explorer that infers the missing lineage by comparing datasets and allows you to browse the inferred lineage for any dataset.
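For readers who have not seen the source_datasets field, here is a minimal sketch of what filling it in and reading it back might look like. The card contents and the dataset name `example-org/original-corpus` are hypothetical, and the small hand-rolled reader below is an illustration that handles only the simple block-list form. It is not part of the explorer described above, and a real YAML parser would be more robust.

```python
def read_source_datasets(card_text: str) -> list[str]:
    """Return entries listed under `source_datasets` in a card's YAML front matter.

    Assumption: the field is written as a simple block list ("- item" lines).
    Inline lists such as `source_datasets: [a, b]` are not handled here.
    """
    lines = card_text.splitlines()
    # Front matter must open with a '---' line; otherwise there is no metadata.
    if not lines or lines[0].strip() != "---":
        return []
    sources = []
    in_field = False
    for line in lines[1:]:
        if line.strip() == "---":  # closing delimiter of the front matter
            break
        if line.startswith("source_datasets:"):
            in_field = True
            continue
        if in_field:
            stripped = line.strip()
            if stripped.startswith("- "):
                sources.append(stripped[2:].strip())
            elif stripped:  # a new key ends the list
                in_field = False
    return sources


# Hypothetical dataset card that declares where the data came from.
EXAMPLE_CARD = """---
license: mit
source_datasets:
- example-org/original-corpus
language:
- en
---
# Hypothetical cleaned subset
"""

# Hypothetical card with the field left out, the common case noted above.
MISSING_CARD = """---
license: mit
---
# Card without lineage
"""

print(read_source_datasets(EXAMPLE_CARD))
print(read_source_datasets(MISSING_CARD))
```

When the field is absent the function returns an empty list, which is the gap the lineage explorer tries to fill by inference.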
5. R for the Rest of Us Podcast Episode 30: Sara Altman and Simon Couch
Most coding assistants treat data quality as a speed bump. You hand them a messy CSV, they squint at it, and a moment later they've shipped some plausible-looking analysis built on a quietly broken foundation. Posit's new AI agent takes a different stance: it pauses, asks you the awkward questions, and treats the messy bits as the point of the work, not an obstacle to it. In this recording, David Keyes sits down with Sara Altman and Simon Couch to discuss Posit Assistant.
6. Research Security and Open Science: The Odd Couple
The institutional and political push towards both open science and research security appears like an unhappy marriage, reliving the essential tension between secrecy and openness in scientific research. Yet, the authors of this article hold that it is possible – and much needed! – to navigate this fraught territory in order to integrate research security within open science. By enriching our understanding of openness, the authors hope to ease the tension between research security and open science and eventually push forward the values that originally guided the open science movement.
7. Better Code, Better Science
Making science more reproducible and transparent is key to improving public trust in science. Because science is increasingly a computational enterprise, improving the quality of scientific research code is essential to making science more reproducible. Increasingly, this code is being written with the help of AI assistants, which can increase productivity but introduce the potential for errors. This open access book will provide a thorough guide to using AI-assisted coding techniques to generate scientific code that is readable and robust and that can provide reproducible answers to scientific questions.
Oldies but Goodies
Older resources that are still helpful
1. Understanding Metadata: What is Metadata, and What is it For?: A Primer
Metadata, the information we create, store, and share to describe things, allows us to interact with these things to obtain the knowledge we need. This primer by Jenn Riley of McGill University Library offers a comprehensive overview of metadata, covering topics such as metadata types, standardization, and use in the cultural heritage sector and in the broader world. The Primer is accompanied by plentiful examples of metadata at work.
2. The Principles of Open Science Monitoring
To fully take advantage of the adoption of the 2021 UNESCO Recommendation on Open Science, transparent and representative monitoring must be put in place to drive and support the intended change. It is also vital to identify effective actions and priority gaps. To compensate for the lack of global guidelines on open science monitoring, the French Ministry of Higher Education and Research initially brought together a group of French experts (Université de Lorraine, Inria) to work on a proposal for common monitoring principles. This text served as a basis for a conference which gathered international experts at the Paris UNESCO headquarters in December 2023, leading to the Open Science Monitoring Initiative (OSMI). OSMI and UNESCO then conducted an international consultation to gather opinions from around the world to ensure that the Principles meet a variety of needs, approaches and contexts worldwide. These Principles focus on three key pillars: (1) relevance and significance, (2) transparency and reproducibility, and (3) self-assessment and responsible use.
3. Are Leadership and Management Essential for Good Research? An Interview Study of Genetic Researchers
Principal investigators are responsible for a myriad of leadership and management activities in their work. The practices they employ to navigate these responsibilities ultimately influence the quality and integrity of research. However, leadership and management roles in research have received scant empirical examination. Semi-structured interviews with 32 National Institutes of Health (NIH)-funded genetic researchers revealed that they considered leadership and management essential for effective research, but their scientific training inadequately prepared them. The authors also report management practices that the researchers described employing in their labs, as well as their perceptions of a proposed intervention to enhance laboratory leadership. These findings suggest best practices for the research community, future directions for scientific training, and implications for research on leadership and management in science.
4. SQL Style Guide
These guidelines are designed to be compatible with Joe Celko’s SQL Programming Style book to make adoption for teams who have already read that book easier. This guide is a little more opinionated in some areas and in others a little more relaxed. It is certainly more succinct where Celko’s book contains anecdotes and reasoning behind each rule as thoughtful prose.
Research Data Management Job Opportunities
These are data management job opportunities that I have seen posted in the last week. I have no affiliation with these organizations.
University of Illinois Chicago - Institutional Research Analyst
Wageningen University & Research - Research Engineer Data Management
Just for Fun

Sponsor
This newsletter is supported in part by the Eunice Kennedy Shriver National Institute Of Child Health & Human Development of the National Institutes of Health under Award Number R25HD114368. The content is solely the responsibility of the author and does not necessarily represent the official views of the National Institutes of Health. Read more about the NIH Data Management for Data Sharing Workshop Project.
🎉 Next week is the one-year anniversary of this newsletter and I would love to reach 1,000 subscribers to celebrate this milestone! If you enjoy this newsletter, please share it with colleagues, friends, and anyone else who you think may benefit from reading it!
And as always, you can also support this work through Buy Me A Coffee.


