There is a growing perception that science can progress more quickly, more innovatively, and more rigorously when researchers share data with one another. Amid an expanding array of organizations, initiatives, and policies working toward this vision, there is a pressing need to decide strategically on the best ways to move forward.

Central to this decision is the issue of scale. Is data sharing best assessed and supported on an international or national scale? By discipline? On a university-by-university basis? Or using another unit of analysis altogether? How do we design support for data sharing that aligns as closely as possible with the practices and interests of scholars, and so maximizes buy-in?

In a new issue brief, we build on our ongoing research into scholarly practices to propose a new mechanism for conceptualizing and supporting STEM research data sharing. We argue that successful data sharing happens within data communities: formal or informal groups of scholars who share a certain type of data with each other, regardless of disciplinary boundaries. Drawing on Ithaka S+R findings and the scholarly literature, we identify what constitutes a data community and outline its most important features by studying three success stories and investigating the circumstances under which intensive data sharing is already happening.

Those who want to support data sharing in the sciences need to look for opportunities to grow data communities around scholars’ existing practices and interests. They can do this by identifying and supporting emergent data communities – groups of scholars who are interested in sharing and using a particular type of data, but don’t yet have the infrastructure they need to do so. We offer recommendations for how this ground-up cultivation should proceed, focusing on actionable steps for funders, professional societies, publishers, information technologists, digital curation experts, and academic libraries.