Preprints have been getting a lot of attention recently. Since the pandemic began, dozens of articles have appeared in the scientific and popular press about both the role of preprints in accelerating scientific communication and the associated concerns, including in venues such as the New York Times, Bloomberg, the Economist, and Mother Jones.

Ten years ago, when I became the program director of arXiv, which had already been in existence for almost 20 years, my main mission was to create a sustainable business model for the service. It was the early years of the Great Recession, a period of economic decline that had an immediate and lasting impact on academic institutions. With declining budgets, academic libraries were seeking new ways to make scholarship accessible to their users. At the time, arXiv was seen as the poster child for the open access movement, called out as an example of where scholarly communication should be heading.

Fast forward 10 years, and we are once again on the threshold of an economic recession. Numerous open access models have emerged since the last economic downturn, but preprints continue to play an important role. Today, almost 70 platforms are branding themselves as preprint services. The brief I’m publishing today provides an overview of the current preprint landscape, describing the rapid changes in how preprints are perceived and used, and the challenges they face.

Ten years ago, the inspiration for arXiv’s community-based business model came from Raym Crow’s models for supporting OA organizations and the case studies that Ithaka S+R conducted to analyze the steps taken by nascent OA and OS initiatives to maintain successful and durable operations. Cornell believed that, as a public good, arXiv should be supported by the institutions that use it the most. In 2010, arXiv already held 630,000 e-prints used by hundreds of thousands of researchers from all over the world, and it ran on an operating budget of $400,000 per year.

When I stepped down from my program director role in August 2019, arXiv included about 1.6 million papers and served millions of users on a budget of $2 million. Although my initial efforts focused on creating a transparent and durable business model, we started feeling the pressure to shift our efforts to renewing the aging technical architecture and scaling the moderation system, both of which had been skillfully designed by arXiv’s founder, Paul Ginsparg. For me, the key takeaway is that preprint services are more than monolithic information systems: they are publishing enterprises with complicated curatorial, governance, technology, interoperability, and resource requirements that necessitate robust technical architectures and business strategies.

The concept of the scholarly record is broadening. There is an increasing emphasis on sharing various outputs, from the initial investigation to the final dissemination stage. Understanding how ideas evolve into knowledge in a systematic and timely manner is critically important as we promote replicability and transparency. The COVID-19 pandemic has further underscored the importance of sharing research results quickly. It accentuated the role of preprints in rapid dissemination, but also compounded the questions about accuracy, misconduct, and our reliance on the “self-correcting” nature of the scientific enterprise. As scientists and health care professionals, as well as the general public, look for information about the pandemic, preprint services are growing in importance. I hope that this brief will lead to further reflections on such critical questions. I look forward to hearing your comments about the future of preprints and their role in bringing transparency to various stages in the research lifecycle.