As “big data” has moved from the margins to the center of a growing number of academic disciplines, how well are universities, funders, and publishers supporting researchers? To better understand how big data research is pursued in academic contexts, Ithaka S+R partnered with librarians at more than 20 colleges and universities, interviewing over 200 faculty members, to explore how researchers work with big data and identify the challenges they face. “Big Data Infrastructure at the Crossroads: Support Needs and Challenges for Universities,” published today, provides insights into the support needs of researchers working across a wide range of disciplines.

Given the importance of big data and the closely related issue of data sharing, the stakes involved in creating infrastructures capable of sustaining big data research are high. The findings from this project suggest that universities are meeting many current needs but also highlight systematic challenges that will need to be addressed as data-intensive research proliferates.

For one, big data research is resource intensive, in both obvious and less immediately apparent ways. Keeping up with the technological, human, and financial demands of data-intensive research is a core strategic challenge facing research universities. Sustaining the computing infrastructure required to store, share, and analyze large datasets is a major financial and technical challenge. As data-driven research has proliferated, universities have invested heavily in research data services hosted by a wide range of campus units, creating an extensive human and service infrastructure to support big data research. These are far from the only labor costs associated with big data research, which depends on formal and informal collaboration among students, postdocs, research staff, faculty, IT and information professionals, and librarians, as well as legal offices, IRBs, and other university offices. Though sometimes overshadowed by technological costs, the aggregate labor expenditures that make big data research possible are considerable.

The labor and costs of working with big data are not the only challenges. In many fields, promotion and tenure standards have not kept pace with data sharing practices. Additionally, some researchers expressed concerns that IRB regulations are not well adapted to new or evolving research methods, and uncertainty about best practices for ethical research may impede open data sharing.

As big data continues to grow, the difficulty of supporting the research mission of universities—already a substantial challenge for administrators—will increase. Making big data sustainable will require coordinated action by universities, something that is difficult to accomplish at institutions with decentralized bureaucracies and cultures. Unsurprisingly, on many campuses, the infrastructure to support big data research is highly fragmented, creating barriers to economies of scale and to the efficacy of support provided by departments and research centers, libraries, computing centers, and IT and information professionals. These challenges are exacerbated by the fact that supporting big data research on campus requires coordination with actors beyond the university but central to the research ecosystem.

To help address these challenges, our report concludes with a series of actionable recommendations. We anticipate that the findings will be useful to university research officers, libraries, computing centers, IT and information professionals, and faculty and staff who engage in big data research, as well as publishers, funders, and others with stakes in research infrastructures.

“Big Data Infrastructure at the Crossroads” is part of a series of Ithaka S+R research projects exploring data-intensive research communities and instructional practices with data. Early in 2022 we will publish a report on Teaching with Data in the Social Sciences based on the findings from another cohort-based project.

Participating institutions

Participating institutions published local reports based on interviews with their faculty. When available, these reports are linked below.

Atlanta University Center Consortium
Bryan Briones, Justin De La Cruz, Rosaline Odom, “ITHAKA S+R Supporting Big Data Research Project.”

Boston University
Paula Carey, Kate Silfen, “Supporting Big Data Research at Boston University Report: A Study Conducted in Partnership with Ithaka S+R.”

Carnegie Mellon University
Neelam Bharti, Patrick Campbell, Hannah Gunderman, Huajin Wang, “Understanding the Research Practice and Service Needs of Big Data Researchers at Carnegie Mellon University Report.”

Case Western Reserve University
E.M. Dragowsky, Ben Gorham, Jen Green, Roger Zender, Lee Zickel, “Supporting Big Data Research at Case Western Reserve University: An Ithaka S+R Local Report.”

Georgia State University
Kelsey Jordan, Bryan Sinclair, Mandy Swygart-Hobaugh, Jeremy Walker, “Supporting ‘Big Data’ Research at Georgia State University.”

New York University
Vicky Rampin, Margaret Smith, Katie Wissel, Nicholas Wolf, “Supporting Big Data Research at New York University.”

North Carolina A&T State University
Tracie Lewis, David Rachlin, Iyanna Sims, “Supporting Big Data Research at North Carolina Agricultural and Technical State University: An Ithaka S+R Local Report.”

North Carolina State University
Karen Ciccone, Susan Ivey, John Vickery, “Big Data Research Practices and Needs at North Carolina State University: An Ithaka S+R Local Report.”

Northeastern University
Jen Ferguson, Kate Kryder, James Macalino, Julia Unis, “Big Data and Data Science Research at Northeastern University – Final Report.”

Pennsylvania State University
Seth Erickson, Lana Munip, Cynthia Vitale, Cindy Xuying Xin, “Big Data Research Support at the Penn State University.”

Temple University
Will Dean, Fred Rowland, Adam Shambaugh, Gretchen Sneff, “Supporting Big Data Research at Temple University.”

Texas A&M University, College Station
Carolyn Jackson, Laura Sare, Paria Tajallipour, John Watts, “Assessing the Research Practices of Big Data and Data Science Researchers at Texas A&M: An Ithaka S+R Local Report.”

University of California, Berkeley
Erin D. Foster, Ann Glusker, Brian Quigley, “Supporting Big Data Research at the University of California, Berkeley: An Ithaka S+R Local Report.”

University of California, San Diego
Stephanie Labou, David Minor, Reid Otsuji, “UC San Diego Ithaka S+R Research Study: Supporting Big Data Research.”

University of Colorado Boulder
Emily Dommermuth, Cindy Edgar, Nickoal Eichmann-Kalwara, Rebecca Kuglitsch, Andy Monaghan. Report forthcoming 2022.

University of Illinois, Urbana-Champaign
Carissa Phillips, Chris Wiley, Jen-Chien Yu, “Ithaka S+R Supporting Big Data Research University of Illinois at Champaign-Urbana Report.”

University of Massachusetts, Amherst
Thea P. Atwood, Melanie Radik, Rebecca M. Seifried, “Supporting Big Data Research at the University of Massachusetts Amherst.”

University of Oklahoma
Claire Curry, Zenobie S. Garrett, Mark Laufersweiler, Tyler Pearson, “OU Libraries Support of Big Data.”

University of Rochester
Daniel Castillo, Moriana Garcia, Sara Pugachev, Sarah Siddiqui, “Supporting Big Data Research at the University of Rochester: An Ithaka S+R Local Report.”

University of Virginia
Jacalyn Huband, Jennifer Huck, “Assessing the Research Practices of Big Data and Data Science Researchers at the University of Virginia: An Ithaka S+R Local Report.”

University of Wisconsin, Madison
Cameron Cook, Tom Durkin, Tobin Magle, Jennifer Patiño, “ITHAKA: Supporting Big Data Research, Data and Analysis from UW-Madison Researchers.”