Responsible Use of Student Data
For several months, my colleagues and I at Ithaka S+R have been working with Mitchell Stevens, a professor at Stanford, on a project addressing the uses, challenges and opportunities for colleges and universities undertaking new forms of research, application, and representation of student administrative and learning data. Students’ increasing interaction with learning management systems, instructional technology, and administrative platforms is creating reams of new data about their learning behaviors and outcomes, and other experiences in school. And rapidly developing data mining techniques and computing power are allowing researchers, administrators, instructors—and for-profit companies—to use these data in powerful and novel ways.
We are particularly concerned with the concept of “responsible use” in this new world—what does it mean for this mass of records of student behaviors and outcomes to be deployed virtuously, to the benefit of students’ learning and self-efficacy, while respecting students’ privacy and autonomy? And how can institutions and the scientific community, more generally, ensure such responsible use?
Stevens and I joined Kent Wada of UCLA and Marco Molinaro of UC Davis on a panel about these topics at the recent Stanford Learning Summit. Stevens, Wada, and Molinaro each presented a thoughtful explanation and reflection on the work of their institution. My role was to summarize—in a few minutes—the national picture of “responsible use.” I attempted to do so with two observations about the state of student data use in the field and three comments on pressure points in the idea of responsible use.
There is great diversity in use of student data
It can be easy to forget in the bubble of higher education innovation circles, but many, maybe most, institutions of higher education are not systematically using their student data to improve instruction and support. While there is experimentation around the edges, we are not yet seeing a large-scale organized effort that impacts large numbers of students.
A recent KPMG survey of a sample of administrators found that just 41 percent of institutions represented were using student data for predictive analytics and forecasting, as might be done to create an early alert system for advisors or students themselves. While a majority reported having data available that are amenable to such analysis, just 29 percent have internal capacity to analyze their own data.
The story is similar when it comes to employing analysis of student data to improve instruction. According to Ithaka S+R’s periodic US Faculty Survey, depending on how you ask the question, half or just under half of faculty respondents are using any form of technology in instruction, although 63% want to do so. The major stumbling block seems to be a lack of incentive: only 29% of faculty feel that they would be rewarded or recognized for modifying their pedagogy.
Cross-institutional collaboration on analytics is increasingly important
Still, a large number of institutions are making use of student data. And increasingly, they are doing so in the context of collaborative endeavors with other institutions.
Organizations like the PAR Framework and LearnSphere, and networks that have emerged around vendors like Civitas Learning, are pooling data and building shared infrastructure around analytics.
The members of other collaborations, like Achieving the Dream, Complete College America, the University Innovation Alliance, the Council of Independent Colleges’ Consortium for Online Humanities Instruction, and various association-sponsored projects, are setting common goals related to student data use, sharing practices, and holding one another accountable to make progress.
In addition to these “horizontal” collaborations, there are a growing number of “vertical” collaborations. In a number of communities, including Long Beach, California, and Orlando, Florida, four-year colleges, community colleges, and public school districts are creating federated data warehouses and using those data to develop pathways through those systems and into local industries.
These collaborations seem poised to accelerate progress, but also amplify the concerns about interoperability, access to data, and de-identification that complicate any analysis of student data from multiple sources. At the same time, the collaborative structures present a great opportunity to develop standards for use, as some of the organizations mentioned above have begun to do.
Structure v. Autonomy
In conversations about responsible use of student data, privacy is the main focus—rightfully so, as it is an ethically and technically complicated issue. Although it is less recognized than the tension between privacy and data access, large-scale analysis of student data and the tools it makes possible also create a tension between structure and autonomy, which I believe is no less important.
This tension manifests across a few different dimensions. First, as identified in our Higher Ed Insights survey, there is a tension between the two major models of learning innovation that student data analysis supports: what Georgetown’s Randy Bass, at the Stanford Summit, called the “integrative” model and the “disaggregative” model. An example of the integrative model is the highly structured pathways, with aligned interventions to keep students on track, that many community colleges and public universities are adopting with the help of predictive analytics. An example of the disaggregative model is the unbundled, competency-based approach facilitated by learning outcomes badging and adaptive learning technology.
A second manifestation of the tension between structure and autonomy is in the perspective from which institutions view their students’ experience: do students benefit more from being sorted based on predicted outcomes, or from being allowed the freedom to explore and to fail?
A final manifestation of the structure/autonomy tension relates to instructors. To achieve an institution-wide impact, we would need to see coordinated pedagogical change across instructors. Yet the dominant attitude among faculty is that each instructor has complete control of his or her own classroom.
There are multiple risks to unclear standards for responsible use
At the moment, there are no generally accepted standards for responsible use of student data. To be sure, many uses for research fall under the authority of institutional review boards (IRBs). However, if the data used are administrative and de-identified, or if the use is not “research” but instead evaluation of an institution’s own program or formative assessment, it is out of the IRB’s hands. And even when an IRB does have jurisdiction, there is disagreement among IRBs over how standard procedures should apply in the world of data mining and click-stream analysis. What sort of meaningful informed consent procedure can be used in such a world?
The most common concern about a lack of clear standards is that it will lead to overreach: information falling into the wrong hands, being used in unexpected ways, or resulting in consequences students did not agree to. But there are other risks as well. Uncertainty can breed inaction, leading to less than optimal analysis of student data, or what might be called “underreach.” Lack of clear guidelines can also lead to empire-building by the business units or organizations that hold student data, resulting in uses that are aligned to their incentives but not necessarily to the greater good of the institution, students, or the science of learning.
The responsibility to use
Adding to the point about underreach, and in light of the many institutions that are not making productive use of their student data, we might ask whether the concept of responsible use includes a responsibility TO use student data to improve instruction and support. If analysis of student data can be used to improve the experience and outcomes of students, does failure to undertake any analysis amount to an ethical lapse?
Relatedly, once the lid is open—once an institution’s leaders know that certain aspects of its bureaucracy are tripping up students, or that a particular subpopulation is struggling more than others, or that its students’ learning outcomes are uneven across programs, courses . . . or faculty—does that knowledge create an imperative to act to remedy the situation?
Thank you for this posting! I wrote a response to it here: http://remediatingassessment.blogspot.com/2016/04/the-data-of-learning-response-to-martin.html
The following blog post discusses how extensive learner information profiles could be managed by students to create and manage 1) a Personal Learning Environment and 2) a Personal Intelligent Digital Assistant.
New Uses of Student Information Data