Sharon Slade and Emily Schneider

Summary of Discussion
Student Data and Records in the Digital Era
Asilomar, CA
June 15‐17, 2016

This group discussed the main areas of focus and most promising types of applications of student data for the postsecondary community over the next few years. The group had participation from approximately twenty leaders within their own fields, bringing a range of perspectives from law, government, large private universities, community colleges and distance learning institutions. These notes summarize the discussion and make draft suggestions for next steps.

The stated objectives of the group were to:

  • Identify the main areas of focus and most promising types of applications for the postsecondary community over the next few years;
  • Articulate any types of application or any components of applications (for example, data types or methodologies) that may warrant additional consideration;
  • Begin to establish a set of guiding principles which seek to minimize the risks associated with the use of student data to guide or drive students’ learning.

Establishing the primary goals of higher education

The group agreed that a first useful step in reviewing uses of student data was to examine the primary goals of postsecondary education institutions (PSEIs). That is, in order to better understand the applications of student data, we should first seek to understand how the collection, analysis and application of student data serves each PSEI.

Given the range of stakeholders represented within the group, the discussion yielded a variety of views on the key purpose of PSEIs. These ranged from societal benefit at a large scale (“to transform lives for the benefit of society”) and at the individual level (“as a means to social mobility”); to access to educational excellence (“the creation of a contemporary and innovative educational experience”); to access for all (through MOOCs, open access, online programs); to preparation for life and work, regional development and support for particular communities.

Most agreed that PSEIs aim to deliver multiple benefits (at one school, for example, the stated aim is an education which “engages students in social, ethical and cultural concerns; stimulates their intellectual curiosity; educates them for civic responsibility; and develops creative and critical reasoning skills to prepare them for a lifetime of inquiry, productivity and leadership. …. also to put knowledge into action and prepare work force-ready graduates.”). However, it was also recognized that there would likely be differences between public and private providers, and differences of focus between various stakeholder groups.

Student data as a means to measure ‘success’

Having established that PSEIs have broad and potentially conflicting goals, the group discussed how data might be used to measure whether those goals are met, i.e., what are the accepted measures of success?

The general view was that most PSEIs focus on a few key measures – most typically persistence (retention of students), but also completion (meeting students’ declared educational goals) and post-graduation employment or study. Persistence in particular might be seen as a valuable measure of institutional success – perhaps more closely linked to financial sustainability (meeting funding thresholds, say) than to societal impact.

In addition, some goals may be externally imposed. In Florida, performance-based funding is thought to have shaped a focus on targets and, in particular, on graduating students. Similarly, other institutions have state-wide mandates to maximize graduation or to close achievement gaps.

There was agreement that measures of success could vary depending on the stakeholder (student/instructor/institution/administrator) and that measures should ideally reflect and be defined by students’ needs. For example, a measure of success for a student might be to study a particular subject regardless of predicted grade, rather than be enrolled in another program of study which has a greater predicted chance of success but may be of lesser interest. The issue perhaps is around who decides what is in students’ best interests – the student as consumer/customer or the institution as paternal provider. The potential risk is that actions may be taken which serve interests other than those of students. It was also considered that success can be a multi-layered issue that is insufficiently captured by simplistic measures such as graduation rates.

Potential applications of student data

Having established that there is not always a clear existing link between the stated aim of a PSEI and the ways in which it uses student data, the group discussed the wide range of existing or potential applications. These included, but are not limited to:

  • Admissions and enrollment – establishing criteria to select prospective students (according to demographic characteristics and academic study history) and to predict likely success. This might also include filtering of students thought likely to adversely impact retention/graduation rates or who are judged to require too much financial assistance;
  • Giving students additional insight and agency – reflecting back progress, providing links to their instructor/resources, or proactive exposure to future options;
  • Evaluating advisory support – understanding how/which interventions may be most effective and sharing the approach with others, developing an improved understanding of pastoral support;
  • Predictive analytics – at the beginning of a course or of a degree program, better understanding of key links between demographics/study behaviors and outcomes (persistence, completion, etc.);
  • Evaluating the quality of learning, i.e., better understanding what students know, reviewing the learning process and the impact of teaching and learning design;
  • Understanding the value of a measure itself, i.e., whether a particular measure of a phenomenon is a fair proxy for knowledge acquisition;
  • Performance measurement – as an evaluation of the effectiveness of faculty, e.g., student persistence/completion, or time spent on preparation and support;
  • Development of useful longitudinal data – following students over time to establish robust datasets/understanding and examine apparent trends;
  • Building social networks – establishing and understanding student/learning communities;
  • Insight into metacognitive and motivational factors;
  • Facilitating the combination of information from multiple sources that might work together or in tandem to support students.

Understanding the challenges of using student data

Optimism about the potential offered by greater insight into the links between student behaviors, characteristics and/or learning design and eventual outcomes can obscure the many challenges associated with analytics. The impact of interventions based on visible data can be very difficult to evaluate (correlation vs. causation). In addition, greater insight into behaviors linked to likely success does not guarantee student compliance (‘you can lead a horse to water but you can’t make it drink’).

The group had a wide-ranging discussion around some of the challenges and grouped these loosely under practical, ethical, predictive and consent issues. These included:

Practical challenges

  • Although perhaps obvious, participants discussed the shortcomings of recording and measuring what is available rather than what is useful or relevant (faculty are trained as critical thinkers/researchers, so there can be a tendency to keep interrogating the data until patterns/correlations are found). Purposes of use should be explicit;
  • PSEIs are data-rich but often resource-constrained (both in terms of staff to undertake analytics and staff to deliver suggested interventions/provide support and/or fund external providers for same);
  • Should focus be on the individual learner or on learning design? Faculty resistance and/or resource constraints may limit options for meaningful change;
  • Skill sets may be lacking, e.g., adjunct faculty are often not engaged in training/communication events but have responsibilities for delivering high-impact gateway classes. There are challenges around equipping staff with data literacy skills (training the pipeline of future faculty);
  • Large scale adoption of analytics requires cultural and institutional change – a need for faculty/administrative buy-in; greater understanding of best practice elsewhere; time available to explore and adopt new ways of supporting students – communication is key here;
  • Data may be difficult to access, held in multiple places under multiple jurisdictions, and available data proxies for phenomena of interest may not be adequate.

Ethical challenges

  • Is it always possible to know when data use causes inadvertent harm (and harm to whom? This comes back to who the primary stakeholder is, and in whose interests data is used);
  • What are the boundaries around usable data, i.e., should we include information from out-of-school activities which might be thought to have some link with study, such as social or exercise habits? What must/should always be out of scope? (e.g., counselling data);
  • What are the dangers of acting largely on demographic data (i.e., official ‘facts’ beyond a student’s control)? Classic risks of labelling are relevant here. Care needed to ensure that student opportunity is as equitable as feasible. Where demographics are explicitly used, the purpose should be well defined and/or well proven (e.g., in increasing opportunities for certain student groups);
  • What are the boundaries around access to student information and who can act on it? Student voice should be considered wherever practicable;
  • Interventions based on analytics are often experimental – how should PSEIs define harm (to how many students? To what extent can ‘harm’ be sanctioned in the interests of better understanding?). There seems to be no consistent approach to managing this, and often a gap between research IRBs and operational approvals;
  • Resource constraints may lead to a need to prioritize one faculty/groups of students/intervention type above others – which has most impact, in whose best interests (institution: maximize completion vs. student: get best grade that they can/study what they want);
  • Privacy issues – data-sharing with third parties, FERPA and other relevant regulations. There appear to be inconsistencies/gaps in institutional governance/policies with regard to how data is shared between parties. It was suggested that a shared/federated approach to governance might be useful;
  • Data can easily lose its original meaning when taken out of context and/or combined with other data;
  • Student information is a temporal thing – a static dataset cannot be assumed to be perpetually representative;
  • Financial drivers can override known issues, e.g., acceptance of students who may share characteristics associated with poorer completion in order to bolster funding;
  • Issues around surveillance and ‘big brother’;
  • Establishing boundaries on data use at the point of registration means that we cannot later do something different with that data much further down the line (e.g., sell student data to third parties once students have graduated);
  • Should students be able to see the detail of how/when/by whom data describing them has been tracked?
  • Establishing trust between institution and students is fundamental.

Particular issues with predictive analytics

  • Understanding the best ways to present predicted outcomes to students can be difficult. There were suggestions that focus should be on presenting a positive outlook (‘based on our information, you’re likely to succeed if you enter into this program’); an emphasis on phrasing discussions around ‘students like you’ rather than on the specific student
    • Suggestions that predictions should be filtered through a conversation with an advisor rather than reflected directly to students (but is this feasible at scale/distance?)
    • Additional concern that the features which determine ‘students like me’ may actually be less relevant to the scenario/student than other more crucial factors which may be less easily observed
    • Particular concerns around understanding the difference between causality and correlation
    • Also concerns that expressing ‘likely’ outcomes can be self-fulfilling;
  • How much freedom to choose should students be allowed if their predicted outcome is poor? Risks to student agency if choices are constrained by algorithms;
  • There can be a lack of transparency around how predictions are made (for both students and users) which can make results harder to understand/question;
  • A strong sense that outcomes should not be solely determined by predictive analytics – suggestion that predictions are most appropriate for informing design/policy change rather than determining individual student outcomes.

Consent issues

  • Addressing consent ought to be part of institutional terms and conditions but in a way that is meaningful and likely to be engaged with
    • Even if we accept that many students won’t engage with the detail, there is an obligation to act in the ways stated
    • Does additional consent need to be obtained at a faculty level to ensure that any unforeseen/additional activity is covered?
  • A requirement for clear boundaries around what requires consent and what does not
    • Making clear that data collected for a stated specific purpose (e.g., financial aid) does not automatically become available for other purposes
    • Making much clearer the purposes of collecting/tracking certain datasets and how that adds to the pedagogic/student support issues
    • Understanding which datasets are particularly sensitive when tracked at a personal (and identifiable) level (e.g., technology which analyses speech patterns);
  • Explicitly seeking consent may raise expectations to act on shared/tracked information;
  • Consent may be problematic if students become more aware that specific characteristics may be linked to particular outcomes, e.g., if you’re in a racial group that is often negatively profiled, do you want your data tracked?
  • Expectations of opt-out as a default option may be flawed – e.g., classroom observation would be difficult to exclude;
  • Opt-out, exercised at the student’s discretion, should ideally translate to exclusion from actions applied to that student rather than exclusion from analysis (which would potentially weaken the validity of analyses);
  • Practical suggestion around trialing a student privacy dashboard where students can opt in to or out of certain kinds of activities and of being observed;
  • A general view that much of what we all do is already tracked – we probably all know this and are largely comfortable enough (or too apathetic) not to bother finding ways of resisting personal data collection.

Next steps

The group agreed that a valuable step forward would be to establish a working group to develop a template or broad set of principles which can be expanded to include institutionally-specific context/priorities. The working group should ideally have broad representation (of institutions and roles). The group could take forward a number of actions which might usefully include:

  • Establishing definitions/scope/boundaries/purpose/audience
  • Conducting an environmental scan of existing data sources/processes/policies/legal issues
  • Distilling/drafting a set of good practice examples
  • Drafting a template of principles which includes clear guidance for key issues and a broad outline for others
  • Sharing and incorporating feedback/use cases and examples from a broad set of institutional stakeholders/representatives
  • If feasible, monitoring/measuring the adoption of the template (if an ‘owner’ can be identified) and adapting as needed.