At Ithaka S+R, we regularly re-evaluate the quality and inclusivity of the demographic questions we include in our surveys, just as we do with all of the items in our instruments. These questions undergo a rigorous, iterative revision process: we conduct desk research, gather feedback from advisors, test all questions with potential participants prior to launching, and pay attention to trends in open-ended responses to our questions. All of these approaches, along with following trends in research and broader social movements such as Black Lives Matter, have led to recent updates to our demographic questions.  

While no demographic question can perfectly represent the complexity of identity, and an imperfect solution may still sometimes be the best solution, we’ve learned a lot over the years about how to capture this information effectively and inclusively. Today we are sharing four strategies for crafting demographic survey questions to better represent a survey sample.

Consider what demographic questions are really needed

Rather than including a standard set of demographics in every questionnaire, consider how they will be used in each individual study. For example, what data do you need to understand the representativeness of the sample? What do you need to stratify results across groups? And what steps will be taken to protect the confidentiality and/or anonymity of respondents if this information is collected?

For studies addressing the demographic makeup of a particular field or organization, such as the American Library Association’s Diversity Counts study or the Society of American Archivists’ A*CENSUS, it makes sense to include a wider variety of demographics, as that’s the purpose of the study. But not all surveys require that level of detail. Additionally, even when surveys are fully anonymous, this level of detail can put the anonymity and confidentiality of respondents at risk, particularly if they identify with multiple groups that are underrepresented in the survey. Thus, it is important to find a balance between being inclusive and ensuring confidentiality/anonymity. In our surveys, we work to find this balance through transparency with participants about how their results will be used and deposited, and we only analyze results for which we have a predetermined minimum number of responses per group (typically 10 or more).
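The minimum-group-size rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not our production analysis code: the function name `reportable_groups`, the constant `MIN_GROUP_SIZE`, and the group labels are hypothetical, and the threshold of 10 comes from the "typically 10 or more" figure mentioned above.

```python
from collections import Counter

# Threshold taken from the text ("typically 10 or more"); adjust per study.
MIN_GROUP_SIZE = 10


def reportable_groups(responses, min_size=MIN_GROUP_SIZE):
    """Split groups into those large enough to analyze and those to suppress.

    `responses` holds one group label per respondent; None marks a skipped
    question and is left out of the counts.
    """
    counts = Counter(r for r in responses if r is not None)
    reportable = {group: n for group, n in counts.items() if n >= min_size}
    suppressed = sorted(group for group, n in counts.items() if n < min_size)
    return reportable, suppressed


# Hypothetical data: three groups of different sizes plus two skips.
responses = ["Group A"] * 25 + ["Group B"] * 12 + ["Group C"] * 4 + [None] * 2
reportable, suppressed = reportable_groups(responses)
print(reportable)
print(suppressed)
```

Groups that fall below the threshold are listed separately rather than silently dropped, so the analyst can decide whether to combine them with another category or leave them out of group-level reporting.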

Consider which demographic questions are likely to impact results. Since answering demographic questions can be uncomfortable for some participants, including only those that are clearly related to the thematic areas of the survey can help alleviate some of this discomfort. For instance, we considered adding a question on sexual orientation in the Library Survey 2020 given our expanded coverage of equity, diversity, and inclusion in the body of the survey. However, half our testers were uncomfortable with the additional question, asked why we included it, or chose not to answer. Because of this response, and because we did not have strong hypotheses about how sexual orientation alone would impact our results, we ultimately decided to remove the question.  

Keep up to date with evolving terminology from relevant communities 

Perhaps the most important consideration in crafting demographic questions is the set of response options used to identify groups. No response option perfectly encapsulates identity, but there are multiple ways to address this issue.

One way is to include only open-ended demographic questions. While this format allows participants to identify themselves with their own terms, it might not be a practical choice, especially in surveys with tens of thousands or more participants. Further, even when researchers use open-ended demographic questions, responses generally get grouped into pre-existing categories anyway. Thus, even when using open-ended demographic questions, it is important to pay attention to the labels used to identify groups, and to remember that these terms evolve over time.

It is therefore vital to listen to members of the group in question about what terms are preferred. In many cases, more than one term may be preferred by different people within a group. In such cases, providing multiple terms may be best (e.g. a response option of “Black or African American” rather than only “Black” or “African American”). Further, there may be cases in which preferred terms cannot be adequately covered by a survey question. For example, people may identify themselves with a particular ethnic identity (e.g. Mexican) rather than a race-ethnicity category (e.g. Latinx). Including options for every ethnic identity, however, may make the question difficult for participants to sort through and for researchers to analyze, although this may be appropriate for surveys specifically focused on ethnicity and/or if ethnicity is central to the hypotheses of the study. Thus, choosing to use an imperfect solution (for example, including a response option of “Hispanic, Latino, Latina, or Latinx”) can be a way to reduce the burden on participants and researchers. At the end of this post, we’ve included several sample demographic questions that we’ve used in our surveys.

Another consideration in labelling response options is how to identify groups outside of the normative categories. For example, many surveys include gender questions with “Male/Female” or “Man/Woman” as options along with an option for “Other.” Although this is an improvement on limiting the response options to only the binary genders, this approach literally and figuratively “others” non-binary participants. This is an area in which open-ended options can be useful. In our surveys, we include non-binary gender response options and use “Another option not listed (please specify)” to provide an additional option where respondents can write in how they identify. This wording demonstrates an awareness of the limitations of the categories used in closed-ended demographic questions while also including a solution for those who identify outside of the identities provided. Further, since fewer participants tend to write in responses when questions are formatted this way, qualitative analysis of responses is easier to perform.

Pay attention to formatting

Response options are not the only way to make survey questions more inclusive. Formatting can also signal to participants that a question is or isn’t well thought out. Depending on the capabilities and limitations of the survey format, there are several options to consider: 

  • Order of response options. Since demographics are related to social hierarchies, it is important to ensure that the order of response options does not reinforce systems of power, for example by ordering response options with the most privileged identities first. While we generally randomize the items displayed in our survey questions to reduce bias that results from order (priming, primacy, and recency effects), we typically present demographic information in static categories, most often alphabetized or in numerical order. Not only does this help avoid listing identity options hierarchically, but it also makes finding the applicable response option easier for participants.
  • Multi-select format. For many demographic variables, it is possible that participants will identify with more than one group (e.g. multi-racial, non-binary). In these cases, it is important to allow participants to select more than one option. While this makes data analysis more complicated, it is a necessary step. 
  • Write-in options. Because response options are always imperfect to some extent, and some people may prefer identities or terms not listed, it is useful to provide write-in options. This allows participants to identify with their own terms. Write-in responses can also be used to adapt and improve existing questions. 
  • Skipping. Respondents should not be required to respond to demographic questions. In our surveys, these questions (and, in fact, most of our questions more broadly) are optional, and we disclose this to respondents at the beginning of the survey. Because some participants may not realize or remember that they can skip a question, including an explicit option to select that they don’t want to answer can also be helpful. We include “I prefer not to answer this question” as an option for each of our demographic questions. We also often repeat the instructions at the beginning of our set of demographic questions, letting participants know they can skip any questions they would prefer not to answer.
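For teams that build their questionnaires programmatically, the formatting points above can be sketched roughly as follows. This is an illustrative Python sketch rather than a description of any particular survey platform: the helper names `order_options` and `validate_selection` are hypothetical, and the option labels are borrowed from the sample questions at the end of this post. It assumes that substantive options are alphabetized, that the write-in and opt-out options always come last, that the opt-out option is exclusive, and that an empty answer is acceptable because respondents may skip.

```python
WRITE_IN = "Another option not listed here (please specify)"
PREFER_NOT = "I prefer not to answer this question"


def order_options(substantive_options):
    """Alphabetize substantive options; keep write-in and opt-out options last."""
    return sorted(substantive_options, key=str.casefold) + [WRITE_IN, PREFER_NOT]


def validate_selection(selected, options):
    """Return a list of problems with a multi-select answer (empty means valid).

    An empty selection is valid because respondents may skip the question.
    """
    chosen = set(selected)
    problems = []
    unknown = sorted(chosen - set(options))
    if unknown:
        problems.append(f"Unknown option(s): {', '.join(unknown)}")
    # The opt-out option is exclusive: it cannot be combined with other choices.
    if PREFER_NOT in chosen and len(chosen) > 1:
        problems.append("The opt-out option cannot be combined with other options")
    return problems


gender_options = order_options(["Woman", "Non-binary", "Man"])
print(gender_options)
print(validate_selection(["Man", "Non-binary"], gender_options))
print(validate_selection(["Woman", PREFER_NOT], gender_options))
print(validate_selection([], gender_options))
```

Keeping the write-in and opt-out options in a fixed position at the end, rather than sorting them in with the rest, means participants can find them in the same place in every demographic question.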

Provide mechanisms for feedback before, during, and after surveying

Finally, taking and incorporating feedback helps strengthen existing survey questions. It also aids in identifying missing pieces that can be added into the survey. Prior to surveying participants, feedback can be gathered from a variety of sources. For our national faculty and library director surveys, we typically discuss high-level survey themes with half a dozen to a dozen advisors who also preview and provide feedback on our survey drafts. They are able to candidly share any thoughts or concerns about the demographic questions.

We also pretest our surveys through cognitive interviews with a small number of people from our target survey population. During this process, we typically ask about their comfort in answering demographic questions. 

After the survey has been fielded, write-in answers also allow participants to provide feedback that can be used to update questions in future surveys. For instance, in our recent national survey of art museum directors, which we fielded for the first time earlier this year, we asked, “Is there any aspect of your identity that we have not covered in the preceding questions that we may want to consider including in future studies? Please use the space below.” Qualitative analysis of these responses can demonstrate categories that are missing or insufficient and is especially helpful when surveying a new population. We also respond to questions and take feedback from participants through our general survey feedback inbox during and after the survey implementation. 

Concluding thoughts

As terminology continues to evolve and we gather feedback from our survey participants, we need to continually re-evaluate and adapt our survey instruments. While there is no perfect way to capture participant identity in a survey, we believe these strategies can simplify the process of crafting inclusive and effective demographic questions. 

This blog post is the first in a series of posts on our demographic questions related to individual identity. Subsequent posts will focus on recommendations for selecting and omitting these questions in surveys, and on analyzing demographic variables in an inclusive way while maintaining scientific rigor. We welcome your comments, questions, and reactions to these strategies and our sample questions in the comment box below or via email at jennifer.frederick@ithaka.org.


The sample questions included below went through several iterations of updates across multiple surveys and will continue to change as knowledge on gender, race-ethnicity, and identity in general continues to expand. Please feel free to use these as inspiration for demographic questions in your own surveys.

Sample demographic questions

With which gender(s) do you most identify? Please select all that apply.

❑       Man

❑       Woman

❑       Non-binary

❑       Another option not listed here (please specify): ______

❑       I prefer not to answer this question [exclusive] 

Do you identify as transgender?

🔾 Yes

🔾 No

🔾 I prefer not to answer this question

Please select the population group or groups that you most closely identify with from the list below:

❑       American Indian or Alaska Native

❑       Asian or Asian American

❑       Black or African American

❑       Hispanic, Latino, Latina, or Latinx

❑       Middle Eastern or Northern African

❑       Native Hawaiian or Other Pacific Islander

❑       White

❑       Another option not listed here (please specify): _____

❑       I prefer not to answer this question [exclusive]