Recently, the Office of the Chief Statistician announced recommended revisions to the Office of Management and Budget’s (OMB) Statistical Policy Directive No. 15: Standards for Maintaining, Collecting, and Presenting Federal Data on Race and Ethnicity (SPD 15). In particular, the revisions recommend combining race and ethnicity into a single, select-all-that-apply question, including “Middle Eastern and North African” (MENA) as a new minimum category within the item, and collecting detailed race and ethnicity data in the same item (Office of Management and Budget 2023). This short piece addresses how existing interpretive approaches to question evaluation (Miller et al. 2014) assist in assessing construct validity in the context of the OMB recommended revisions. In so doing, these question evaluation methods can help answer the question of how race and ethnicity, as constructs, can be effectively measured. While these changes present challenges, a focus on the meaning of race and ethnicity in respondents’ lives provides needed context to survey statistics, enhances survey question design, and improves the equity of data collected.
Two types of question evaluation have predominated in the federal statistical system. Each emphasizes different aspects of the question-response process, and, when used together, the two serve as complementary perspectives. The first, the traditional approach to questionnaire design, emerged from the Cognitive Aspects of Survey Methodology (CASM) movement of the 1980s and uses the methodological toolkit of cognitive psychology. It attempts to identify “problems” with survey items that could impact the accuracy and reliability of data collection (Willis 2015). This approach, which situates the question-response process (Tourangeau 1984) solely within the domain of individual cognitive processes, has the advantage of aligning neatly with survey methodology writ large. The second, an interpretive approach drawing on cognitive sociology, seeks to understand the “ways in which respondents interpret questions and apply those questions to their own lives, experiences, and perceptions” (Miller 2014). In doing so, it views respondents’ cognitive processing of survey questions as fundamentally informed by and inseparable from their lived experiences (Gerber and Wellens 1997; Miller 2003). From this perspective, improving question performance is impossible without considering respondents’ social location.
Particularly relevant to the proposed revisions to the race/ethnicity question are these approaches’ divergent perspectives on question intent. The traditional cognitive-psychological approach assesses item performance through structured probing of the phases of the question-response process: comprehension, recall, judgment, and response (Tourangeau 1984). This approach measures question quality by the number of problems identified. As in golf, the lower the number of problems, the better the question. This method’s ultimate goal is to (quantitatively) examine how closely data gathered match researcher intent. The interpretive approach, by contrast, is more agnostic about researcher intent. Instead, it seeks to identify “what [a question] captures” (Boeije and Willis 2013) without making assumptions about “correct” or “incorrect” interpretations (i.e., response error). The interpretive approach assumes that respondents understand and process questions through their lived experience and relate this to researchers in narrative form—in other words, respondents report not on their cognitive processes but on their social reality. By employing both the cognitive-psychological and interpretive approaches, researchers gain a sense of potential and actual “problems” with questions and, more crucially, the meanings that respondents attach to key constructs and categories. In the context of the proposed revisions, the interpretive approach shifts the focus from the question itself to the broader constructs of race and ethnicity as they operate—and if they operate—in respondents’ social worlds.
The proposed revisions have reopened debate on whether race and ethnicity are worth measuring using standardized categories, or even measuring at all, as some respondents to the OMB public comment process proposed.[1] Strictly speaking, question evaluation methods cannot answer whether topics should be measured—they can only assess how questions function for respondents. Both interpretive and cognitive-psychological approaches to question evaluation can, however, illuminate whether race and ethnicity are constructs that make sense to people and categories by which they can describe themselves, if only on forms and in surveys. A longstanding research agenda examining race and ethnicity questions has indicated that race and ethnicity structure substantial aspects of public life in the United States (Miller and Willson 2002; Willson and Dunston 2017). This is not to say that race and ethnicity are indicative of biological difference; rather, they are best understood as “socio-political constructs” (Office of Management and Budget 2023). The response options—both the “minimum categories” measured by SPD 15 and subcategories—seek to operationalize these constructs. Importantly, as social and personal understandings of race and ethnicity evolve, the inclusion of new categories as response options can positively impact data quality for smaller groups without meaningful reductions in data quality for non-group members (see also U.S. Department of Commerce 2017). Thus, question evaluation of the addition of a new category for MENA respondents can illuminate the degree to which race and ethnicity remain consistent, predictable, and measurable across time and space.[2]
Question evaluation methods provide insight into how to align question design with the actual constructs of interest. While both cognitive-psychological and interpretive perspectives should be employed, an interpretive approach to question evaluation is uniquely suited to assessing the measurement of identity categories, such as race and ethnicity. This is because the interpretive approach frames question evaluation to ask how constructions of race and ethnicity impact respondents’ lives. For example, prior evaluation of the single-question combined measure of race and ethnicity, which included a “Middle Eastern and North African” category and ethnicity sub-categories, identified four primary patterns of interpretation of race and ethnicity among respondents: as ancestry (a person’s genealogy or family tree); as cultural affinity or belonging (connectedness to a group based on shared culture); as an administrative category (with responses varying by the purpose of the form, such as a medical form or driver’s license); and as a function of others’ perceptions (how respondents’ race is viewed by others in the United States) (Willson and Dunston 2017). These patterns were consistent with the results of earlier evaluation of the select-all-that-apply approach to asking about race (Miller and Willson 2002), and though the prevalence of each pattern is not known, none of them indicates response error. A key benefit of this approach to question evaluation is that it can closely link research on race and ethnicity measures with social scientific research on the socio-political constructs of race and ethnicity more broadly, as this literature is also concerned with the understanding and operationalization of race and ethnicity as constructs.[3]
As the proposed revisions to the race and ethnicity question set are considered, evaluators of these items should reflect not only on the potential problems that respondents may experience when encountering them but also on how these constructs function in respondents’ lives. Attention to the meaning that survey respondents attach to race and ethnicity not only leads to question design informed by the socio-political context of these constructs but also provides essential information to users of survey statistics and will lead to more equitable federal data collection.
The findings and conclusions in this article are those of the authors and do not necessarily represent the official position of the National Center for Health Statistics, Centers for Disease Control and Prevention.
Zachary Smith, National Center for Health Statistics, qks4@cdc.gov
[1] See, for example, comments in Schneider (2023).
[2] For similar evaluation in the context of sex and gender identity, see Miller, Willson, and Ryan (2021) and Miller and Willson (2022).
[3] See, for example, in sociology, Brunsma, Embrick, and Nanney (2015), Suzuki (2017), and Brubaker (2009); in political science, Taylor (1996), Persons, ed. (1999), and Hutchings and Valentino (2004); and in the health sciences, Dressler, Oths, and Gravlee (2005), Lett et al. (2022), and Adkins-Jackson et al. (2021). This list is far from exhaustive.