Surveys are an important tool for researchers across disciplines working to reduce disparities. When detailed frame data are not available to describe respondents, self-report is needed to understand respondent characteristics. However, some demographic questions are widely recognized as sensitive (Tourangeau and Yan 2007) and may be left unanswered due to confidentiality concerns or the perception that the question is threatening or difficult to answer (Lor et al. 2017). Furthermore, respondents may choose to skip the entire survey due to confidentiality concerns (Singer, Hippler, and Schwarz 1992).
Researchers have attempted to reduce the sensitivity of demographic questions using a variety of methods, for example, by stating the purpose of collecting demographic items, placing demographic questions at the end of the survey (Lor et al. 2017), or assuring respondent confidentiality (Singer, Hippler, and Schwarz 1992). However, the impact of whether and how demographic questions are asked on response rates and other data quality measures is largely unstudied. To our knowledge, there is no empirical evidence evaluating the impact of including demographic questions in a survey (Bradburn, Sudman, and Wansink 2004; Dillman 2007; Dillman, Smyth, and Christian 2009). We therefore tested several methods for reducing the sensitivity of demographic questions in a mailed survey.
Our research aims to answer the following three questions in the context of mailed paper survey administration: (1) What impact does the inclusion of demographic questions have on survey response rate? (2) Will separating the demographic questions from the rest of the survey on a standalone piece of paper that restates the optional nature of each question affect response rate? (3) Will separating and labeling the demographic questions in this way affect measurement properties, including individual item nonresponse and discordance between administrative and self-reported data? Answers to these questions could help shed light on the effectiveness of different strategies survey researchers use to mitigate the sensitivity of demographic questions across disciplines and potentially across modes.
Methods
This randomized experiment was embedded in a larger survey-based evaluation project designed to measure opinions and beliefs about community mental illness stigma. The population of interest included adults residing in six Midwest communities. The study was conducted within a large, integrated health system whose patients and members were used as a convenient proxy for the underlying communities in which they reside. The sample size (n=4,448) was selected to achieve a 5% margin of error on key outcomes in each community. The sample was randomly selected from the population and then randomly assigned to one of three paper questionnaire conditions: no demographic questions, integrated demographic questions, and standalone demographic questions (see Table 1).
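As background for the 5% target, a minimal formulation, assuming simple random sampling within each community and a conservative proportion of p = 0.5, is the standard margin of error for a proportion at 95% confidence:

$$\mathrm{MOE} = z_{0.975}\sqrt{\frac{p(1-p)}{n_c}} \approx 1.96\sqrt{\frac{0.25}{n_c}},$$

where $n_c$ is the number of completed responses in a given community.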
The four demographic questions, when included, were adapted from standard public health surveillance instruments and asked about annual household income, education, ethnicity, and race. In the integrated version, the questions were placed at the end of the survey, preceded by the following transition: “We have a few more questions about you. These questions help us understand your responses better. As a reminder, your information will be kept confidential and secure.” In the standalone version, the questions were printed on a sheet of yellow paper, in contrast to the white paper on which the main survey was printed, and were introduced with the following text: “We have a few more questions about you. These questions are optional and help us understand your responses better. As a reminder, your information will be kept confidential and secure. If you choose to fill out this page, please return it with your survey in the envelope provided.” (See supplemental materials.) The rest of the survey comprised four pages and included questions about experience with individuals impacted by mental illness, awareness, and willingness to take action.
The survey was mailed in May 2019 with a $2 bill and a cover letter describing the survey and its overall optional nature. (See supplemental materials.) A unique study identification number linked the survey content to the standalone page when applicable. As part of the sequential mixed-mode survey design, mail survey non-responders transitioned to telephone follow-up after at least 21 days, but the demographic experiment was limited to the mail phase. Mail data collection closed in September 2019.
The integrated health system from which the sample was drawn collects administrative data on ethnicity and race as reported by the member/patient and/or clinic staff. In the administrative data, ethnicity is recorded as Hispanic/Latino, not Hispanic/Latino, or unknown. Race is recorded as Native Hawaiian/other Pacific Islander, American Indian or Alaska Native, Asian, Black/African American, White, other, or unknown. If the administrative data indicated multiple races, the indicated race least common in the state according to US Census data current as of July 1, 2019, was assigned as the subject’s primary race. The same process was applied to self-reported data to create comparable measures between administrative and self-reported demographics.
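As an illustrative sketch of this recoding rule (not the analysis code), collapsing multiple reported races to a single primary race can be expressed as a small function; the ordering below is a hypothetical least-to-most-common ranking standing in for the actual US Census-based ordering.

```r
# Illustrative sketch of the primary-race assignment rule; not the analysis code.
# `state_order` is a hypothetical least-to-most-common ordering standing in for
# the actual US Census estimates current as of July 1, 2019.
state_order <- c("Native Hawaiian/Other Pacific Islander",
                 "American Indian or Alaska Native",
                 "Asian",
                 "Black/African American",
                 "Other",
                 "White")

assign_primary_race <- function(reported_races) {
  # reported_races: character vector of races indicated for one individual
  hits <- state_order[state_order %in% reported_races]
  if (length(hits) == 0) "Unknown" else hits[1]  # keep the least common reported race
}

assign_primary_race(c("White", "Asian"))  # returns "Asian" under this ordering
```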
The primary outcome variable was response rate. Because these surveys were embedded within a larger evaluation and any respondent feedback was valuable, a survey returned with at least one item completed was considered a response, regardless of whether that item was related to the primary outcomes or was a demographic item (importantly, 99% of respondents completed more than half of the survey). Secondary outcomes included demographic question completion (defined as completion of at least one question in the demographic question set) and alignment of self-reported demographics with administrative records. Endorsement of “Decline to answer,” available only for the income question, was considered nonresponse. Pearson’s chi-squared tests were used to assess differences in survey completion rates, demographic item completion, and discordance between self-reported and administrative demographics. Surveys returned as undeliverable due to invalid postal addresses were removed from analyses. Only individuals with known self-reported and administrative demographic characteristics were included in analyses of data source alignment. Statistical analyses were conducted in R version 3.6.1 (R Core Team 2019).
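For illustration, the condition-level comparisons reduce to Pearson’s chi-squared tests of independence on contingency tables of counts; a minimal sketch with hypothetical placeholder counts (not the study data) is:

```r
# Minimal sketch of the response rate comparison across the three conditions.
# Counts are hypothetical placeholders, not the study data.
responded     <- c(none = 500, integrated = 490, standalone = 490)
not_responded <- c(none = 980, integrated = 990, standalone = 990)
chisq.test(rbind(responded, not_responded))  # Pearson's chi-squared test of independence
```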
This project was embedded in a program evaluation designed for future program planning that was deemed exempt from institutional review board (IRB) oversight by the organization’s IRB.
Results
Demographic characteristics of the 4,448 sampled individuals as recorded in the administrative data were balanced across the three conditions (see Table 1). The randomly selected members and patients ranged in age from 18 to 102 years, with most between 55 and 64 years old; the sample was primarily female, commercially insured, not Hispanic/Latino, and White. Of the 4,448 surveys fielded, one was returned as undeliverable due to an invalid address and was removed from subsequent analyses. Of the remaining 4,447 individuals, 1,487 responded to the survey by mail (33.4% response rate; AAPOR Response Rate 2) (American Association for Public Opinion Research 2016). While a survey returned with at least one item completed was considered a survey response, over 99% of respondents completed at least half of the survey items. Mail response rates were 34.2% when no demographic questions were included, 33.1% when demographic questions were standalone, and 33.0% when demographic questions were integrated into the main survey. These differences were not statistically significant (Table 2; χ2 = 0.545, p = 0.762). Overall, responders were more likely than nonresponders to be female, older, White, and covered by Medicaid. This pattern held across arms except for gender: only in the standalone arm were women significantly more likely to respond (results not shown).
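As a check on the reported figures, the overall rate follows directly from the counts above:

$$\text{response rate} = \frac{1{,}487}{4{,}447} \approx 0.334,$$

or 33.4%, counting any survey returned with at least one completed item as a response and excluding the one undeliverable survey from the denominator.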
Overall, in the two arms that included demographic questions (integrated and standalone), 30.5% of sampled individuals answered at least one demographic question. A larger share of the sample answered at least one demographic question when the questions were integrated (32.7%) than when they were standalone (28.3%; χ2 = 6.38, p = 0.012; Table 2). Among survey respondents, 99.0% answered at least one demographic question when the questions were integrated, compared with 85.5% when they were standalone (χ2 = 60.0, p < 0.001).
Item nonresponse rates ranged from 0.41% to 16.5% across demographic questions and arms (see Table 3). Item nonresponse to the income question was higher among respondents in the integrated condition than among respondents in the standalone condition (16.5% versus 10.5%; χ2 = 6.46, p = 0.011). However, despite this higher item nonresponse rate, the integrated condition ultimately produced more data on respondent income than the standalone condition because of its higher overall rate of demographic question completion. Item nonresponse did not vary significantly across survey conditions for the three other demographic questions.
Across the study population, race and ethnicity information was available for 75% and 96% of the sample, respectively (Table 1). For respondents whose race and ethnicity were known through administrative records and who also self-reported race and ethnicity on the survey, discordance ranged from 0.6% to 1.0%. We found no significant differences between discordance rates of the integrated and standalone study arms (for ethnicity, χ2 = 0.02, p = 0.88 and for race, χ2 = 0.22, p = 0.64; see Table 4).
Discussion
We sought to answer three questions. (1) What impact does including demographic questions have on survey response rate? We found no impact. (2) Will separating the demographic questions from the rest of the survey on a standalone page impact response rate? We found no impact. (3) Does separating the demographic questions onto a standalone page impact measurement properties, including individual item nonresponse or discordance between administrative and self-reported data? We found that whether the demographic questions are integrated or standalone can influence a respondent’s choice to answer certain demographic questions. When demographic questions were integrated, more respondents answered at least one demographic question than when they were standalone. However, the integrated format also produced more item nonresponse on the income question: six percentage points more respondents declined to provide information about their income when demographics were integrated into the survey than when they were standalone. Pragmatically, our results suggest that embedding demographic questions in a survey (as opposed to on a separate page) may result in more usable demographic data. Respondents hesitant to answer demographic questions may complete the survey but not mail back the standalone demographic page, whereas if those questions are embedded in the survey, they may return the survey with at least some responses to demographic items. Nonetheless, the higher rate of missing income data in the integrated condition could result in different estimates of income if the choice not to respond is correlated with income. While the spread of the data could be investigated empirically, our current design cannot disentangle the effects of selective item nonresponse, unit nonresponse, and measurement properties. With more robust frame data, future work could explore this, as well as the differential impact of post-survey missing data adjustments on estimates of demographic characteristics and their correlation with other survey content, which is beyond the scope of this work.
Importantly, there was no impact on measurement error; self-reported ethnicity and race corresponded closely with administrative data, disagreeing in at most 1% of cases. These data suggest that standalone demographic questions can impact measurement through differential item nonresponse, but not directly through measurement error.
This work shows that demographic questions can be included in a mailed survey with confidence when needed. Response rates were not negatively impacted, and self-reported race/ethnicity showed high concordance with administrative data. Integrating the questions into the body of the survey appears optimal. While results may vary by survey mode, population, and topic, among other survey design factors, our results give confidence that the inclusion of these important questions in a mail survey does not bias the resulting data.
Our findings have limitations. The experiment was conducted in a single institution that serves a relatively homogeneous population. Compared with the adult population of Minnesota, sampled individuals were younger and more likely to be female, White, and insured. Compared with the United States overall, Minnesota is relatively White (U.S. Census Bureau 2019). It is possible that recipients’ existing relationships with the sponsor engendered trust and thus affected willingness to return demographic information. The study was limited to four demographic constructs and a single question for each. We saw a different pattern with income nonresponse, suggesting, not surprisingly, that not all demographic constructs are equal. We did not test variations of questions for each construct and are not commenting on the appropriateness of these specific questions; our findings are limited to the specific constructs and questions used. Results may also vary for surveys that collect more demographic questions than the four we tested.
The study was constrained to the first part of a sequential mixed-mode survey (mail with phone follow-up) because the manipulation could not be fully replicated in phone administration. Results may differ with other modes but could be relevant to self-administered web-based questionnaires. The experiment was embedded in a survey about the stigma of mental illness, and this topic’s salience could be correlated with a respondent’s willingness to volunteer demographic information. Similarly, for this experiment we achieved an approximately 33% mail survey response rate; the impact of demographic questions may differ at other points along the continuum of response, a relationship that could be evaluated in future work. Finally, there was unavoidable confounding between survey length and the presence of demographic questions. However, this concern is mitigated somewhat by the absence of response rate differences across demographic conditions.
Our study also had important strengths. We utilized a randomized design embedded in a large study without unduly increasing respondent burden or compromising the main study objectives. We had robust frame data that enabled insight into measurement error across conditions, and we achieved a relatively high response rate, which assuages concern about differences along the response continuum.
Conclusion
Understanding the sociodemographic characteristics of survey respondents is important for giving responses context. Moreover, understanding how beliefs, opinions, and behaviors may differ across subpopulations is critical if underlying disparities are to be documented and ultimately addressed. Here we have shown that relying on self-reported demographics does not negatively impact survey performance, further warranting their inclusion. Future work should replicate this experiment in other modes and with other demographic questions, and should consider the potential differential impact of demographic question presentation for subpopulations; for example, in our data there was a signal of a differential impact on response rate by gender. We have provided evidence to support the inclusion of demographic questions in surveys, enabling the important work of documenting and analyzing disparities when administrative data are not available.
Funding
This study was funded by Lakeview Health Foundation and HealthPartners. While the authors are employed by these funding sources, the sponsors themselves had no role in study design; in collection, analysis, and interpretation of data; in report writing; or in the decision to submit for publication.
Declaration of interest statement
Declarations of interest: none