Introduction
In 2022, about one in eight people (11.5%) in Nebraska spoke a language other than English at home (U.S. Census Bureau 2022d). Spanish is the second most spoken language in Nebraska after English: about 61% of Nebraska adults who speak a language other than English at home speak Spanish. Additionally, about half of the Spanish-speaking population is considered limited English proficient (U.S. Census Bureau 2022b). Thus, surveys aiming to include all adults in Nebraska must include languages other than English, and Spanish in particular. This paper examines a survey experiment that includes Spanish-language materials in a concurrent mixed-mode mail-and-web survey.
According to social exchange theory, individuals are more likely to agree to a survey request when the perceived benefits of participation outweigh the perceived costs (Dillman, Smyth, and Christian 2014). When surveys are translated into Spanish, the cost of participating is presumably lowered for adults who prefer Spanish, including monolingual Spanish speakers and bilingual adults who prefer Spanish over English. In this sense, adding Spanish-language materials should reduce the barriers faced by Spanish speakers, thereby increasing their participation in surveys conducted in Spanish and ultimately improving overall response rates.
Despite the potential benefits of adding language-specific materials, evidence on their efficacy in self-administered mail or mixed-mode surveys is mixed. Supplemental Table S1 summarizes past studies that experimentally tested adding Spanish-language materials to mail or mixed-mode surveys. Including Spanish-language materials sometimes increased response rates (Brick et al. 2012; Govern and Reiser 2008), sometimes had no effect (Brick et al. 2019; Leary et al. 2022), and sometimes even decreased response rates, a “backfire” effect (Brick et al. 2019; Wagner et al. 2023). In each case, the magnitude of the effect was less than 5 percentage points.
One important variation across studies is the groups to whom the materials were distributed. Early experiments targeted Spanish-language materials to areas that had higher concentrations of Spanish-speaking individuals or to those with Hispanic surnames, showing small response rate increases (Bouffard and Tancreto 2006; Brick et al. 2012; Govern and Reiser 2008). Recent experiments distributed Spanish-language materials more broadly, including to areas with lower concentrations of Spanish speakers, generally with small or negative effects on response rates (Brick et al. 2019; Leary et al. 2022; Wagner et al. 2023). It is not clear whether the different effects in these studies reflect different target populations or a decline in the effectiveness of Spanish-language materials over time.
This study experimentally sends Spanish-language survey materials to a general adult population in the state of Nebraska. Our first research question is about response rates:
RQ1: Does offering Spanish-language materials affect response rates and bring in more Spanish-language participants? Does the effect of offering Spanish-language materials on response rates and language of participation differ across areas or addresses that are more likely to have Spanish-speaking residents?
We test three hypotheses related to response rates and participation language. First, we anticipate higher response rates with Spanish-language materials included than with English-only materials, including a higher rate of Spanish-language questionnaires (H1). However, a backfire effect is possible, in which sample members who do not speak Spanish see the survey as not relevant to them or as more burdensome, decreasing response rates when a Spanish-language questionnaire is included (H2). Finally, we hypothesize that areas or addresses with higher probabilities of having Spanish speakers will have higher response rates and more Spanish-language completions when Spanish-language materials are offered than areas or addresses that are less likely to have Spanish speakers (H3).
An important driver of both survey costs and errors is the mode of data collection. Although the experiments in Supplemental Table S1 sent paper questionnaires to potential respondents, those few studies that offer multiple modes tend to find higher completion rates on the salient paper questionnaire than the other (web, phone) less-salient mode (Elliott et al. 2019; Leary et al. 2022). Whether respondents to a Spanish-language questionnaire tend to select a paper or web questionnaire when offered a choice is an important cost driver. Thus, we examine our second research question:
RQ2: Does the language of the questionnaire affect the selection of mode of data collection overall and by Spanish-speaking respondents?
We hypothesize that Spanish-language respondents will be more likely to participate via mail than web (H4).
Whether adding Spanish-language materials changes sample composition, especially by increasing the proportion of Hispanic or Spanish-speaking respondents, is also of interest. As such, our third research question is:
RQ3: Does offering Spanish-language materials change sample composition compared to not offering Spanish-language materials?
Some experiments have shown that Spanish-language materials recruit more Hispanic and/or Spanish-speaking respondents (Bouffard and Tancreto 2006; Brick et al. 2012; Leary et al. 2022); others have not examined this question or have not shown a difference in sample composition (Brick et al. 2019; Wagner et al. 2023). We anticipate a higher proportion of Hispanic adults and those who speak Spanish at home when Spanish-language materials are offered (H4). In Nebraska, 66.7% of Hispanic adults are aged 18 to 44, compared to 47.4% of the state overall (U.S. Census Bureau 2022a). Additionally, 53% of Hispanic adults are men, compared to 50% of adults in the state overall. Because Hispanic adults are younger and more likely to be male than Nebraska as a whole, we anticipate that offering a Spanish-language questionnaire will bring in a higher proportion of younger adults and of men than offering English alone (H5).
Data and Methods
The 2023 Workforce Study of the Central Nebraska Area, sponsored by the Nebraska Department of Labor and conducted between January and May 2023, was used for the Spanish-language experiment. This concurrent web-and-mail survey used a stratified random sample of 9,500 addresses selected by Dynata from the US Postal Service’s Delivery Sequence File for ZIP codes in the central region of the state, including the city of Grand Island. The target population was adults 19 and older (Nebraska’s age of majority) living in Nebraska (American Association for Public Opinion Research [AAPOR] Response Rate 2 [RR2]=15.5%, n=1,422). The adult in the household with the next birthday was selected as the respondent.
The sample was randomly divided into two conditions of 4,750 addresses each; 321 addresses were ineligible, reducing the total sample to n=9,179. Addresses in the English-only condition (n=4,583 eligible addresses) received all materials in English only: an initial survey packet with an eight-page English paper questionnaire, a postage-paid return envelope, and a cover letter with instructions in English for accessing the web questionnaire via link and QR code, followed by a postcard reminder and a final survey packet sent to nonrespondents. Addresses in the English-and-Spanish condition (n=4,596 eligible addresses) received all materials in both English and Spanish, including initial and follow-up survey packets with both English and Spanish paper questionnaires and cover letters, and a postcard in both languages. In both conditions, the web survey link and QR code led to a web questionnaire that offered the same Spanish-language translation, so those in the English-only condition could respond online in Spanish if they noticed the drop-down menu option (see Supplemental Materials S2-S5 for examples of the study materials in English and Spanish).
All materials, including the cover letters, postcard, and questionnaire, were translated from English to Spanish using a committee approach by the Nebraska Translator and Interpreter Corps. Two translators undertook the task independently, then compared their translations. The final versions were adjudicated by a third translation expert.
Independent Variables
Language Condition. The key independent variable is the experimental condition to which the sampled address was assigned: English only or English-and-Spanish.
Spanish-likely households. We used two variables from the frame to identify households likely to be Spanish-speaking. First, we categorized Census tracts into those with higher (≥10%, n=3,449) and lower (<10%, n=5,730) proportions of Hispanic residents. Second, we used the sample provider's indicator of whether someone at the address had a Hispanic surname (n=1,490) or not (n=7,689).
Dependent Variables
Survey Participation. Survey participation is a dichotomous indicator for whether the sampled address participated in the survey. This indicator variable was used to calculate response rates using the AAPOR RR2, which includes full and partial completions.
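As a rough illustration of the response rate arithmetic, the sketch below reproduces the overall RR2 from counts given in the text (1,422 full or partial completions; 9,500 sampled addresses, 321 of them ineligible). It assumes that every address not identified as ineligible stays in the denominator. The function name is ours for illustration, and the code is not the authors' Stata syntax.

```python
def aapor_rr2(completes_and_partials: int, sampled: int, ineligible: int) -> float:
    """Simplified AAPOR RR2: (complete + partial) / (eligible + unknown eligibility).

    Assumption: all sampled addresses that were not identified as
    ineligible are counted in the denominator.
    """
    denominator = sampled - ineligible
    return completes_and_partials / denominator


# Counts reported in the text.
rr2 = aapor_rr2(completes_and_partials=1422, sampled=9500, ineligible=321)
print(f"RR2 = {rr2:.1%}")  # RR2 = 15.5%
```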
Language. The language used for completing the survey was measured as English or Spanish.
Mode. The survey mode that the respondent selected for completing the questionnaire was web or mail.
Respondent Characteristics. Respondent characteristics included ethnicity (Hispanic, non-Hispanic); primary language spoken at home (Spanish or other languages versus English); age (in categories of 19–44, 45–64, and 65+); and sex (binary male or female). We also examined education, marital status, and employment status, which are other respondent characteristics of interest to survey researchers. We compared all demographic characteristics to benchmark population data for the study region from the American Community Survey 5-Year Estimates for 2018 to 2022 (U.S. Census Bureau 2022b).
Analysis
For RQ1, we compared unweighted response rates between the two language conditions, testing for significant differences using chi-squared tests. We examined the effect of including Spanish-language materials on response rates overall and across our indicators of higher likelihood of being a Spanish-speaking household. We also reported the proportion of responses that were provided to the English-language questionnaire and the Spanish-language questionnaire.
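The comparison of response rates across conditions can be sketched as a Pearson chi-squared test on a 2x2 table (condition by responded or not). The respondent counts per condition below are approximations that we back-calculated from the reported response rates and eligible sample sizes; they are not taken from the paper's tables. The test is computed without a continuity correction, which may differ from the exact Stata specification the authors used.

```python
import math


def chi2_2x2(resp_a: int, n_a: int, resp_b: int, n_b: int) -> tuple[float, float]:
    """Pearson chi-squared test (df=1, no continuity correction) for two proportions."""
    a, b = resp_a, n_a - resp_a          # condition A: respondents, nonrespondents
    c, d = resp_b, n_b - resp_b          # condition B: respondents, nonrespondents
    n = n_a + n_b
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For df=1, the upper-tail chi-squared probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Approximate counts (assumption): about 16.3% of 4,583 and 14.7% of 4,596.
chi2, p_value = chi2_2x2(resp_a=747, n_a=4583, resp_b=675, n_b=4596)
print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}")
```

With these approximate counts the difference is significant at the 0.05 level, in line with the result reported for the overall comparison.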
For RQ2, we examined the unweighted response mode across the language conditions. We tested whether this was different using chi-squared tests. We noted completion mode for Spanish-language and English-language respondents separately.
For RQ3, we examined whether the respondent demographic characteristics varied by condition. Missing data on the respondent characteristics was multiply imputed 20 times using mi impute; all analyses accounted for multiple imputation and the complex sample design (with probability of selection weights) using design-adjusted F-tests. All analyses were conducted in Stata 17.0.
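For readers less familiar with multiple imputation, the sketch below shows the standard way point estimates and variances from several imputed datasets are combined (Rubin's rules). It is only a conceptual illustration in Python: the analyses themselves were run in Stata, the design-adjusted F-tests are not reproduced here, and the three estimates are made-up values (the study used 20 imputations).

```python
from statistics import mean, variance


def pool_rubin(estimates: list[float], variances: list[float]) -> tuple[float, float]:
    """Combine per-imputation estimates and variances using Rubin's rules."""
    m = len(estimates)
    pooled_estimate = mean(estimates)       # average of the m point estimates
    within = mean(variances)                # average within-imputation variance
    between = variance(estimates)           # between-imputation variance (m - 1 denominator)
    total_variance = within + (1 + 1 / m) * between
    return pooled_estimate, total_variance


# Hypothetical proportions and variances from three imputed datasets.
pooled, total_var = pool_rubin([0.10, 0.12, 0.11], [0.0004, 0.0004, 0.0004])
print(pooled, total_var)
```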
Results
Response Rates. Adding Spanish-language materials significantly lowered response rates (English only=16.3%; English-and-Spanish=14.7%; Table 1), supporting the backfire effect hypothesized in H2 but not H1. Counter to H3, including Spanish-language materials lowered response rates significantly in areas with 10% or more Hispanic residents and lowered them, though not significantly, for addresses with Hispanic surnames.
Language. Despite the deliberate inclusion of Spanish-language materials, only 1.0% (n=15 respondents) participated using the Spanish-language questionnaire. Almost all Spanish-language participants came from the English-and-Spanish condition (2.1% of the English-and-Spanish respondents participated in Spanish); only one respondent selected the Spanish web survey in the English-only condition (0.1% of the English-only respondents), a significant difference across the experimental conditions, consistent with H1.
Almost all of the Spanish-language respondents came from areas with high concentrations of Hispanic residents or had Hispanic surnames as indicated in the sampling frame. Across both conditions, only 0.2% of respondents who lived in lower Hispanic concentration areas or did not have Hispanic surnames participated in Spanish. In the higher Hispanic concentration areas, 3.5% participated in Spanish, all from the English-and-Spanish condition. Furthermore, 16.7% of respondents at addresses with Hispanic surnames returned Spanish-language questionnaires, all from the English-and-Spanish condition. Among respondents with Hispanic surnames who were sent Spanish-language questionnaires, 33% participated in Spanish. Thus, consistent with H3, addresses with a higher likelihood of speaking Spanish were the users of the Spanish questionnaires when these were included in the paper materials.
Mode. In both experimental conditions, around two-thirds of respondents used mail. Both English and Spanish-language participants participated primarily by mail, although Spanish responses by mail were by definition only possible in the English-and-Spanish condition.
Demographic Composition. Sample composition did not differ significantly across the experimental conditions for almost all the characteristics we examined (Table 2). Inconsistent with H4, offering Spanish-language questionnaires did not bring in more Hispanic respondents or more respondents who speak a language other than English at home. Spanish-language materials also did not affect the distribution of sex, education, employment status, or marital status. As expected, however, offering Spanish brought in a higher proportion of respondents aged 19–44 than in the English-only condition, in partial support of H5. The average absolute deviation from the American Community Survey benchmark values across the respondent characteristics below was almost identical across the experimental conditions.
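The average absolute deviation summary can be sketched as follows: for each characteristic, take the absolute difference between the respondent percentage and the benchmark percentage, then average across characteristics. The percentages below are invented purely to show the calculation; they are not the values in Table 2 or the American Community Survey, and the function name is ours.

```python
def avg_abs_deviation(sample_pct: dict[str, float], benchmark_pct: dict[str, float]) -> float:
    """Mean absolute difference (in percentage points) between sample and benchmark."""
    deviations = [abs(sample_pct[key] - benchmark_pct[key]) for key in benchmark_pct]
    return sum(deviations) / len(deviations)


# Made-up percentages for illustration only.
benchmark = {"Hispanic": 20.0, "Aged 19-44": 45.0, "Male": 50.0}
one_condition = {"Hispanic": 8.0, "Aged 19-44": 30.0, "Male": 44.0}

aad = avg_abs_deviation(one_condition, benchmark)
print(f"Average absolute deviation: {aad:.1f} percentage points")  # 11.0
```

Computing this value separately for each experimental condition and comparing the two gives the kind of summary described above.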
With only 15 Spanish-language respondents, we cannot report much about their characteristics. Notably, 100% of the respondents who answered a Spanish-language questionnaire identified themselves as Hispanic, and 100% indicated that Spanish was the language spoken at home.
Discussion
Recruiting Spanish speakers is a common challenge in self-administered surveys. We found that adding Spanish-language materials did not increase the response rate and instead produced a “backfire” effect. The added Spanish version may have increased the perceived burden of the questionnaire (the envelope had two questionnaires instead of one) or decreased its perceived relevance for those who did not speak Spanish. We did not see any gains in response rates from adding a Spanish-language questionnaire for addresses that may be more likely to have Spanish speakers. Nor did we bring in more Hispanic respondents or respondents who reported that their primary language at home was not English.
Despite this deleterious effect on response rates, the Spanish-language questionnaires that were returned came from addresses with Hispanic surnames and from areas with higher concentrations of Hispanic adults. In fact, although only 2% of respondents in the English-and-Spanish condition participated in Spanish overall, one-third of the respondents flagged on the sample frame as having a Hispanic surname participated in Spanish when offered a Spanish-language questionnaire, largely by mail. Thus, the added cost of printing paper Spanish questionnaires, rather than simply programming a Spanish-language option in the web survey, appears important for including Spanish-speaking adults. Resource-constrained self-administered studies could target a Spanish paper questionnaire only to those addresses that are likely to have a resident with a Hispanic surname. Only 1.5% of Nebraska households are “limited English-speaking households” that speak Spanish, and only 3.5% of households in the study area are (U.S. Census Bureau 2022c). To the extent that the Spanish-language questionnaire was primarily attractive to these households, the yield from offering Spanish-language questionnaires in Nebraska will generally be low.
All studies have limitations. The questionnaire topic, employment, may be sensitive or difficult for those who are employed in more transient jobs. Literacy is a barrier for self-administered questionnaires; navigating a complex questionnaire may be particularly difficult for lower literacy adults in any language. Translation itself may be a necessary, but not sufficient, step for communicating legitimacy and trust in a survey request to those who speak languages other than English. Future work could examine augmenting these modes with interviewer-administered follow-up attempts by telephone to overcome potential literacy and trust issues. We did not offer an incentive in this study, a method often used to increase participation rates. Future work could evaluate whether incentives would increase participation in Spanish-language questionnaires. This experiment was conducted in the central part of Nebraska, a largely rural area. Future research should replicate this experiment in urban contexts. Finally, future research could explore other strategies to balance survey costs and inclusive practices with respect to language in survey designs, including whether shorter or simplified questionnaires help increase participation. Simply translating the questionnaire and materials was not enough to include diverse linguistic communities in this survey. Tackling these challenges will help uphold representation of diverse language groups, enhancing the integrity and generalizability of future studies.
Consistent with recent past research, this experiment demonstrated a backfire effect on response rates from including a Spanish-language questionnaire in a mixed-mode web-and-mail survey and yielded few Spanish-language responses. But those who did participate in Spanish were notably different from those who participated in English in Hispanic identity, language spoken at home, and area of residence. These results reinforce the importance of targeting Spanish questionnaires to those who are most likely to use them, but also highlight that translation itself is not sufficient for recruitment of diverse populations.
Acknowledgements
The authors thank Brandon Jones at the Nebraska Department of Labor for permission to conduct this experiment. A previous version of this paper was presented at the 2023 Midwest Association for Public Opinion Research annual conference.
Funding
Funding for data collection came from the Nebraska Department of Labor under contract NDOL Agreement # UNL-00056753.
Lead author’s contact information
Kristen Olson,
Mail Address: 703 Oldfather Hall, Lincoln NE 68588-0324
Email: kolson5@unl.edu