Population health surveillance is a core function of public health (Teutsch and Churchill 2000). Obtaining reliable and valid assessments of population health status is paramount to avoiding critical delays in the development of intervention programs and to offsetting preventable chronic health problems (morbidity) and premature mortality. Furthermore, trends in general, or overall, population health and disease may disguise disparities that exist between demographically distinct subpopulations defined by age, sex, and race/ethnicity, which, in turn, can produce disparities in the timely response to public health crises. Two predominant sources of drug use surveillance data in the United States (U.S.) are the National Survey on Drug Use and Health (NSDUH), an annual, cross-sectional survey of randomly selected individuals ages 12 and older, and the National Epidemiologic Survey on Alcohol and Related Conditions (NESARC), a longitudinal survey consisting of three waves of data collection (NESARC-I: 2001–2002, NESARC-II: 2004–2005, and NESARC-III: 2012–2013). Both surveys utilize a multi-stage sampling design with socio-demographic stratification and provide analysts with post-stratification weights to ensure that the samples are representative of the U.S. population. The resulting data represent the civilian, noninstitutionalized population and are used to generate prevalence estimates of substance use along with behavioral, mental, and general health correlates. Although the sampling methods differ in a number of ways, the weighting procedures drawing on the 2010 U.S. Census should counteract the sampling differences (Grucza et al. 2007). While both surveys are intended to produce nationally representative estimates, there are important methodological considerations that could influence results.

The NESARC-III was sponsored by the National Institute on Alcohol Abuse and Alcoholism. Prior to the main study, a field test was implemented to refine the protocol, instruments, materials, and procedures for the main study. Primary sampling units (PSUs) were individual counties or groups of contiguous counties; secondary sampling units were groups of U.S. Census–defined blocks; and tertiary sampling units were households within the secondary sampling units. From the more than 3100 counties in the United States, the final number of PSUs created for NESARC-III was 2349. Finally, eligible adults within sampled households were randomly selected. Hispanic, Black, and Asian households were oversampled, and in households with at least 4 eligible individuals who were ethnic or racial minorities, 2 respondents were selected. Prior to arrival, an advance letter was sent to prospective households. Approximately 1000 trained interviewers carried out the main study procedures from April 2012 to June 2013. Interviewers utilized a variety of “plain language” materials to assist in the recruitment and data collection process, including brochures, nonresponse letters, language identification cards, and flashcard booklets. A standalone computer assisted personal interviewing (CAPI) system was used to ascertain demographic information prior to collecting consent to participate; this information was appended to the final interview data. Afterwards, participants were provided a $45 incentive for participating. Following the interview, a second $45 incentive was provided for completing the study. Participants also provided contact information for quality control purposes. Each participant was asked questions through the CAPI software about background and lifestyle, such as age and education; drinking practices; and related mood, anxiety, behavior, and personality, using the Alcohol Use Disorder and Associated Disabilities Interview Schedule 5 (AUDADIS-5) modules. Following the interview, participants who consented provided a saliva sample.

The NSDUH is sponsored by the Center for Behavioral Health Statistics and Quality within the Substance Abuse and Mental Health Services Administration. PSUs comprised approximately 500,000 area segments (groups of adjacent census blocks). The first stage of sampling involved selection of eight such segments from each of 900 geographic ‘field interviewer’ (FI) regions. The frames for the second stage of sampling consisted of lists of all dwelling units within segment boundaries. Samples of dwelling units were selected from these lists, and individuals were selected from rosters obtained by dwelling unit visits. Approximately 700 field interviewers visited homes and collected data from participants. All NSDUH surveys conducted after 1999 utilized a computer assisted interviewing (CAI) methodology comprising a core and supplement structure. The core set of questions remains constant from year to year and covers demographic items and the use of tobacco, alcohol, marijuana, cocaine, crack cocaine, heroin, hallucinogens, inhalants, pain relievers, tranquilizers, stimulants, and sedatives. The supplement questions can be revised, dropped, or modified from year to year, although some have remained constant since their initial use (e.g., health insurance coverage). Responses to sensitive questions were collected using audio computer assisted self-interviewing (ACASI), in which participants listened to prerecorded questions through headphones and entered responses directly without assistance from the interviewer. Participants received a $30 incentive for completing the study procedures.

Grucza et al. (2007) previously compared estimates from the 2002 NSDUH with estimates from wave 1 of the NESARC, administered in 2001–2002. The authors concluded that prevalence estimates for all substance use outcomes (lifetime, past year, and substance use disorder) were higher in the NSDUH data than in the NESARC data in the general population of U.S. adults ages 18 and older. Models adjusting for sex, age, and race/ethnicity suggested that differences in estimates existed (.01 < P < .05), but these differences were not thoroughly described. Additionally, it was unclear whether unstable estimates were suppressed where necessary according to the NSDUH documentation (Substance Abuse and Mental Health Services Administration 2013). Since then, several studies have uncovered potential differences in substance use estimates across surveys and subpopulations. For example, Ryan et al. (2012) documented differences in current smoking estimates between the NSDUH and the National Health Interview Survey (NHIS), with greater differences among Hispanics relative to other groups and NSDUH estimates being higher in general. In fact, comparisons using the NSDUH surveys consistently demonstrate that these data yield higher estimates of substance use behavior and mental health indicators in the general population (Grucza et al. 2007; Hedden et al. 2012). Pemberton et al. (2013) identified discrepancies in estimates of health status and healthcare utilization by age, gender, and race/ethnicity; however, they did not speculate as to why subgroups differed. Only one study (Ryan et al. 2012), comparing NSDUH and NHIS estimates of current and daily cigarette smoking, identified Hispanic respondents as having the greatest differences in estimates. The authors speculated that Hispanics may be the most sensitive to differences in smoking variable definitions. No study to date has compared substance use estimates between the NESARC-III and the comparable NSDUH survey by demographic subpopulations defined by age, sex, and race/ethnicity.

Uncovering differences in substance use prevalence among demographic subpopulations has important implications for policymaking, prevention, intervention, and treatment programming. The current opioid epidemic and accompanying trends of increasing “deaths of despair” underscore the importance of obtaining accurate prevalence estimates among demographic subpopulations (Case and Deaton 2015; Monnat et al. 2019). Without accurate estimates of substance use prevalence and the capability to detect trends of increasing prevalence, it is impossible to recognize population level problems and intervene before increases in mortality occur. Further, uncovering disparities within racial/ethnic groups is important for furthering our understanding of phenomena such as the “Hispanic paradox,” the lower mortality risk of Hispanic adults relative to their non-Hispanic White peers (Fenelon 2013). Given the important role that smoking-attributable mortality plays in the Hispanic paradox (Fenelon 2013) and previous research noting potential discrepancies in cigarette use among Hispanic participants in nationally representative surveys (Ryan et al. 2012), there is a continued need to monitor and perform comparative analyses across population studies. Such analyses can uncover disparities that affect our ability to conduct rigorous health assessments and to draw valid conclusions about differences in health outcomes across demographic groups.

The aim of this analysis is to compare prevalence estimates of past year substance use across these national surveys by age, sex, and race/ethnicity. Although precise agreement between data sources is not expected because the surveys differ in purpose and methodology, uncovering a lack of agreement can inform future waves of data collection and help remediate known methodological issues contributing to observed differences. Using epidemiological methods to identify the strengths and weaknesses of different surveillance tools will assist public health professionals in providing a comprehensive picture of physical, mental, and behavioral health and related indicators in the United States.

Methods

Sample

This analysis was completed using secondary, publicly accessible data for the NESARC and NSDUH surveys available at https://www.niaaa.nih.gov/research/nesarc-iii/nesarc-iii-data-access and https://www.datafiles.samhsa.gov/study/national-survey-drug-use-and-health-nsduh-2012-nid13601, respectively. The total sample size of the NESARC-III survey was 36,309 respondents. The screener- and person-level response rates were 72.0% and 84.0%, respectively, yielding a total NESARC-III response rate of 60.1%. In the NSDUH data, adolescents and young adults were oversampled, with one-third of the sample in each of three age groups: 12–17, 18–25 and 26+. The total sample size for the 2012 NSDUH survey was 55,268 and the weighted overall response rate was 73.0%. While the NSDUH survey includes civilian participants ages 12 and older, the NESARC survey was only administered to adults (ages 18 and older). As such, NSDUH participants ages 12–17 (n = 17,399) were excluded from this analysis. Complete sampling procedures are detailed elsewhere for the NESARC (Grant et al. 2014) and NSDUH surveys (Substance Abuse and Mental Health Services Administration 2013).

Measures

Comparable survey items were identified between the NESARC and NSDUH surveys on past year substance use, age, biological sex, and race/ethnicity. Data were recoded where necessary to estimate past year prevalence of alcohol, cigarette, and marijuana use and painkiller misuse (Supp Table 1). Some discrepancies exist in the determination of participant gender and in item wording that could contribute to differences in results. In the NSDUH, the interviewer is prompted to record the respondent’s gender and confirm it with the respondent, whereas the NESARC survey asks the respondent to identify their sex as male or female. Participants may distinguish between sex (assigned at birth) and gender (self-identified, which may or may not correspond to sex assigned at birth). For cigarette use, the NESARC survey asks if respondents had at least 1 cigarette, while the NSDUH allows respondents to consider smoking part or all of a cigarette. Finally, the item assessing painkiller use in the NESARC survey is not directly presented as recreational, or non-medical, use, whereas the NSDUH survey specifically asks about using a painkiller “that was not prescribed for you or that you took only for the experience or feeling it caused?”.

Analysis

Using the SVY command (Stata version 14), estimates were computed for the total population under study in both surveys. Then, estimates were computed for each combination of biological sex (male, female), race/ethnicity (non-Hispanic White, non-Hispanic Black, Hispanic, Other), and age group (18–25, 26–34, 35–49, 50+), producing 32 estimates per substance per survey (see Supp Table 2 for unweighted sample sizes). Estimates were suppressed for the NSDUH survey according to the NSDUH methodological documentation (Substance Abuse and Mental Health Services Administration 2013). Finally, the two-proportion z-test was used to compare estimates between the NSDUH and NESARC surveys at a significance level of α = 0.05. Estimates (weighted population proportions) with 95% confidence intervals are presented in the figures.
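As a rough illustration of the comparison step, the sketch below implements a pooled two-proportion z-test in Python. It is not the authors' Stata procedure: the function name `two_proportion_z_test` is hypothetical, and pairing weighted proportions with unweighted sample sizes is a simplifying assumption that ignores the complex survey design (stratification, clustering, and weighting), which would typically widen the standard errors. The inputs are the total-population past year cigarette estimates reported in the Results and the sample sizes given under Sample.

```python
import math


def two_proportion_z_test(p1, n1, p2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value).

    Hypothetical helper for illustration; treats both samples as simple
    random samples, which the surveys compared here are not.
    """
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p2 - p1) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Illustrative inputs (assumption: unweighted n stands in for effective n).
nesarc_p, nesarc_n = 0.235, 36309          # NESARC-III total sample
nsduh_p, nsduh_n = 0.276, 55268 - 17399    # 2012 NSDUH minus ages 12-17

z, p_value = two_proportion_z_test(nesarc_p, nesarc_n, nsduh_p, nsduh_n)
alpha = 0.05
print(f"z = {z:.2f}, p = {p_value:.3g}, significant = {p_value < alpha}")
```

A design-based alternative would compare the estimates using standard errors that reflect the survey design rather than the raw sample sizes; the sketch above only shows the mechanics of the test named in the text.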

Results

First, estimates of alcohol, cigarette, marijuana, and painkiller use were compared across the total population. Prevalence estimates were significantly higher (all P < 0.001) in the NSDUH survey compared to the NESARC survey for use of cigarettes (NESARC: .235, 95% CI .227–.243; NSDUH: .276, 95% CI .269–.283), marijuana (NESARC: .095, 95% CI .090–.101; NSDUH: .121, 95% CI .116–.126), and painkillers (NESARC: .041, 95% CI .038–.044; NSDUH: .047, 95% CI .044–.051), but not for alcohol (NESARC: .727, 95% CI .715–.739; NSDUH: .709, 95% CI .699–.717).

Past Year Alcohol Use Prevalence by Race/Ethnicity, Sex, and Age

Among non-Hispanic Whites, there were no significant differences in estimates of past year alcohol use prevalence by biological sex or age group. However, for African Americans, NESARC estimates were higher for males ages 50 and older (NESARC: .64; NSDUH: .56). Most discrepancies among estimates were observed for Hispanic respondents. Among Hispanics, NESARC estimates were significantly higher for both males and females 26–34 years old (Males NESARC: .85; Males NSDUH: .78; Females NESARC: .71; Females NSDUH: .66), for males 35–49 years old (NESARC: .79; NSDUH: .74), and for females 50 years and older (NESARC: .52; NSDUH: .32) (Fig. 1).

Fig. 1

Comparisons of past year alcohol use prevalence estimates by race/ethnicity, gender, and age between NESARC (white circles) and NSDUH (black diamonds) surveys with 95% confidence intervals. Black diamonds are missing where NSDUH estimates were suppressed. Significant differences (P < 0.05) in estimates indicated by asterisk

Past Year Cigarette Use Prevalence by Race/Ethnicity, Sex, and Age

Figure 2 shows that of the 23 estimates produced (nine suppressed in NSDUH), 56.5% (13/23) were discrepant, with NSDUH estimates significantly larger in all cases. Among non-Hispanic Whites, estimates for males and females differed for the 18–25 year-old age category (Males NESARC: .33; Males NSDUH: .52; Females NESARC: .31; Females NSDUH: .41), the 26–34 year-old age category (Males NESARC: .40; Males NSDUH: .51; Females NESARC: .34; Females NSDUH: .39), and the 35–49 year-old age category (Males NESARC: .29; Males NSDUH: .35; Females NESARC: .28; Females NSDUH: .31), but not for the 50+ year-old category. For African Americans, NSDUH estimates were higher across sex for the 18–25 year-old age category only (Males NESARC: .22; Males NSDUH: .38; Females NESARC: .16; Females NSDUH: .28). With the exception of the 50+ age category, NSDUH estimates were larger among Hispanic males (18–25 age category NESARC: .22; NSDUH: .43; 26–34 age category NESARC: .27; NSDUH: .34; 35–49 age category NESARC: .23; NSDUH: .29). For Hispanic females, NSDUH estimates were significantly higher for the 18–25 year-old age category (NESARC: .13; NSDUH: .29). Non-Hispanic Other males ages 18–25 also showed discrepant estimates (NESARC: .16; NSDUH: .39).

Fig. 2

Comparisons of past year cigarette use prevalence estimates by race/ethnicity, gender, and age between NESARC (white circles) and NSDUH (black diamonds) surveys with 95% confidence intervals. Black diamonds are missing where NSDUH estimates were suppressed. Significant differences (P < 0.05) in estimates indicated by asterisk

Past Year Marijuana Use Prevalence by Race/Ethnicity, Sex, and Age

Past year marijuana use prevalence in the NSDUH was higher for all age categories among non-Hispanic White males (18–25 age category NESARC: .30; NSDUH: .38; 26–34 age category NESARC: .21; NSDUH: .28; 35–49 age category NESARC: .11; NSDUH: .15; 50+ age category NESARC: .05; NSDUH: .07), but among non-Hispanic White females only the 18–25 year-old age category (NESARC: .21; NSDUH: .29) and the 26–34 year-old age category (NESARC: .11; NSDUH: .14) varied significantly. Similar to the finding for prevalence of past year cigarette use, NSDUH estimates were higher across sex for the 18–25 year-old age category only for African Americans (Males NESARC: .31; Males NSDUH: .37; Females NESARC: .19; Females NSDUH: .31). Hispanic males ages 18–25 years old (NESARC: .22; NSDUH: .31) and Hispanic females in all age categories except the 50+ category (18–25 age category NESARC: .16; NSDUH: .21; 26–34 age category NESARC: .07; NSDUH: .11; 35–49 age category NESARC: .03; NSDUH: .05) evidenced significantly higher rates of past year marijuana use in the NSDUH survey (Fig. 3).

Fig. 3

Comparisons of past year marijuana use prevalence estimates by race/ethnicity, gender, and age between NESARC (white circles) and NSDUH (black diamonds) surveys with 95% confidence intervals. Black diamonds are missing where NSDUH estimates were suppressed. Significant differences (P < 0.05) in estimates indicated by asterisk

Past Year Non-medical Painkiller Use Prevalence by Race/Ethnicity, Sex, and Age

As seen in Fig. 4, of the 9 discrepant estimates for past year non-medical painkiller use, the majority of NSDUH estimates were higher. For non-Hispanic Whites, NSDUH estimates were higher for males ages 18–25 years old (NESARC: .07; NSDUH: .14) and 26–34 years old (NESARC: .07; NSDUH: .11) as well as females ages 18–25 years old (NESARC: .05; NSDUH: .10); however, the NESARC estimate was higher for females in the 50+ category (NESARC: .031; NSDUH: .015). For African-Americans, a discrepancy was observed for females ages 18–25 years old (NESARC: .06; NSDUH: .09). Lastly, Hispanic males in all age groups except the 50+ category (18–25 age category NESARC: .05; NSDUH: .07; 26–34 age category NESARC: .04; NSDUH: .07; 35–49 age category NESARC: .02; NSDUH: .05) and females in the 26–34 year-old age category (NESARC: .03; NSDUH: .07) had significantly higher prevalence rates in the NSDUH survey.

Fig. 4

Comparisons of past year non-medical painkiller use prevalence estimates by race/ethnicity, gender, and age between NESARC (white circles) and NSDUH (black diamonds) surveys with 95% confidence intervals. Black diamonds are missing where NSDUH estimates were suppressed. Significant differences (P < 0.05) in estimates indicated by asterisk

Overall, across race/ethnicity, sex, and substance, significant differences were most common among respondents 18–25 years old (17/28 comparisons, 60.7%) and 26–34 years old (11/22, 50%). Across substances, age groups, and sex, the proportion of significantly different comparisons was highest among Hispanics (16/27, 59.3%), followed by non-Hispanic Whites (16/32, 50%), individuals identified as Other race (1/4, 25%), and non-Hispanic Blacks (6/30, 20%). Very few estimates were produced for the non-Hispanic Other subgroup across sex in the NSDUH when implementing the suppression criteria, suggesting that sample sizes were insufficient for deriving reliable estimates for these groups. In general, estimates were higher for the NSDUH survey, but patterns of substance use prevalence by race/ethnicity, age, and sex were similar across surveys.
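The percentages in this summary follow directly from the counts of significant comparisons. A minimal Python sketch of that arithmetic, using only the counts reported in the paragraph above (the dictionary and variable names are illustrative and not from the source):

```python
# (significant comparisons, non-suppressed comparisons) as reported in the text.
significant_by_group = {
    "Ages 18-25": (17, 28),
    "Ages 26-34": (11, 22),
    "Hispanic": (16, 27),
    "Non-Hispanic White": (16, 32),
    "Other race": (1, 4),
    "Non-Hispanic Black": (6, 30),
}

# Share of comparisons that differed significantly, as a percentage.
shares = {
    group: round(100.0 * significant / total, 1)
    for group, (significant, total) in significant_by_group.items()
}

for group, share in shares.items():
    significant, total = significant_by_group[group]
    print(f"{group}: {significant}/{total} = {share:.1f}%")
```

Because the denominators exclude suppressed NSDUH estimates, the groups are compared on different numbers of cells (for example, 4 for Other race versus 32 for non-Hispanic Whites), which is worth keeping in mind when reading the percentages side by side.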

Discussion

This analysis contributes to the literature on well-documented discrepancies in reports of health and behavior outcomes from surveillance tools at the general population level (Borgo et al. 2019; Hall et al. 2012; Lewycka et al. 2019; Nelson et al. 2003; Ryan et al. 2012; Singleton et al. 2019), and extends previous findings by documenting differences in estimates across demographic subgroups. Overall, we observed similar patterns of substance use prevalence by race/ethnicity, sex, and age between the NSDUH and NESARC surveys. However, there were several notable discrepancies in past year substance use prevalence across subgroups of respondents in the NSDUH and NESARC surveys that likely drove the discrepancies observed at the general population level in previous work (Grucza et al. 2007).

While response rates and sampling procedures were similar across the two assessments, there are notable differences in the methodology of the NSDUH and NESARC that could have important ramifications for the results presented here. First, younger respondents and non-Hispanic Whites were oversampled in the NSDUH survey, whereas minority households were oversampled in the NESARC. These differences in sample composition may have contributed to the larger discrepancies in substance use estimates observed in some ethnic minority groups.

Second, person level factors, such as generational or immigration status, may also contribute to the observed discrepancies. According to the NESARC data, approximately 17.63% of respondents reported being born outside of the U.S. However, data on generational and immigrant status were not collected in the NSDUH survey, which did not allow us to investigate this factor as an underlying cause of the differences we reported. Given that immigrant status may discourage respondents from answering sensitive questions that could be perceived to jeopardize their U.S. residency, these are important data to collect across nationally representative surveys. In the present study, it is possible that the discrepancies between Hispanic and non-Hispanic subpopulations could be explained by intra-ethnic differences. For instance, Mexican American respondents may have been wary of reporting illicit drug use or other illicit behaviors, as they might fear retaliation towards themselves or their families, whereas Cuban Americans (whose residency status is supported by the Cuban Adjustment Act) may not share the same fear or hesitation. In addition, the ability to disaggregate Hispanics into more homogeneous sub-groups of shared ethnic descent, such as Mexican Americans, Cuban Americans, Puerto Rican Americans, and so forth, has important implications for better understanding the role of substance use in other health phenomena like the well-documented Hispanic Paradox (Fenelon 2013; Lariscy et al. 2015; Fishman et al. 2018) and racial/ethnic disparities among “deaths of despair” (Case and Deaton 2015).

Third, the majority of past year substance use prevalence estimates were suppressed in the NSDUH for respondents classified as a race/ethnicity other than non-Hispanic White, non-Hispanic Black, or Hispanic of any race, aside from males ages 18–25. The underrepresentation of racial/ethnic minority groups in U.S. national surveys is problematic, especially given that Asian Americans, Native Americans/Alaska Natives, and Hawaiians/Pacific Islanders make up nearly 8% of the U.S. population (U.S. Census Bureau 2019). These groups might face unique substance related problems that are not being captured in nationally representative surveys, limiting the ability of researchers to tailor programs to address the needs of these populations (Jernigan et al. 2018; Maxwell et al. 2012). Ideally, experimental methods could be used to ascertain which differences in survey methodology might contribute to these observed differences (Grucza et al. 2007); however, there are other potential strategies that could be implemented to ameliorate procedural elements that could influence responses to self-report measures of substance use and related outcomes. These methodological weaknesses contribute to a persistent lack of understanding of both age-based and racial/ethnic-based substance use disparities and their links to subsequent health outcomes.

We recommend the following strategies to increase the accuracy, validity, and representativeness of data used to estimate population level substance use prevalence. First, use of a standardized survey protocol across surveillance indices can reduce the chance of misinterpretation of questions or response choices. For example, the PhenX toolkit can provide consistent measures across surveys to ensure high quality, well-established, reproducible, broadly applicable, and low burden items for participants and data collectors (Hamilton et al. 2011). Second, future data collection efforts should collect data on immigration status to enrich our understanding of immigrant health behaviors and outcomes relative to U.S. born counterparts. Finally, we strongly recommend increasing the sample size of historically underrepresented racial/ethnic minority groups in order to derive more reliable estimates and support efforts to track substance use in underrepresented groups including, for example, American Indians/Alaska Natives, Asian Americans, and citizens of Middle Eastern descent. This is in line with recommendations by the U.S. Department of Health and Human Services (U.S. DHHS) to create a set of uniform data collection standards for inclusion in surveys conducted or sponsored by DHHS (U.S. Department of Health and Human Services 2015). Increasing the diversity of respondent pools will also allow researchers to conduct more thorough analyses by describing both inter-ethnic and intra-ethnic differences in substance use and health behavior estimates.

Surveillance is a critical component of effective public health action, and the reliability and validity of the data in surveillance systems are paramount for identifying signals that prompt a public health response before large-scale mortality increases occur, such as the current “deaths of despair” phenomenon (Case and Deaton 2015; Woolf and Schoomaker 2019), which may be further exacerbated by the COVID-19 pandemic (Faust et al. 2020). We concur with Groseclose and Buckeridge (2017) that surveillance systems should continue to be evaluated to assess their accuracy, efficiency, and opportunity to contribute to public health goals. This descriptive evaluation of substance use estimates contributes to ongoing work that supports building a surveillance infrastructure that can effectively characterize dynamic and heterogeneous populations, such as that of the U.S., in order to respond to crises in a proactive rather than reactive manner.

Limitations

First, without event level data or similar measures of more recent substance use (e.g., past month) between surveys, we are unable to differentiate between current, past, and ever (i.e., experimental) users or to derive accurate assessments of quantity of use. Second, as this was a secondary analysis of existing data, the parameters used to define age, sex, and race/ethnicity may not represent developmentally or culturally meaningful subgroups. Third, use of the suppression criteria for NSDUH estimates resulted in few estimates for the “Other” race category. Future surveys should attempt to consistently oversample minority populations to obtain substance use prevalence for traditionally underrepresented minority groups, and include items on immigrant and generational status. Fourth, these samples represent the general, non-institutionalized U.S. population and cannot be used to derive estimates for incarcerated persons or active duty military personnel, although these groups may be at risk for substance use behavior and require special attention (Albright et al. 2019; Newbury-Birch et al. 2018). Fifth, we excluded data from participants ages 12–17 collected as part of the NSDUH survey, as there was no comparison group from the NESARC study. Given the recent public health concern around escalating e-cigarette use in this age group (Berry et al. 2019; Soneji et al. 2017) and the unique characteristics that contribute to substance use initiation and escalation in this developmental period (Brook et al. 2016; Grigsby et al. 2016; Soneji et al. 2017), there is a need to integrate adolescent substance use (and general health) behaviors into all existing nationally representative surveillance surveys.

Health professionals and researchers should be cautious in using the NSDUH and NESARC surveys to derive point estimates of substance use prevalence in the general population or by demographic characteristics such as race/ethnicity, age, and biological sex. If prevalence estimates are derived from these surveys, confidence intervals should be presented and estimates from two or more surveys should be used to present the potential variability that exists in estimating the true prevalence of any substance use behavior. Given that patterns of substance use prevalence were similar across surveys, conclusions regarding differences in substance use between groups (i.e., that subpopulations have higher or lower rates of substance use relative to others) can still be presented with relative confidence. Population health surveillance tools like the NSDUH and NESARC surveys remain valuable assets in tracking the health and well-being of the U.S. population, and with further refinement they can serve an invaluable role in tackling the ongoing addiction crisis and assist public health professionals in identifying potentially harmful trends that warrant a proactive response to maximize the public health impact of preventive interventions.