Alcohol use is a leading cause of preventable premature death worldwide, accounting for some 3 million or 5.3% of premature global deaths (World Health Organization Team et al., 2018). Almost half of alcohol-attributable disability-adjusted life years (DALYs) are due to non-communicable diseases and mental health conditions, and about 40% are due to injuries (World Health Organization Team et al., 2018). These harms are related to both the overall volume of alcohol consumption and the pattern of drinking (Rehm et al., 2009, 2010). Developing an easy-to-use clinical tool to predict adverse outcomes of alcohol consumption (i.e., a risk calculator) among non-alcohol-dependent individuals is crucial to help refine prevention strategies and promote early detection and treatment of high-risk individuals (Birmaher et al., 2018; Cannon et al., 2016; Fusar-Poli et al., 2017).

Robust evidence supports the effectiveness of screening and brief alcohol interventions in healthcare settings (Ballesteros et al., 2004; Kaner et al., 2013). While the content and delivery style of brief interventions vary (Whitlock et al., 2004), their success relies in part on their ability to promote awareness of the negative effects of drinking and to motivate positive behavior change (Ballesteros et al., 2004; Rolland et al., 2017). Most brief intervention guidelines recommend the use of the short-form 3-item AUDIT-C questionnaire for the screening of alcohol misuse (i.e., hazardous use or alcohol use disorders) (Bush, 1998; Rubinsky et al., 2013). AUDIT-C cut-off scores for alcohol misuse screening in men and women are used for discussing physicians’ concerns and for recommending reductions in alcohol consumption to the patient. However, adoption of those interventions is limited because, at present, those cut-offs allow neither (i) a precise determination of individuals’ risks of adverse outcomes, which may be differentially influenced by both the overall volume of alcohol consumption and the pattern of drinking (Rehm et al., 2010), nor (ii) the delivery of truly personalized information based on their specific drinking behavior. For example, although psychological, social, and medical consequences of hazardous alcohol drinking are often known (McCambridge, 2013), they often have low salience, especially for people with a focus on immediate rewards, such as individuals with substance use disorders. In addition, individuals with hazardous alcohol use tend to underestimate the short-term social consequences of their alcohol use (Grosso et al., 2013; Mallett et al., 2008; Oleski et al., 2010).
Developing a clinician-friendly risk calculator for important medical, psychological, and social risks associated with different patterns of consumption within a timeframe relevant to most drinkers could help care providers deliver more personalized and effective feedback (Dotson et al., 2015).

Prior research suggests that the dimensions of alcohol consumption assessed by the AUDIT-C (i.e., past-year drinking frequency, average number of drinks consumed when drinking alcohol, and heavy drinking frequency) are significantly associated with alcohol-related adverse outcomes, including AUD severity (Rubinsky et al., 2013), depression (Levola & Aalto, 2015), suicide attempt (Hoertel et al., 2018), post-operative complications (Bradley et al., 2011), trauma (Williams et al., 2012), and social consequences such as divorce/separation, revocation of driving license, or social isolation (Begg et al., 2017; Blanco et al., 2021, 2023; Franco et al., 2019; Hoertel et al., 2014a, b, c). However, these prior studies have not examined the predictive power of the AUDIT-C scores for these outcomes. Thus, the predictions may yield many false positives and negatives, particularly because the relationship between AUDIT-C scores and adverse outcomes may vary by sex (Levola & Aalto, 2015; Rubinsky et al., 2013) and age (Lapham et al., 2014). In addition, the optimal AUDIT-C cut-off score may vary by adverse outcome.

In this study, we present the development and testing of a risk calculator, using composite scoring systems, to predict several important incident alcohol-related adverse outcomes (i.e., alcohol use disorder, interpersonal relationship problems, withdrawal symptoms, legal problems, psychological problems, and the occurrence of tremors or seizures) among non-alcohol-dependent individuals consuming alcohol, using a longitudinal nationally representative sample, the National Epidemiologic Survey on Alcohol and Related Conditions (NESARC). Composite scoring systems allow combining information derived from several risk factors and aim at quantifying the risk for each subject (Coste et al., 1997). To simplify the scoring and facilitate its use, we scaled and rounded the regression coefficients in the final model to the nearest integer (Moons et al., 2002). By using a large national sample, we sought to obtain stable estimates that could be generalized beyond clinical samples.

Methods

Sample

Data were drawn from wave 1 and wave 2 of the NESARC, a nationally representative face-to-face survey of the US adult population, conducted in 2001–2002 (wave 1) and 2004–2005 (wave 2) by the National Institute on Alcohol Abuse and Alcoholism (NIAAA) with 43,093 participants (Grant et al., 2009). The target population included the civilian noninstitutionalized population, aged 18 years and older, residing in the USA. The overall response rate at wave 2 was 70.2%, resulting in 34,653 wave 2 interviews (Grant et al., 2009). The wave 2 NESARC data were weighted to adjust for non-response, demographic factors, and psychiatric diagnoses, to ensure that the wave 2 sample approximated the target population. The research protocol, including written informed consent procedures, received full human subjects review and approval from the US Census Bureau and the Office of Management and Budget (Canino et al., 1999).

This analysis includes the 16,710 participants who participated in both waves, had consumed alcohol during the year preceding the wave 1 interview, and did not have a lifetime history of alcohol use disorder (eFigure 1). We randomly split this sample of 16,710 participants into two samples: a development sample (N = 8355) and a validation sample (N = 8355).

Assessment of the Three-Year Risk of Incident Alcohol-Related Adverse Outcomes Between the Two Waves

Alcohol use disorder (AUD), i.e., alcohol abuse or dependence, was diagnosed using the Alcohol Use Disorder and Associated Disabilities Interview Schedule-DSM-IV-TR version (AUDADIS-IV), a structured diagnostic instrument administered by trained lay interviewers (Grant et al., 2009). The test–retest reliability and validity of the AUDADIS-IV measures of AUD were good (Cohen’s kappa = 0.74) (Canino et al., 1999; Grant et al., 2003). Interpersonal relationship problems, withdrawal symptoms, legal problems, psychological problems, and the occurrence of tremors or seizures were assessed in wave 2, as detailed in eTable 1. Only incident adverse outcomes between the two waves were considered in all analyses.

Predictors: the 3 AUDIT-C Items, Sex, and Age

AUDIT-C is a validated brief screening scale for identifying individuals with alcohol use disorder and is appropriate for routine screening in primary care (Bush, 1998; Rubinsky et al., 2013). The operationalization of the 3 items of the AUDIT-C (i.e., past-year drinking frequency, average drinks consumed when drinking alcohol, and heavy drinking frequency) is shown in eTable 2. To reduce the risk of multicollinearity across these 3 items, we collapsed a priori heavy drinking frequency into 3 categories: never, less than monthly, and at least monthly. Sex and age were self-reported. Age was categorized into 5 classes, i.e., 18–30 years, 31–40 years, 41–50 years, 51–64 years, and 65 years or more. We used the same set of predictors for each outcome (eTable 3).

Statistical Analyses

There were no significant between-sample differences in AUDIT-C items, sex, and age between the development sample (N = 8355) and the validation sample (N = 8355) (eTable 3).

Risk Calculator Development

First, we examined the bivariate relationships of the 3 items of AUDIT-C, sex, and age with the risk of developing each outcome in a 3-year follow-up period in the development sample. The risk calculator was then developed using stepwise regression modeling, combining forward and backward selection, to select significant (i.e., p value < 0.05) explanatory variables for each outcome. Next, we built composite scoring systems to combine information derived from these predictors and quantify the risks of alcohol-related incident adverse outcomes for each subject.

Discrimination of the risk calculator was examined for each outcome using Harrell’s C index, which reflects the proportion of all possible pairs of participants, one with the event and one without it, that are concordant (Steyerberg et al., 2001). For a binary outcome, the C index is equal to the area under the receiver operating characteristic (ROC) curve. A C index of 0.5 indicates that the model is no better than chance at predicting an outcome. A C index ≥ 0.7 indicates a good model and a C index ≥ 0.8 a very good model (Steyerberg et al., 2010). Calibration measures the agreement between observed risk and predicted risk. We assessed the calibration of the calculator using the Hosmer and Lemeshow (H–L) test (Hosmer et al., 1997). The H–L test assesses whether the observed event rates match the expected event rates in population subgroups. Models for which expected and observed event rates in subgroups are similar are called well calibrated. The H–L statistic is obtained by calculating the Pearson chi-square statistic from the 2 × g table of observed and expected frequencies, where g = 10 is the number of groups formed by deciles of risk. This statistic has an asymptotic χ² distribution with g − 2 degrees of freedom. P values ≤ 0.05 indicate poor calibration, whereas p values > 0.05 and > 0.5 indicate adequate and good calibration, respectively (Steyerberg et al., 2001).
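The two measures described above can be illustrated with a minimal sketch. It is not the code used in the study: it ignores the NESARC sampling weights, the function names `c_index` and `hosmer_lemeshow` are hypothetical, and predicted probabilities are assumed to lie strictly between 0 and 1.

```python
from itertools import product


def c_index(y_true, y_score):
    """Share of (event, non-event) pairs in which the event has the higher score.

    Tied scores count as half-concordant. Unweighted illustration only.
    """
    events = [s for y, s in zip(y_true, y_score) if y == 1]
    non_events = [s for y, s in zip(y_true, y_score) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e, n in product(events, non_events):
        if e > n:
            concordant += 1.0
        elif e == n:
            concordant += 0.5
    return concordant / (len(events) * len(non_events))


def hosmer_lemeshow(y_true, y_prob, g=10):
    """Return the H-L chi-square statistic and its degrees of freedom (g - 2).

    Participants are sorted by predicted risk and cut into g groups of
    (nearly) equal size; each group contributes two Pearson terms, one for
    events and one for non-events.
    """
    pairs = sorted(zip(y_prob, y_true))
    n = len(pairs)
    statistic = 0.0
    for k in range(g):
        group = pairs[k * n // g:(k + 1) * n // g]
        if not group:
            continue
        size = len(group)
        observed = sum(y for _, y in group)
        expected = sum(p for p, _ in group)
        statistic += (observed - expected) ** 2 / expected
        statistic += (observed - expected) ** 2 / (size - expected)
    return statistic, g - 2
```

The p value is then obtained by comparing the statistic with a chi-square distribution with g − 2 degrees of freedom, using any standard statistical library.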

To simplify the scoring and facilitate its use, especially in cases of limited access to computers or internet, the regression coefficients were scaled and rounded to integers on a range from 0 to 20 (Moons et al., 2002), so that risks can be calculated simply by hand. We subsequently checked that the scaled and rounded coefficients provided discrimination and calibration similar to those provided by the original coefficients. For each outcome, the risk calculator provided both the absolute risk and the risk ratio compared with individuals having the lowest alcohol consumption in wave 1.
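The scaling step can be sketched as follows. The exact scaling rule is not given in the text, so this sketch assumes that the largest coefficient is mapped to the maximum number of points and that the underlying model is logistic. The function names and the example coefficients are hypothetical and are not taken from the study.

```python
import math


def scale_to_points(coefficients, max_points=20):
    """Rescale coefficients so the largest maps to max_points, then round.

    Returns the integer points and the scaling factor (points per unit of
    log-odds), which is needed to convert a total score back to a risk.
    """
    largest = max(abs(b) for b in coefficients.values())
    if largest == 0:
        raise ValueError("all coefficients are zero")
    factor = max_points / largest
    points = {name: round(b * factor) for name, b in coefficients.items()}
    return points, factor


def predicted_risk(total_points, intercept, factor):
    """Absolute risk for a total score, assuming a logistic model."""
    linear_predictor = intercept + total_points / factor
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Hypothetical coefficients, for illustration only (not from the study)
example_coefficients = {
    "male_sex": 0.3,
    "age_18_30": 1.0,
    "heavy_drinking_monthly": 2.0,
}
points, factor = scale_to_points(example_coefficients)
```

Because rounding changes each coefficient slightly, discrimination and calibration need to be rechecked with the integer points, as described above.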

All analyses accounted for the complex sampling design of the NESARC (Lumley, 2004) and were performed with R software version 4.3.1 (R Core Team, 2022). We evaluated statistical significance using a two-sided design with alpha set a priori at 0.05.

Risk Calculator Validation

We applied the risk equations derived from the development sample to the validation sample and calculated C indices for discrimination and the H–L test for calibration.

We also examined whether the predictive value (i.e., C index and its standard error) of the risk calculator for each outcome was significantly better than that obtained when simply using the AUDIT-C global score with different binary thresholds as predictor in the validation sample (DeLong et al., 1988; Hanley & McNeil, 1982; R Core Team, 2022).
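As an illustration of how a C index and its standard error can be compared between two models, the sketch below uses the closed-form standard error of Hanley and McNeil (1982). It is not the DeLong procedure and is not necessarily the computation performed in the study. The function names are hypothetical, and the `correlation` argument, which accounts for both C indices being estimated on the same participants, is left to the user (0 corresponds to independent samples).

```python
import math


def hanley_mcneil_se(auc, n_events, n_non_events):
    """Standard error of an AUC (C index) from Hanley and McNeil (1982)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_events - 1) * (q1 - auc ** 2)
        + (n_non_events - 1) * (q2 - auc ** 2)
    ) / (n_events * n_non_events)
    return math.sqrt(variance)


def z_for_auc_difference(auc_a, se_a, auc_b, se_b, correlation=0.0):
    """Z statistic for the difference between two AUCs.

    `correlation` is the assumed correlation between the two AUC estimates.
    """
    variance = se_a ** 2 + se_b ** 2 - 2.0 * correlation * se_a * se_b
    return (auc_a - auc_b) / math.sqrt(variance)
```

The resulting Z statistic is compared with the standard normal distribution to obtain a two-sided p value.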

Results

Among the 8355 participants from the development sample who consumed alcohol in the year preceding the wave 1 interview and had no lifetime history of AUD, at the time of the wave 2 interview, 6.0% (N = 498) had an incident diagnosis of AUD, 2.1% (N = 177) reported incident interpersonal relationship problems, 4.7% (N = 393) incident withdrawal symptoms, 1.3% (N = 107) incident legal problems, 2.5% (N = 208) incident psychological problems, and 1.7% (N = 149) incident tremors or seizures during the 3-year follow-up period. AUDIT-C items, sex, and age were significantly associated with all outcomes, except sex with withdrawal symptoms and interpersonal problems, age with tremors or seizures, and heavy drinking frequency with alcohol use disorder (eTables 4 and 5).

The C indices and the H–L tests for the risk calculator of each outcome in the development sample are shown in Table 1, while ROC curves are shown in eFigure 2. The calculator demonstrated good to very good discriminant power, with C indices ranging from 0.756 (alcohol use disorder) to 0.857 (legal problems), and adequate to good calibration, with H–L test p values ranging from 0.153 (alcohol use disorder) to 0.678 (legal problems). There was no substantial loss of discrimination due to the use of scaled and rounded coefficients (Table 1). When applied to the validation sample (N = 8355), the calculator had good to very good discriminant power, with C indices ranging from 0.727 (withdrawal symptoms) to 0.872 (legal problems), and adequate to good calibration, with H–L test p values ranging from 0.072 (alcohol use disorder) to 0.679 (interpersonal problems) (Table 1).

Table 1 Model discrimination power and calibration in the development sample and in the validation sample

The predictive values of the risk calculator were significantly and substantially better than those of models including the AUDIT-C global score as predictor, whatever the binary threshold used (differences in C indices ranging from 0.042 to 0.217) (eTable 6). The only exception was tremors or seizures, for which the predictive value did not significantly differ from that of the model including the AUDIT-C global score with a threshold of ≥ 3 as predictor (Z (df) = 1.96 (3449); p = 0.050) for all patients.

The equations underlying the risk calculator for the 6 alcohol-related incident adverse outcomes examined at wave 2 are shown in Table 2 and Figs. 1 and 2. For example, a 20-year-old (4 points) man (1 point) with no lifetime history of AUD who drinks 2–4 times a month (2 points), has 3 or 4 drinks per day (3 points), and has 5 or more drinks in a single day at least once every month (7 points) has a total score of 17 points (Table 2; Fig. 2). Our results indicate that he would have an absolute risk of 8.9% of developing interpersonal relationship problems in a 3-year period (Fig. 1), representing an 89-fold higher risk than male drinkers having the lowest alcohol consumption (Fig. 2). Additionally, our results indicate that only 8.3% of the US general population of adult drinkers would have a greater risk of developing interpersonal relationship problems than this particular individual (Fig. 2).
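The arithmetic of this worked example can be checked with a few lines. The points and risks are those quoted in the text; the variable names are hypothetical, and the reference risk is only the value implied by dividing the reported absolute risk by the reported risk ratio, so it is approximate given rounding in the published figures.

```python
# Points quoted in the worked example (interpersonal relationship problems)
example_points = {
    "age 20 years": 4,
    "male sex": 1,
    "drinks 2-4 times a month": 2,
    "3 or 4 drinks per day": 3,
    "5 or more drinks in a day at least monthly": 7,
}
total_score = sum(example_points.values())

absolute_risk = 0.089  # 8.9% three-year risk, as reported
risk_ratio = 89        # 89-fold higher than the lowest-consumption group

# Risk implied for the reference group (an inference, not a reported value)
implied_reference_risk = absolute_risk / risk_ratio
```

The total of 17 points matches the score reported in the text, and the implied 3-year risk in the reference group is about 0.1%.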

Table 2 Score calculation based on the 3 AUDIT-C items, age and sex for the prediction of the 3-year risk of incident alcohol-related adverse outcomes

Fig. 1 Three-year risk of incident alcohol-related adverse outcomes according to individual scores

Fig. 2 Risk calculator of the 3-year risk of incident alcohol-related adverse outcomes

Discussion

In a large, nationally representative sample of adults without a lifetime history of alcohol use disorder and consuming alcohol, we developed and validated a risk calculator for the 3-year risk of 6 alcohol-related medical, psychological, and social consequences: incident alcohol use disorder, interpersonal relationship problems, withdrawal symptoms, legal problems, psychological problems, and tremors or seizures. The risk calculator was based on only 5 questions (i.e., past-year drinking frequency, average drinks consumed when drinking alcohol, frequency of heavy drinking, sex, and age) and allowed these risks to be quantified within a timeframe likely to be relevant to most drinkers (i.e., 3 years). The calculator demonstrated good to very good predictive values (C indices ranging from 0.727 to 0.872) and was well calibrated (all Hosmer and Lemeshow test p values ≥ 0.072) in the validation sample. The predictive values were significantly better than those of models including the AUDIT-C global score as predictor, whatever the binary threshold used, for all outcomes except “tremors or seizures”, and were in the same range as those of other risk calculators for psychiatric disorders (Birmaher et al., 2018; Cannon et al., 2016; Fusar-Poli et al., 2017), whose C indices range from 0.71 to 0.79.

In line with prior studies (Begg et al., 2017; Dawson et al., 2012; Levola & Aalto, 2015; McCambridge, 2013), we found that the 3 dimensions of alcohol consumption assessed by the AUDIT-C (i.e., past-year drinking frequency, average drinks consumed when drinking alcohol, and frequency of heavy drinking) were independently associated with all adverse outcomes examined. The only exceptions were for heavy drinking frequency, which was not significantly associated with incident alcohol use disorder, and for the average number of drinks consumed, which was not associated with legal and psychological problems. This finding might be explained by a ceiling effect given the strong effects of both the drinking frequency and the number of drinks consumed when drinking alcohol on the risk of adverse alcohol-related consequences, and, potentially, by insufficient statistical power due to the relatively limited number of incident adverse events in the 3-year follow-up period.

We found that younger age may increase risks of alcohol use disorder, interpersonal relationship problems, withdrawal symptoms, and legal problems beyond the effect of alcohol consumption. This result is in line with prior studies that suggest different magnitudes of effect of alcohol across different age groups (Begg et al., 2017; Denneson et al., 2011). In particular, prior studies have shown that younger drinkers are more likely to have adverse alcohol-related outcomes than older drinkers (Adams et al., 1990; Clemens et al., 2007; Molander et al., 2010). Interpersonal factors, such as pacts with friends about drinking and celebrations with influential peers, may negatively influence alcohol use at younger ages; these factors are associated with heavier alcohol use and more negative alcohol-related consequences (Brister et al., 2010; Patrick et al., 2011).

Our study has several important implications. From a clinical perspective, this 5-question risk calculator is clinician-friendly and could be easily automated or incorporated into the electronic medical record, or simply used by printing the charts attached in supplementary material, which are translated into several languages (Annex 2). Given that age and sex are already systematically recorded, and that the AUDIT-C scoring is recommended by most clinical guidelines, the 5 items of the calculator could be available without any additional effort from the clinician. The risk calculator may help promote early detection and treatment for high-risk individuals. By displaying the risks related to each pattern of consumption as well as the potential risk reduction that might be associated with alcohol consumption reduction, the calculator may provide an additional source of motivation for individuals with hazardous alcohol use, who often express a preference for reduction over abstinence (Aubin et al., 2015). From a public health perspective, an easy-to-use clinical tool may help policymakers design interventions to prevent alcohol use disorder and alcohol-related consequences in the general population (Birmaher et al., 2018; Cannon et al., 2016; Fusar-Poli et al., 2017).

This study has several limitations. First, alcohol consumption indicators and alcohol-related consequences were self-reported and may be subject to reporting and recall biases. Second, the risk equations have been validated in a randomly split subsample of the same database. Their validation in other general population samples, particularly in populations living outside the USA, is warranted to confirm their predictive values (Hoertel et al., 2014a, b, c). Finally, because we sought to build an easy-to-use clinical tool including only 5 items, our models do not capture other dimensions that can influence alcohol-related consequences, such as co-occurrence of other psychiatric disorders, access to mental health care, alcohol availability, stressful life events, and the protective role of social supports (Anderson & Baumberg, 2006; Glass et al., 1995).

In a large, nationally representative sample of drinkers, we developed an easy-to-use 5-question risk calculator with good to very good discrimination power to predict the 3-year risk of several important alcohol-related adverse outcomes, which can be calculated by hand using the attached chart (Fig. 2). To facilitate its use, we additionally translated it into 3 languages, i.e., French, Spanish, and Turkish (Appendix B). We hope that this risk calculator will be useful to identify, among adults without a lifetime history of alcohol use disorder and consuming alcohol, those at risk of developing alcohol-related adverse outcomes within a timeframe likely to be relevant to most drinkers (i.e., 3 years), to encourage them to cut down their drinking, and to facilitate the implementation of focused preventive interventions in primary healthcare settings.