1 Introduction

The Covid-19 pandemic has challenged both scholars and policymakers from various points of view. In the first emergency, health management focused on containing the pandemic through non-pharmaceutical interventions (NPI from here on). These interventions were effective and decisive in preventing the contagions, but unsustainable in the long period. In parallel with health emergency management, research has focused on developing vaccines and treatments against Covid-19.

In particular, with the safety and efficacy results of the first vaccines, the organizational plan for the vaccination rollout has begun. It soon became clear that the outcome of the vaccination campaign depended not only on the stocks that each country could secure but also on the attitude of the population towards vaccination and the policies inserted into place to facilitate the campaign.

In various countries, part of the population was eager to get their vaccine shot, to avoid the most dangerous outcomes of Covid-19 and slow down the spread of the virus. Some other people were concerned about the fast development of many effective vaccines and refuse optional or compulsory vaccinations, stating that vaccines are not helpful but dangerous for children and adults. These arguments can be particularly persuasive in echo chambers with similar political and socio-demographic leanings, especially when they’re based on fake news or misinterpretations of scientific results. Unfortunately, this type of disinformation could undermine the effectiveness of vaccination efforts.

With the spread of more transmissible and pathogenic variants than the original strain of Covid-19 (Alpha, Delta, and most recently Omicron), the time factor has become even more critical in limiting the spread of the disease and lowering the most severe outcomes. The focus was on avoiding severe consequences for the susceptible population, the elderly, residents of nursing homes, and essential workers, such as healthcare personnel. In addition, with the broader availability of vaccines, compliance with the vaccination campaign has become a relevant theme of public health policy.

Many governments have promoted several initiatives to persuade those who are hesitant to receive the vaccination, including monetary incentives for vaccination, and limitations to public life. For instance, in several European states certification of vaccination was required to travel by plane or train, to go to a restaurant or gym, or even to work. The governments of Austria, Greece, and Italy implemented vaccination mandates in 2021, but they were restricted to the most vulnerable population segments. Other administrations use monetary incentives to foster vaccination: fixed-sum incentives (e.g.:New York City and Pennsylvania) or monetary lotteries.

This paper focuses on evaluating policies implemented by nineteen US states, which have promoted monetary incentives for vaccination, in the form of lotteries for those vaccinated against Covid-19. The first state to announce this type of policy was Ohio on May 12, 2021, launching the “Vax-a-million” initiative to combat low vaccination levels in the state. Ohio’s policy attracted the attention of policymakers in other states, who followed in the subsequent weeks the Ohio example, giving away monetary prizes to vaccinated. On July 21, 2021, in total eighteen states followed Ohio’s example. All except one publicize the policy by July 1, 2021. Even if, in principle, policymakers design monetary incentives to help the vaccination rollout, in this specific case, The results are uncertain beforehand. We could expect a positive effect from the monetary incentives, coherent with the literature ( Campos-Mercade et al. 2021). Nevertheless, Scepticism towards the safety of fast-developing vaccines, and efficacy doubts, can be enhanced by this kind of public intervention, harming the trust in government ( Latkin et al. 2021; Lazarus et al. 2021). Hesitant citizens may value avoiding the perceived risk connected to the vaccine more than the probability of winning a lottery prize, e.g.: (Sprengholz et al. 2021).

It is crucial to analyze the outcome of such policies and what drivers are more tightly related to their impacts. Facing this context, assessing the causal impact of nudging toward vaccines is not trivial and it might be heterogeneous with respect to a variety of socio-economic and behavioural factors (Dubé et al. 2015; Savoia et al. 2021; Quinn et al. 2016; Reiter et al. 2020).

Several papers have investigated the role of incentives in Covid-19 vaccination. Some of them used data from US states (Walkey et al. 2021; Barber and West 2021; AB 2021), but none of them, to the best of our knowledge, have investigated the county level, addressing in-states differences in vaccination rollout. While the effect of Ohio’s program has been studied, there are little to no comparative analyses between US states. There is theoretical ground to suspect that different characteristics correspond to different treatment outcomes.

We contribute to the literature by assessing the impact of conditional cash lotteries in a disaggregated framework in staggered adoption of the policy. We also study the duration of the effect, to exploit whether the treatment impact was temporary or persistent. We also investigate the heterogeneity of treatment effects across the counties. Our goal is to identify the socio-demographic characteristics of counties that performed better or worse. This work contributes to the methodological literature on the synthetic control method by providing estimates of weighted aggregate effects and proposing an inferential procedure for such effects.

We develop the paper as follows: the relevant literature and the context we wish to evaluate are presented in Sect. 2. We describe data collection in Sect. 3, and the causal inference approach in Sect. 4. Results are shown and discussed in Sects. 5 and 6 concludes.

2 Related literature

Vaccine hesitancy is a known issue in vaccination rollouts, even before the Covid-19 pandemic, as it was observed in vaccine rollout against measles, HPV, and seasonal influenza, see for a review Dubé et al. (2013).

Several health policy interventions over the years have been implemented in order to tackle the concerning trend of reduction of vaccination uptake among children, and in particular, to address parental vaccine hesitancy (Gowda and Dempsey 2013; Williams 2014).

In most previous studies, scholars have posed attention to those socio-economic drivers that can explain the variety in vaccination uptakes; see Jarrett et al. (2015) for a comprehensive review.

In particular, Robertson et al. (2021), Razai et al. (2021), Willis et al. (2021), Quinn et al. (2016), Reiter et al. (2020), focuses on the relationship between ethnicity and vaccination uptakes, Badr et al. (2021) and Azizi et al. (2017) shed light on the relation between poverty and unemployment and vaccines, interestingly, before and during the Covid-19 pandemic. Bertoncello et al. (2020) find an inverse correlation between the parental level of education and vaccine hesitancy and anti-vaccine sentiment, irrespective of whether children are involved. Malik et al. (2020), Marks (2020) and Joshi et al. (2021) investigated the socio-demographic composition of individuals willing to comply with the US Covid-19 vaccination campaign and find out significant differences in ethnicity, gender, and age groups. Dubé et al. (2014) highlighted another crucial aspect: vaccine hesitancy is not a fully generalized concept but has several different drivers across different countries.

The information source plays a role in determining the attitude towards vaccination: e.g.: Featherstone et al. (2019), Engin and Vezzoni (2020) and Mønsted and Lehmann (2022) find out that vaccine conspiracy belief spreads out on social media, especially among those who express conservative political thought. We can find similar results in Covid-19 vaccine rollout analyses about the US and UK, see for example (Loomba et al. 2021).

The effects of these different drivers were heterogeneous in the US. In some counties, the Center for Disease Control (CDC) considers the vaccination rollout concerning, especially in the Sunbelt and in the Great Plains, with possible negative effects also on the Covid-19 cases count and on the related deaths.

Interesting literature flourished among those incentives for vaccination and health policy interventions that should direct the general population towards health-policy goals, such as reducing smoking, obesity, and alcohol drinking. Gorin and Schmidt (2015) studies the relationship between the outcome of the policy and the public discussion generated. Persad and Emanuel (2021), Korn et al. (2020), Weisel et al. (2021), and Dotlic et al. (2021) argue about the legitimacy of such kind of monetary interventions in the case of Covid-19, considering the effects in terms of social responsibility. Campos-Mercade et al. (2021), Kim (2021), Jecker (2021) and Taber et al. (2021) discuss results coming from monetary incentives for Covid-19 vaccination, both in form of lotteries and fixed-sum transfers. We note that there is no agreement in the literature either on the legitimacy of monetary incentives for vaccination or on the actual results of such policies, since such incentives may not affect vaccination choices. Governments often used monetary incentives before restricting activities in the absence of a vaccination certificate.

3 Data

We collect information on 2925 counties located in 47 US states. We exclude Alaska, Hawaii, and Puerto Rico from the analysis because of their unique characteristics regarding the continental US. We exclude Texas because the primary outcome was not collected at the county level. We also exclude Missouri from the analysis, because of its late lottery announcement, on July 21,2021. Apart from these five states, we drop counties lacking observations. In particular, we excluded counties that do not report vaccinations at the end of the period, on August 24, 2021, and those with no consistent results for the cumulative number of vaccinations (e.g., decreasing cumulative vaccinations in some time periods).

The primary outcome of our analysis is the share of over 18 citizens vaccinated against Covid-19. The Center for Disease Control (CDC) provides this information on this variable. We chose this measure because it is the most responsive to incentives and it is measured when citizens adhere to the campaign. The percentage of the population fully vaccinated shows a delay between the adherence to the policy and the measurement of the outcome because of time elapsing between the first dose (adherence to the campaign) and the second dose. We are concentrating on the population aged 18 and over because, at the outset of the treatment, the authorization for vaccinating individuals aged 12 and above had only been granted relatively recently (May 10). As a result, there may have been hesitancy towards getting vaccinated, in addition to the vaccine hesitancy that has been measured by the CDC. The primary outcome is the only observed measure over time, from January 1st, 2021, to August 24th, 2021. We focus on this time period because after August 24th the Pfizer-Biontech vaccine received full approval from the Food & Drug Administration (FDA hereinafter). On the basis of this approval, the US government announced compulsory vaccination for military troops. We summarize daily data into weekly data calculating the 7-day moving average of the share of people vaccinated with the first dose. We prefer to work with this outcome rather than daily data because daily calendar effects or temporary delays in vaccination reports could harm the analysis.

Table 1 shows some descriptive statistics of the main outcome in three relevant time periods:

  • May 12: Announcement of the first lottery in Ohio.

  • Jul. 01: Announcement of the last lottery in Michigan.

  • Aug. 24: Compulsory vaccination for military staff, end of the observation period

Table 2 in the e-appendix shows the share of first-dose receivers for each state at these three time periods.

Table 1 Descriptive statistics for the percentage of first-dose receivers

Figure 1 represents the time series of the share of people vaccinated with the first dose.

Fig. 1
figure 1

% of first-dose receivers in the over 18 population - First vertical line: announcement of first lottery (Ohio), Second vertical line: announcement of last lottery (Michigan)

We enrich the dataset with information on various socio-demographic, political, economic, and environmental characteristics. In particular, our dataset includes information on the demographic composition, political orientation, and economic indicators (e.g.: percentage of unemployed and median income by county). The US Census Bureau collects and provides socio-demographic data for 2020. The CDC provides data on the percentage of people insured with Medicare. Economic indicators are derived from the work of Kirkegaard (2016), updated to 2020. The New York Times provide data regarding voting in the 2020 presidential election.

We also collect from CDC the total number of Covid-19 related deaths. This dimension could be a relevant effect modifier because the proportion of vaccinated might be higher in counties where happened relevant outbreaks. Table 2 shows some descriptive statistics for the variables used in the study.

Table 2 Descriptive statistics of socio-demographic characteristics of US counties

4 Methodology

4.1 Notation and Setting

We consider the lottery policy as a causal inference problem in which some counties receive the active treatment (the lottery) starting from some period \(t_0\), varying across different units, and other counties did not receive any monetary stimulus to take part in the vaccination rollout.

We consider a panel data setting, in which the total set \(\Omega\) of observed units consists in \(|\Omega |=2925\) US counties, observed for \(t \in T = \{0,\dots , t_{0}, \dots , t_{T}\}\), \(|T|=34\) from January 1, 2021, to August 24, 2021. Let \(\Omega ^1\) be the entire set of units in which the lotteries were active at some time t. Let \(\Omega ^0\) be the set of units that experiences no form of monetary incentives at any time t.

In total, \(|\Omega ^1| =1134\) counties have experienced a vaccine lottery in the period considered, and \(|\Omega ^0| = 1791\) counties have received no kind of monetary incentives for vaccination. Let \({\textbf {N}}^1=\{1, 2, \dots , {n^1}\} \subseteq \Omega ^1\) denote a generic subset of treated counties, and similarly let \({\textbf{N}}^0=\{n^1+1, n^1+2, \dots , n^1+n^0\} \subseteq \Omega ^0\) denote a generic subset of control counties. Finally, we denote as \(Y_{i,t}\) the primary observed outcome for a generic unit i by time t, which is the percentage of residents who received the first dose of vaccine.

4.1.1 Treatment uptake

Treatment was allocated at the state level, therefore counties belonging to the same state experienced the same treatment, in terms of duration and prize amount.

We denote the treatment indicator for unit i at time t by \(D_{i,t}\)

$$\begin{aligned} D_{i,t}= {\left\{ \begin{array}{ll} 1 \quad \text { if county }\, i \text { is receiving the treatment at time}\, t\\ 0 \quad \text {otherwise}\\ \end{array}\right. } \end{aligned}$$
(1)

Note that if \(D_{i,t}=1\), then \(D_{i,t'}=1\), if \(t'>t, t\in T\).

Following this specification, we can construct a treatment matrix D:

$$\begin{aligned} {\textbf {D}} = \begin{bmatrix} t &{} 1 &{} 2 &{} \dots &{} {n^1} &{} n^1+1 &{} n^1+2 &{} \dots &{} n^1+n^0\\ 0 &{} 0 &{} 0 &{} \dots &{} 0 &{} 0 &{} 0 &{} \dots &{} 0\\ 1 &{} 0 &{} 0 &{} \dots &{} 0 &{} 0 &{} 0 &{} \dots &{} 0\\ \dots &{} \dots &{} \dots &{} \dots &{} \dots &{} \dots &{} \dots &{} \dots &{} \dots \\ t_{0}^1 &{} 1 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} \dots &{} 0\\ t_{0}^i &{} 1 &{} 1 &{} 0 &{} 0 &{} 0 &{} 0 &{} \dots &{} 0\\ \dots &{} \dots &{} \dots &{} \dots &{} \dots &{} \dots &{} \dots &{} \dots &{} \dots \\ {T} &{} 1 &{} 1 &{} 1 &{} 1 &{} 0 &{} 0 &{} \dots &{} 0 \\ \end{bmatrix} \end{aligned}$$

In the literature, this treatment framework is called staggered adoption ( Athey and Imbens 2021; Ben-Michael et al. 2022; Callaway and Sant’Anna 2020). In this scenario, the focus is on evaluating treatment effects started at different times. In particular, we suppose no unit receives the treatment before \(t_0^1\). So, \(t_{0}^1\) is the time in which the first unit receives the treatment. In our case, \(t_0^1\) is May 12, in which the governor of Ohio announced the “Vax-a-Million” initiative. Treated counties \(i \in n^{1}\) can receive treatment at any time \(t_0^{i}\ge t_0^1\). In particular, we define \(t_0^i\) as the period in which unit i receives the treatment. We found it useful to establish a notation for a unit that will or will not receive treatment at any time t: \({\textbf{D}}_{i}=1\) if \(D_{i,t} \ne 0\) for some t, and \({\textbf{D}}_{i}=0\) if \(D_{i,t}= 0 \quad \forall t\).

Fig. 2
figure 2

Month in which states announced the lottery program

Nineteen states have announced a vaccine lottery to improve the vaccine rollout (Ohio, Oregon, Washington, California, Nevada, New Mexico, Louisiana, North Carolina, West Virginia, Maine, Kentucky, Michigan, New York, Illinois, Missouri, Arkansas, Colorado, Delaware, Maryland). We report the beginning time of the lotteries and their duration in Figs. 2 and 3, see Table A1 in the e-appendix for further details.

Fig. 3
figure 3

Pre-treatment, treatment and post-treatment duration for each state, starting from January 1st. First vertical line: Ohio lottery announcement, Second vertical line: Michigan lottery announcement

4.1.2 Potential outcomes

We use a potential outcome approach to causal inference (Rubin 1974, 1978). For each unit i in each period t we define the following couple of potential outcomes:

$$\begin{aligned} {\left\{ \begin{array}{ll} Y_{i,t}(1) \equiv Y_{i,t}(D_{i,t}=1) \quad \text {as the potential outcome under treatment}\\ Y_{i,t}(0) \equiv Y_{i,t}(D_{i,t}=0) \quad \text {as the potential outcome under control} \end{array}\right. } \end{aligned}$$
(2)

This definition implicitly assumes the two following assumptions: the Stable Unit Treatment Value Assumption (SUTVA, Rubin 1980) and the no-anticipating treatment assumption.

Assumption 1

SUTVA, Rubin 1974

  • No hidden version of the treatment

  • No interference between units

Both aspects of SUTVA deserve some more commentary. The no-hidden version of the treatment component of SUTVA may be arguable in situations in which we consider treated counties belonging to different states. In such situations, both the award amount and the probability of winning may differ. We assume these differences are not relevant for determining whether the treatment causes an effect on the outcome of interest. This assumption appears to be credible because it is unlikely that the population applies a quantitative assessment of the cost-benefit ratio associated with vaccination in monetary terms. Consequently, all lotteries are similar for the receiving population. Taber et al. (2021) investigates this aspect, finding no differences in participation choices across twelve different lottery structures, supporting our assumption.

The non-interference assumption states that the potential outcome for any unit does not vary with the treatments assigned to other units. We believe that the non-interference assumption may be valid in our context. For a practical example, the lottery treatment of a county in Ohio should not change the vaccination campaign adherence of an untreated county, such as a county in Wisconsin.

The non-anticipating treatment assumptions require that if a county has not adopted yet the policy, the future adoption has no causal effect on potential outcomes for the current period. In theory, people could have changed their behaviour after the lottery announcement, delaying the vaccine administration to get the lottery ticket. In our study, this behaviour is not plausible because states allow taking part in the lottery even though they had already received the first dose. In addition, the short time between the announcement of the lottery and the start of the program does not allow for noticeable treatment anticipation phenomena. Finally, this policy spreads faster among the US, with most treated states announcing the program within 45 days after the original Ohio governor’s announcement.

Under SUTVA (assumption 1) and no-anticipating treatment, we have \(Y_{i,t}(0)\) for \(t \in \{0, t_0^1\}\) and \(Y_{i,t}(D_{i,t})\) for \(t \in \{t_0^1, t_T\}\).

4.2 Causal estimands

We introduce the following individual causal estimand.

$$\begin{aligned} \Delta _{i,t} = Y_{i,t}(1) - Y_{i,t}(0) \end{aligned}$$
(3)

for each county \(i \in {\textbf{N}}^{1}\) receiving the treatment by time t. For treated units \(Y_{i,t}(1)=Y_{i,t}\) and thus we compare the observed outcome \(Y_{i,t}(1)\) for the treated unit, with its counterfactual outcome \(Y_{i,t}(0)\).

We are also interested in causal effects for specific states \({\textbf{S}}\). We define it as a weighted average of the county effects \(\Delta _{i,t}\), multiplied by an appropriate weight \(\eta _i\). In our case, we chose \(\eta _i\) as the ratio of the population of the i-th county to the total population of the state \({\textbf{S}}\). Thus we define the treated set as \({\textbf{N}}^1 \equiv {\textbf{S}}\).

Formally:

$$\begin{aligned} \Delta _{{\textbf{S}},t}= \sum _{i \in {\textbf{S}}} \eta _i \Delta _{i,t} \qquad \eta _i \in (0,1) \end{aligned}$$
(4)

We can define causal effects at the state level (e.g.: pooling counties from California), but also at the supra-state level, by pooling counties from different states (e.g., pooling counties from West Coast). Therefore, the average effect on the supra-state aggregation will result as:

$$\begin{aligned} \Delta _{{\textbf{N}}^1,t}= \sum _{i \in {\textbf{N}}^1} \eta _i \Delta _{i,t} \qquad \eta _i \in (0,1) \end{aligned}$$
(5)

In this latter case, the treated set \({\textbf{N}}^1\), corresponds to the union of counties belonging to the states that we want to pool together.

Estimation of equation 5 at the supra-state level, in presence of staggered adoption, could be a little problematic. For example, consider three US states: Ohio, California and North Carolina. We observe three different periods for the treatment regime and the post-treatment regime. Table 3 reports their duration, in weeks. Naturally, we estimate individual treatment effects in every calendar time t, starting from the first week of 2021. Therefore, week 19 will be the first treatment period for Ohio, week 21 will be the first for California, and week 23 will be the first treatment period for North Carolina. To pool together treatment effects, under staggered adoption, we should sum them up with respect to the treatment assignment: to get the pooled effect for the first week after the treatment assignment we should calculate the estimand in equation 5 when \(t=t_{0}^i \quad \forall i \in {\textbf{N}}^1\). Consequently, to get the effect in the second treatment week we should sum treatment effects when \(t=t_{0}^i+1\) and so on. Note that \(\Delta _{{\textbf{N}}^1,t}\) is the weighted average of \(\Delta _{i,t}\) for each treated unit \(i \in {\textbf{N}}^1\). Therefore, it is defined only for time spells \(t_0^i - t \ge 0 \quad \forall i \in {\textbf{N}}^1\). In our example, we cannot have more than six weeks of treatment, which is the minimum amount of treatment common to the three states.

Figure 4 shows the re-alignment of periods regarding calendar criterion and regarding the treatment assignment.

Table 3 Duration in weeks of pre-treatment, treatment and post-treatment periods, for three example states, starting from the first week of 2021
Fig. 4
figure 4

Pre-Treatment and treatment period according to calendar weeks criterion and with respect to the treatment, periods within the vertical lines are the overlapping period for this set of states

Last, we define the average effect for the treated set \({\textbf{N}}^1\), over the period \((t_1,t_2)\), as

$$\begin{aligned} \Delta _{{\textbf{N}}^{1}}= \frac{1}{t_2-t_1}\sum _{t=t_{1}}^{t_2} \Delta _{{\textbf{N}}^{1},t} \end{aligned}$$
(6)

The choice of the period \((t_1,t_2)\) allows us to distinguish between the phase in which the lottery was active and the subsequent phase by averaging \(\Delta _{{\textbf{N}}^{1},t}\), during treatment, or post-treatment, periods. Note that equation 6 can be used to estimate both state or supra-state effects over time.

Post-treatment analysis helps us understand if the effect of the lottery is temporary or permanent. Indeed, it is likely that the lottery helps to convince latecomers to vaccinate. Thus, lotteries could speed up vaccination but not increase the number of vaccinated patients compared with controls at a more distant endpoint. Conversely, if we observed a permanent increase in the number of vaccinated, we could conclude that lotteries affect those who delayed vaccination and those who had no intention of vaccinating. Assessing the durability of the treatment effects over time may provide useful insights for policymakers.

4.3 Penalized SCM

This section explains how we impute missing potential outcomes.

With repeated observations over time, and many units both under treatment and under control, various tools are available to assess the effect of the policy.

We choose to estimate the causal quantity in equation 3 with a modification of the Synthetic Control Method, first introduced by Abadie et al. (2010), which is getting growing success in the causal inference community. We estimate causal effects by imputing potential outcomes, \(Y_{i,t}(0)\), namely the outcomes that the treated unit i would have been if it had never received the treatment, constructing a weighted mean of control units in the donor pool. This weighted average is called synthetic control. The weights are chosen so that the synthetic control for a treated unit i is very close to the treated unit during the pre-treatment period. This method was extended to allow the estimation of average treatment effects, also in staggered adoption contexts (Dube and Zipperer 2015; Donohue et al. 2019; Ben-Michael et al. 2022), where the focus is on estimating the treatment effects for each treated unit i and pooling them together.

Among the recent developments of the original estimator (see for example (Abadie 2021; Ben-Michael et al. 2021; Doudchenko and Imbens 2016)), we choose to adopt the novel method developed by Abadie and L’Hour (2021), the so-called Penalized Synthetic Control Method (P-SCM).

The penalized control method imputes the missing control outcomes \(Y_{i,t}(0)\) for \(i \in {\textbf{N}}^1\) as equation 7, where the weights are derived solving the

following minimization problem equation 8. For notation simplicity, let \(i=\{1,2,\dots ,n^1\}\) be the index for treated units and let be \(j=\{n^1+1,n^1+2,\dots ,n^0+n^1\}\) the index for control units. We define the synthetic control for each \(i \in \mathbf {N^1}\) as:

$$\begin{aligned} {\widehat{Y}}_{i,t}(0) = \sum _{j \in {\textbf{N}}^0} \omega ^{(i)}_{j} Y_{j,t} \qquad t \in (t_{0}^i, \ldots , T) \end{aligned}$$
(7)

For notation simplicity, let \(t_0=t_0^i\) and \(t_T^i=t_T\) for each unit \(i \in {\textbf{N}}^1\).

Let \({\textbf{X}}_{i}=\left[ Y_{i,0}, \dots , Y_{i,t_0} \right] '\) be a \(t_0-\)dimensional vector of pre-treatment outcomes, for each treated unit \(i \in {\textbf{N}}^1\), and let

\({\textbf{X}}_{j}=(\left[ Y_{n^1+1,0}, \dots , Y_{n^1+1,t_0} \right] ', \dots , \left[ Y_{j,0}, \dots , Y_{j,t_0} \right] ', \dots , \left[ Y_{n^1+n^0,0}, \dots , Y_{n^1+n^0,t_0} \right] ')\)

be a \((t_0) \times |{\textbf{N}}^0|-\)dimensional matrix of pre-treatment outcomes, for the set of control units \(j \in {\textbf{N}}^0\)

Given a positive penalization constant \(\lambda ^{(i)}\), \(i \in n^{1}\), the set of weights

$$\begin{aligned} \varvec{\omega }^{(i)} = \{\omega ^{(i)}_{j}\}_{j \in {\textbf{N}}^0} \end{aligned}$$

defines the penalized synthetic control unit for treated unit i. The set of weights \(\varvec{\omega }^{(i)}\) is obtained by solving the following minimization problem.

$$\begin{aligned} \arg \min _{\varvec{\omega }^{(i)} \in \varvec{W}} \left\| {\textbf{X}}_{i} - \sum _{j \in {\textbf{N}}^0}{\textbf{X}}_{j}\omega _{j}^{(i)}\right\| ^2 + \lambda ^{(i)} \sum _{j\in {\textbf{N}}^0} \left\| {\textbf{X}}_{i} - {\textbf{X}}_{j} \right\| ^2 \end{aligned}$$
(8)

subject to

$$\begin{aligned} \omega ^{(i)}_{j}\ge 0 \quad \forall j \in {\textbf{N}}^0;\qquad \sum _{j \in {\textbf{N}}^0} \omega ^{(i)}_{j}=1, \end{aligned}$$

with \(\Vert n \Vert\) is the \(L^2-\)norm: \(\Vert \textbf{v} \Vert = \sqrt{\textbf{v}' \textbf{v}}\) for \(\textbf{v} \in {\mathbb {R}}^r\)

We choose the P-SCM estimator because it is specifically designed for estimating average treatment effects with disaggregated treated units. Moreover, it grants us the uniqueness of the weights. Abadie and L’Hour (2021) and Abadie (2021) present the Technical details of the estimator and its use. As discussed in Athey et al. (2018), in observational studies where both N and T are large, the choice of which method to use is not straightforward, and P-SCM may allow a data-driven solution to this problem. In fact, when \(\lambda ^{(i)} \rightarrow 0\), the P-SCM collapses into the standard SCM, while when \(\lambda ^{(i)} \rightarrow \infty\) the P-SCM is equivalent to the nearest-neighbour matching estimator. P-SCM enables us to maintain an agnostic behaviour towards the choice between matching methods and SCM.

We chose the tuning parameter \(\lambda ^{(i)}\) by using the weighted cross-validation approach, proposed by Abadie and L’Hour (2021). We define the function \(\Phi (\lambda )\), exposed in Eq. 10, which minimizes the overall Root Mean Square Prediction Error (RMSPE, defined in equation 9) of the donor pool for a given of \(\lambda\) in the treatment period.

$$\begin{aligned} \Re (\lambda , t_T-t_{0} )=\sqrt{\dfrac{1}{(t_T-t_{0} )}\sum _{t={t_0}}^{t_T} \left[ Y_{j,t}-{\widehat{Y}}_{j,t}(\lambda )\right] ^2} \end{aligned}$$
(9)

The tuning parameter \(\lambda ^*\) is chosen as follows:

$$\lambda ^{*} = \arg \min _{{\lambda \in \Lambda }} \Phi (\lambda ) = \sum\limits_{{j = 1}}^{{n^{0} }} {\Re \left( {\lambda ,t_{T} - t_{0} } \right)}$$
(10)

Table A4 of the e-appendix reports values of lambda used in the estimation.

4.3.1 Donor pool definition

(Abadie 2021) and Abadie et al. (2015) pointed out the necessity of a proper donor pool to estimate synthetic control weights. In particular, they suggest to subset the total control set \(\Omega ^0\) in a set that has similar characteristics regarding the treated set. Usually, donor pool selection is driven by the researcher’s experience of the characteristics of the treated units (e.g.: Abadie et al. (2015)).

In this work, we search for the proper donor pool for each treated set \({\textbf{N}}^1\) by using a data-driven procedure. We choose to select the donor pool by using a matching approach based on propensity scores, first introduced by Rosenbaum and Rubin (1983). Let

$$\begin{aligned} \pi _i=\pi _i({\textbf{C}})=Pr({\textbf{D}}_i=1|{\textbf{C}}) \end{aligned}$$
(11)

denote the propensity score, where \(\textbf{C}\) is a set of pre-treatment covariates.

We estimate the propensity score using a logistic regression model on a set of covariates describing socio-economic, political, ethnic, and demographic characteristics (see Table 2 for the complete list). Once we have estimated \(\pi =\{\pi _j\}_{j \in {\textbf{N}}^0}\) for the control units, we select the donor pool by performing a one-to-many matching, with 4 control units for each treated unit. Thus following, \({\textbf{N}}^0\) denotes the subset of matched control units. Table A3 in the e-appendix shows two-sample t-tests to evaluate the similarity between covariates of the treated pool and donor pool for each state. Such a procedure for donor pool restriction speeds up weight computations. Moreover, it allows to build the synthetic control of the treated unit from similar control units, without relevant losses in terms of prediction error, see also (Abadie et al. 2010).

4.3.2 Inference

We conduct inference using falsification tests, also named “placebo studies,” which can be viewed as a type of randomization inference.

A placebo test is constructed by reassigning the treatment to control units in the donor pool and estimating the so-called placebo effects. Then, the estimated treatment effect is compared with the distribution of placebo treatment effects. A treatment effect is considered statistically significant when the magnitude of the effect is large with respect to the distribution of the placebo effects. See (Abadie et al. 2010) and (Abadie et al. 2015) for details. (Firpo and Possebom 2018) extend the original inference to allow for different weights for placebo units, but pointed out that in real applications it is difficult to find these weights.

Building on the aforementioned works and the work by Cavallo et al. (2013), we use an algorithm to conduct inference with SCM and aggregate units.

We focus on assessing the significance of the aggregate \(\Delta _{{\textbf{N}}^1}\) effects. Therefore, we do not estimate p-values or confidence intervals for individual effects for each i-th county. We neither estimate confidence intervals for each time-specific effect \(\Delta _{{\textbf{N}}^1,t}\). We define the test statistics as

$$\begin{aligned} \Theta =\frac{{\Re }(t_{0},t_T)}{\Re (0,t_{0})} \end{aligned}$$
(12)

where \(\Re (t_1,t_2)\) is the root-mean square prediction error of the estimated \(\Delta _{{\textbf{N}}^1}\) calculated between \(t_1\) and \(t_2\), as in equation 6.

Let \(\Delta _{{\textbf{N}}^0}=\{\Delta _1, \dots , \Delta _j, \dots , \Delta _{J}\}\) be the set of aggregate placebo effects, created by randomly sampling placebo effects from units in \({\textbf{N}}^0\) and pooling them together to find each aggregate placebo effect \(\Delta _j\). In our analysis, we choose \(J=1000\) and we pooled \(|{\textbf{N}}_1|\) control units for each placebo state, where \(|{\textbf {A}}|\) is the cardinality of set \({\textbf {A}}\). Let \(\Theta _{{\textbf{N}}^0}=\{\Theta _1, \dots , \Theta _{j}, \dots , \Theta _{J}\}\) be the set of placebo \(\Theta _j\) calculated using equation 12 on the set of placebo effects \(\Delta _{{\textbf{N}}^0}\).

We can define the p-value as:

$$\begin{aligned} \rho _{{\textbf{N}}^1}=\frac{1}{J}{\sum _{j \in J}} \pmb {\mathbb {I}}(\Theta _{j}>\Theta _{{\textbf{N}}^1}) \end{aligned}$$
(13)

In principle, if the effect \(\Delta _{{\textbf{N}}^1}\) is large relative to the distribution of placebo effects \(\Delta _{{\textbf{N}}^0}\), there should be very few \(\Theta _{J}>\Theta _{{\textbf{N}}^1}\), and we can consider the effect statistically significant. As stated by Abadie et al. (2010), using \(\Theta\) as test statistics instead of \(\Delta\) allows us to compare treatment units with control units, even in presence of imperfect pre-treatment fit. Algorithm 1 shows the procedure to calculate \(\rho _{{\textbf{N}}^1}\).

figure a

4.4 Assessing treatment heterogeneity

After the first vaccine lottery announcement in Ohio, several states introduced their own lottery, creating policy mimicking. There is some consensus on the positive results of conditional cash lotteries in Ohio (Barber and West 2021; Acharya and Dhakal 2021); but very early evaluations from other states (e.g.: Arkansas) head in different directions, leading the policymakers to interrupt the policy.

These results suggest that treatment effects could have been heterogeneous both across states and across counties. We study treatment effect heterogeneity in order to provide helpful insights for future decisions.

We classified treated counties according to some socio-demographic, economic, and cultural characteristics. Table 1 shows the dimensions used in this analysis. To classify treated counties, we follow a clustering approach with Gaussian mixture models (see Fraley and Raftery 2002; McLachlan et al. 2019). We choose the number of clusters that minimize the selection model’s BIC for membership in each group of counties. According to the set of covariates \({\textbf{C}}\), we found six clusters of counties. Table 4 reports the mean values of the covariates in each cluster.

Table 4 Mean value of the covariates in each cluster

As one can see in Table 4, clusters 3 and 5 have a very different ethnic composition compared to the others. Specifically, in cluster 5, the share of the Afro-American population is the highest among cluster, and cluster 3 represents the one with the highest percentage of Hispanics. Cluster 2 has the richest, and the most educated population on average and the lowest proportion of people voting for the Republican Party. Clusters 1 and 6 are distinguished by their high percentage of votes for Republicans. Cluster 4 has the highest median age, and the highest proportion of citizens ensured with Medicare.

5 Results

5.1 Causal effects

This section will present the results of our matching+PSCM approach to estimate causal effects.

Figure 5 shows county-level average treatment effects on the treated (ATT) during the vaccination lottery and after the vaccination lottery. During the treatment period, we find positive effects, in the northwest area (Oregon, Washington, and some areas of California), in the Midwest (Ohio, Kentucky, Illinois, and West Virginia), and on the East Coast (New York, Maryland). In the other areas, the effects of the lotteries are minor, and we find small or adverse effects, in the Southern states. County treatment effects appear to be heterogeneous within the same state and between different states, we focus on this analysis in Sect. 5.3.

Fig. 5
figure 5

Intertemporal effect \(\Delta _{{\textbf{S}},t}\) - solid line: estimated effect. First vertical line: lottery announcement for the treated state, second vertical line: end of lottery

We now focus on causal effects at state-level, estimated using equation 6. Table 5 shows treatment effects at state-level, and their p-values, while figure 6 shows the time series of the treatment effects at the state level.

Fig. 6
figure 6

Boxplots of county-level effects \(\Delta _i\), calculated according equation 6, classified by cluster membership

Table 5 State-level treatment effects on the share of over-18 vaccinated population

We can evaluate the goodness of fit from synthetic control estimates using pre-treatment RMSPE. In addition, we use visual inspection of the pre-treatment difference between treated and synthetic control as in Abadie et al. (2010). Colorado and New Mexico exhibit poor pre-treatment fit, and thus we do not report results from these two states. The inadequate pre-treatment fitting results could be due to inconsistencies in the weekly reporting of vaccinations. All the other states have an RMSPE lower than 1 percentage point, which suggests a reasonable fit. We find statistically significant positive impacts on vaccination rollout for West Virginia, Oregon and Washington, and positive effects, but not statistically significant, for Ohio, New York, Maine, Illinois and Maryland. The effects for the other states are small, or even negative for Delaware, Arkansas, and Louisiana but none of the results is statistically significant.

In the post-treatment period, however, the size of the effects tends to decrease in all states and estimates are not statistically significant for any state. Therefore, there is some evidence that the effect of the policy was temporary, and, after the end of the lotteries, no long-term effect is estimated. Note that we cannot evaluate post-treatment effects from Kentucky, Nevada, New Mexico, Illinois and Michigan because lotteries ended after August 24th, 2021.

We also conduct a macro-region analysis, considering spatial aggregates of states. We focus on four US macro-regions, estimating ATT in each of them:

  • East Coast (New York, Maryland, Delaware, Maine)

  • Southern (Louisiana, North Carolina, Arkansas)

  • Midwest (Ohio, Illinois, Kentucky, Michigan, West Virginia)

  • West Coast (California, Oregon, Washington)

In this macro-regions analysis, we evaluate only the effect for the treatment period, as some of the post-treatment effects are missing due to the end of the observation period.

Table 6 shows our findings. Results are consistent with those we find at the county level. Midwest states have benefited more from the policy, with positive and significant results. This group comprises states that join the policy either early (Ohio), or fairly late (West Virginia, Illinois, Michigan).

We find some positive results, although not statistically significant, in the East Coast area, and in the West Coast.

Table 6 Effects on the share of over-18 vaccinated citizenship for US macro-regions

It is worth noting that West Coast area comprises Washington and Oregon, in which the policy has positive and statistically significant results, and California, in which small or even negative effects were estimated. Negative, but not statistically significant results come from Southern states of the US, as seen in Louisiana and Arkansas ATT.

5.2 Staggered adoption results

In addition to causal effects for each state, and for major macro-regions, we assess aggregate effects for states that began vaccine lotteries simultaneously, or within the same short time interval.

This analysis allows us to evaluate whether the early introduction of the policy gives some novelty effect. If so, we expect that the states who adopt the policy at early stages have better outcomes than the states adopting the policy later on. The considerable media buzz coming after the Ohio announcement could, at least partially, explain the heterogeneity of the effects between early and late policy adoption.

We consider the following groups of states:

  • Early Bird states: adoption within May 20 (Ohio, New York, Oregon, Delaware)

  • Second Echelon states: adoption by the end of May (Maryland, California, Arkansas)

  • Third Echelon states: adoption within mid-June (Washington, Kentucky, North Carolina)

  • Latecomers states: adoption by the end of June (Louisiana, Nevada, Maine, Illinois, West Virginia, MichiganFootnote 1)

We excluded Colorado and New Mexico counties from this analysis because of the unreliability of their pre-treatment fit.

Table 7 Effects on the share of over-18 vaccinated citizenship, according to timing announcement of the lottery

Table 7 shows the ATT and p-values for the four groups of states, classified according to the timing of policy announcements. Figure A1 in the e-appendix reports the time series of their treatment effects.

For Early Bird states and Third Echelon states the policy seems to have induced an additional 1% of the population vaccinated against Covid-19, and these results are statistically significant. Small and statistically negligible results emerge for Second Echelon states and for Latecomers.

Results suggest that there may have been a novelty effect, especially for Early Bird states. However, even if the June effect results are smaller than those in May, the positive sign of the effect should suggest an intrinsic positive effect from the policy, untied from the media attention. In general, multi-level analysis of individual and pooled treatment effects suggest that higher level of aggregation for treatment effects grants better fit for estimation of missing quantities, table A5 in the e-appendix shows the mean RMSPE and size of treatment group for each level of analysis.

5.3 Treatment heterogeneity analysis

In section 4.4, we have identified six clusters of treated counties, according to socio-economic characteristics. We now estimate ATT within those six clusters.

Figure 7 shows the boxplots for treatment and post-treatment effects in each cluster. We estimate positive effects in mean for clusters 1, 2, and 6, which account for about 65% of the treated units. Specifically, clusters 1 and 6, markedly Republican, showed promising results from the policy, in particular, post-treatment evaluation for cluster 6 remains positive, but with considerable variability. We got interesting results in cluster 2, which was wealthy, democratic and more educated on average, without persistence in the post-treatment. We got negative or no results in clusters 3 and 5 the two clusters with higher ethnic components, these clusters accounting for 20% of the total share. It seems that counties with higher shares of Hispanic and Afro-American citizens and lower wealth have worse policy outcomes. This result is consistent with previous findings in the literature, which found a link between ethnicity and vaccine hesitancy (Quinn et al. 2016; Reiter et al. 2020). This hesitancy appears to be unaffected by monetary incentives, in contrast to the idea that a financial incentive can alter the choices of the less wealthy population.

Generally speaking, we observe high variability within each cluster. Therefore, the background characteristics may not be enough to explain the heterogeneity of treatment effects.

Fig. 7
figure 7

Average Treatment Effect \(\Delta _i\) during treatment and post treatment, represented at county level, estimated according equation 6

6 Conclusions

Policies to foster vaccination through monetary transfers and lotteries have been discussed in the literature, which finds potential benefits in the face of ethical and equity issues, see for example (Jecker 2021; Dotlic et al. 2021; Kim 2021; Sprengholz et al. 2021). This study aims to explain the effects of incentives in the treated states, both at different levels of analysis (county, state, macro-region) and at different starting dates, providing an overall picture of policy outcomes. In particular, no previous study has focused on counties and the unique characteristics of each unit. We can see conditional cash lotteries as a viable health policy to entice people to get a vaccine jab. This is relevant, especially when the vaccination timing is a crucial variable in determining the outcome of the vaccination rollout. However, our results suggest that the outcome of such policies is not unidirectional. In particular, we find different results across counties belonging to the same state and between counties belonging to different states.

Washington, Oregon and West Virginia outperformed other states getting positive and significant results. Other states got positive, yet not statistically significant, effects (Ohio, New York and Illinois), while the remaining states get no to negative results.

The analysis of staggered adoption effects shows that causal effects are higher in states that adopted the policy early (Early birds states). Media-buzz effects around the policy can explain these findings, as hypothesized by Gorin and Schmidt (2015). Nevertheless, policies can still produce positive results, even in the case of late adoption, as in the case of West Virginia. Cross-bordering effects are also present. In particular, treated counties in the Midwest of the US seem to have benefited most compared with counties in three other macro-region (East Coast, West Coast and Southern US). The study of the effects after the lottery ending is also unprecedented. This analysis is relevant as it differentiates between counties and states that could have experienced permanent effects from the policy or only temporary ones. This dimension is decisive for the policymaker, adopting this incentive if they should achieve policy goals fast. We observe temporary positive effects in most of the treated counties, especially for the counties in the Midwest and Northwestern areas. However, we got lower effects in the Sunbelt, where monetary incentives did not convince people who are averse to the vaccine.

After initial success, maybe backed up by the media attention, the vaccine lotteries did not significantly affect the vaccination choices of Americans. Small and statistically negligible post-treatment results suggest that lotteries have induced people that would have been vaccinated at some point to anticipate the jab. Nonetheless, even short-term temporary impacts, such as the ones we find for US lotteries, may be meaningful when the health policy goal has to be achieved within a short time.