1 Introduction

Children of younger mothers have poorer outcomes for health, education, and employment than children of older mothers (e.g., Hoffman & Maynard, 2008). However, maternal age at childbirth correlates strongly with a range of other maternal characteristics, and confounding factors are a concern for any causal analysis. Furthermore, natural experiments and comparison groups are difficult to find. A large literature considers the impact of maternal age and teenage motherhood, in particular on the mothers’ own outcomes. Broadly, this literature follows two strategies: using sisters as a comparison group (Geronimus et al., 1994, Johansen et al., 2020a) or using women who miscarry and delay childbearing as a comparison group (Hotz et al., 2005, Miller, 2011, Ashcraft et al., 2013, Gorry, 2019, Markussen & Strøm, 2022). Although the evidence is mixed, the causal effects are nowhere near as strong as the raw correlations between teenage motherhood and maternal outcomes. We build on the literature on the mother’s own outcomes and study the impact of teenage motherhood on the children.

Our contribution is twofold. Firstly, we add to a small and growing literature on the impacts on children of teenage mothers. Secondly, we exploit unique registry-based data to compare the identifying assumptions and results from two different data-demanding empirical strategies: a cousin strategy (comparing maternal cousins) and a miscarriage strategy (comparing teenage mothers and mothers who miscarried in their first, teenage pregnancy). Our analyses stress similarities and differences across the two strategies in one context and one time period, and thus contribute to an understanding of what drives the results in the literature.

One strand of literature has explicitly examined the effects of teenage motherhood on child outcomes. This literature typically accounts for adverse maternal selection by exploiting within-sibling and within-cousin variation in mothers’ age at childbirth. An early example is Francesconi (2008), who compares outcomes of British siblings born to a teenage mother versus outcomes of younger siblings born when the mother was older and finds significant differences in adult outcomes. Similarly, Perez-Alvarez and Favara (2023) exploit within-sibling variation for India and find large effects that weaken over time but stay statistically significant. They interpret this as evidence that the institutional context and social safety net in low- versus high-income countries matter. Aizer et al. (2022) instead use within-cousin variation. They find that the adverse relationship between teenage motherhood and children’s long-term outcomes declines substantially when including maternal grandparent fixed effects.Footnote 1 This strategy assumes that it is random which of a pair of sisters becomes a teenage mother, conditional on control variables, and hinges on rich control variables being available (Holmlund, 2005). Aizer et al. (2022) quantify the selection on unobservables in the case where the true effects are zero and conclude that “negative selection into teen motherhood explains much but probably not all of the worse outcomes observed for their offspring”. Basu and Gorry (2021) add to the literature by using miscarriages in a US context to study the impact of teenage motherhood on child health. In contrast to most of the fixed-effects studies, they find no negative effects. We contribute to this strand of literature by using both empirical strategies in the same period, in the same institutional context, and with rich data. In this way, we offer insights into why the fixed-effects strategy finds negative effects, while the miscarriage strategy does not.

Another strand of literature has studied the effects of maternal age at first birth on child outcomes more generally, with no particular focus on teenage mothers. This literature accounts for adverse maternal selection by exploiting exogenous variation in the timing of the first birth. Miller (2009) uses three types of biological fertility shocksFootnote 2,Footnote 3 as sources of exogenous variation and finds a positive effect of higher maternal age on first child’s cognitive outcome. Fredriksson et al. (2022) exploit plausibly exogenous variation from school starting age rules, which translates into variation in age at first birth. They find that increasing age at first birth reduces birth weight and gestation length, but no effects remain in the long run. This strand of literature focuses on maternal age in general, and therefore results are naturally dominated by mothers giving birth in their late twenties. We rely on similar ideas in terms of identifying variation, but with a focus on teenage mothers. In addition, our study reconciles the two literatures by explicitly exploring the importance of maternal age, i.e., length of delay, for the results.

There are several reasons why delaying childbirth in teenagers may benefit children. It gives the mother time to increase education and employment opportunities and thus improves the economic circumstances of the child. Furthermore, the mother has time to mature, improve parenting skills, and prepare a stable family unit. However, there are also reasons why earlier childbirth may be beneficial to children. A young mother naturally has younger parents who may play a larger grandparent role, and perhaps a young mother has more energy to raise children or she can better relate to young children. These mechanisms are at play across empirical strategies, but finding the exact causal chain is beyond the scope of this paper.

We use rich Danish registry data with information on three generations: the teenagers, their parents, and their children, combined with information on socio-demographic characteristics and pre-pregnancy behavior of the teenage mother, including use of mental health care and risky sexual behavior. This allows us to map out the selection into motherhood along a range of dimensions and to investigate the differences between the empirical strategies in detail. First, we examine the association between teenage motherhood and child outcomes, and then we address selection into teenage motherhood by both comparing cousins’ outcomes and exploiting miscarriages as a natural experiment inducing some mothers to postpone childbirth. Furthermore, we examine the counterfactual in terms of length of childbirth delay.

We find a strong association between teenage motherhood and child outcomes, which does not disappear after adjusting for background characteristics. However, when we apply the cousin strategy and the miscarriage strategy, we find no or limited effects of teenage motherhood on children’s health and educational outcomes. In particular, when we use women delaying motherhood to their early twenties (up to around 22 years) as a counterfactual for teenage mothers, we show suggestive evidence that the effects of such delays are nil across outcomes for both strategies.

We structure the remainder of the paper as follows. Section 2 describes the data, Section 3 presents the empirical strategies, and Section 4 presents the main results. Section 5 concludes.

2 Data

This paper uses individual-level administrative Danish data from health and public school registries merged with socio-demographic information. A unique personal identifier links individuals across registries, and family identifiers link children to mothers and fathers as well as cohabiting partners. In this section, we describe the sources of the data and the selection into the timing of teenage motherhood.

Figure 1 presents plots of simple descriptive associations between the timing of motherhood and children’s outcomes for all children born in Denmark. These plots show a strong association between the timing of motherhood and children’s outcomes. The left-hand side of Fig. 1 plots the proportion of children with a low birth weight (<2500 g) by maternal age at birth for Danish children, and the right-hand side of Fig. 1 shows a corresponding plot for average reading test scores in grade 2. The figure shows that the fraction of low birth weight babies born to 17-year-old mothers is 2 percentage points higher than for babies born to 30-year-old mothers (7% vs. 5%), whereas grade 2 reading scores are more than half a standard deviation (SD) lower. The latter gap compares to the gap between children of mothers with master’s/PhD degrees versus mothers with no more than high school (see Beuchert & Nandrup, 2018).

Fig. 1 Outcomes for children by mother’s age at childbirth. Note. For birth weight, the sample includes all children born in Denmark from 1990 to 2017. For reading test scores, the sample includes all children who took the national tests in reading, grade 2, in 2010 and 2011.

In this paper, we focus on first-born, singleton children born in Denmark from 2000 onward with mothers who are no more than 26 years old at the time of childbirth. Age 26 corresponds to the point in Fig. 1 where the relationship between child outcomes and mother’s age flattens. Thus, this paper focuses on the potential disadvantage in child outcomes associated with having a younger mother. The specific age limit of 26 is chosen to ensure a balanced sample across motherhood timing, given the data restrictions. We later document that the potential bias due to the censoring at 26 is likely to be small in this context.

Information on abortions and miscarriages is available from 1994, and information on births is available until 2017.Footnote 4 The information comes from two sources. First, The National Patient Registry (NPR, 1994–2017) and The National Health Insurance Service Registry (HISR, 1990–2018) provide high-quality information on miscarriages and abortions. Specifically, the NPR includes information on hospital contacts (e.g., date and diagnosis) and the HISR includes information on contacts with general practitioners and specialists (weekly services provided). Second, The Birth Registry provides information on the exact date of birth for the child, and we link this to the mother’s exact date of birth to find the mother’s age at the birth of the child. Figure A.1 illustrates the sample selection.

2.1 Background characteristics

Information on background characteristics comes from population, education, employment, and health registries. Table 1 shows selected background characteristics for three different samples of women who had all become mothers by age 26. The first column includes information on the full sample of women who had become mothers by age 26. The second column includes information on the subsample of mothers who were pregnant at some point during their teenage years. The teenage pregnancies end with either a childbirth, a miscarriage, or an abortion. In our sample, 59% of the teenage pregnant mothers actually become teenage mothers (third column). We define teenage mothers as mothers who give birth by age 19 + 9 months (corresponding approximately to conception while the mother was a teenager). The vast majority of teenage mothers give birth in their first pregnancy, but some women miscarry or have an abortion and still have a child shortly after in a subsequent pregnancy while still in their teens. The remaining part of the sample miscarry or have an abortion in their first pregnancy and have a child at age 20–26.Footnote 5

Table 1 Selected background characteristics for different groups of mothers

Table 1 shows that teenage pregnant mothers are a negatively selected subset of all mothers, both in terms of behavior and socio-economic characteristics. For instance, the likelihood of having started high school at or before age 17 is about half as high for teenage pregnant women as in the full sample.Footnote 6 Furthermore, the mental health of teenage pregnant women is also poorer. The third column shows that those who become teenage mothers constitute an even more negatively selected sample. The negative selection into teenage motherhood extends to the children’s fathers: In the full sample, 24.4% of children have a father who started high school before or at age 17, whereas this is true for only 13.2% of children with a teenage pregnant mother and 9.7% of children with a teenage mother. Similar differences exist for the paternal grandfathers’ education.

2.2 Child outcomes

We study the effects of being born to a teenage mother on early child health and education. We define four outcomes: birth weight, injuries at ages 0–5, and test scores in reading and math.

2.2.1 Birth weight

We use information on birth weight in grams from The Birth Registry. Birth weight is closely tied to short and long-run outcomes, including IQ (Black et al., 2007).

2.2.2 Early injuries

We use information on injuries from the NPR (ICD-10-DCR codes S00–S99 or T08–T14, primary and secondary diagnoses). We do not exclude injuries due to self-harm. Injury is the dominant cause of death in children after age 1, and emergency ward visits represent a sizeable cost to society (Denmark: REHPA, 2017; US: Currie & Hotz, 2004). Early injuries reflect both the quality of care and the safety of the environment (Currie & Hotz, 2004), while at the same time reflecting risky health behavior in children and parents associated with symptoms consistent with Attention-Deficit/Hyperactivity Disorder (ADHD) (Dalsgaard et al., 2015; Wimberley et al., 2022). Our main outcome is an indicator variable for whether the child has had an injury by age 5.

2.2.3 Test scores

We use test scores in reading in grade 2 and math in grade 3. Public school pupils have been tested in reading (in grades 2, 4, 6, and 8) and math (in grades 3, 6, and recently also grade 8) since 2010. The tests estimate the student’s ability in three cognitive areas of each subject. For reading, the cognitive areas are language comprehension, decoding, and reading comprehension. For mathematics, the cognitive areas are numbers and algebra, geometry, and applied mathematics. The tests are IT-based, self-scoring, and adaptive.Footnote 7Footnote 8 Principals may exempt some students from the tests.

We first standardize the ability measures in the population within year, grade, subject, and cognitive area (mean 0, SD 1); we then sum the standardized measures for the three cognitive areas in each subject; and, finally, we standardize the final measures in the population (mean 0, SD 1). This is our measure of average student ability scores.
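The three standardization steps can be sketched in code. The following is a minimal illustration in Python with pandas, not the code used for the analysis. The column names (pupil_id, year, grade, subject, area, ability) are hypothetical, and the sketch assumes that the final standardization is carried out within year, grade, and subject, which the description above does not specify.

```python
import pandas as pd


def zscore(s: pd.Series) -> pd.Series:
    """Standardize to mean 0 and SD 1 using the population SD (ddof=0)."""
    return (s - s.mean()) / s.std(ddof=0)


def ability_score(df: pd.DataFrame) -> pd.Series:
    """Combine three cognitive-area measures into one standardized score.

    Expects one row per pupil and cognitive area with the hypothetical
    columns pupil_id, year, grade, subject, area, ability.
    """
    cell = ["year", "grade", "subject"]
    out = df.copy()
    # Step 1: standardize within year, grade, subject, and cognitive area.
    out["z_area"] = out.groupby(cell + ["area"])["ability"].transform(zscore)
    # Step 2: sum the standardized measures over the three cognitive areas.
    total = out.groupby(["pupil_id"] + cell)["z_area"].sum()
    # Step 3: standardize the sum (assumed here: within year, grade, subject).
    return total.groupby(level=cell).transform(zscore)
```

The second standardization is needed because the sum of three standardized measures has mean 0 but generally not SD 1.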

Figure 2 shows the relationship between birth weight, injuries, reading and math test scores and maternal age at childbirth for the sample of all mothers (left) and teenage pregnant mothers (right). Furthermore, the figure examines the extent to which observable characteristics explain the trends in Fig. 1. Formally, the figures show coefficients from regressions of an outcome on indicator variables for a mother’s age at childbirth, with teenage mothers as the omitted category.Footnote 9 Thus, the coefficients reflect the difference in child outcomes by maternal age at childbirth compared to a child with a teenage mother.
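As an illustration of the regression behind each dot, the sketch below regresses an outcome on indicators for maternal age at childbirth with teenage mothers as the omitted category. It is a simplified example rather than the estimation code: the function and argument names are hypothetical, control variables are left out, and the sample is assumed to be restricted to mothers no older than last_age.

```python
import numpy as np


def age_gap_coefficients(y, age, first_age=20, last_age=26):
    """OLS of an outcome on maternal-age indicators.

    Mothers younger than first_age form the omitted category, so each
    coefficient is the gap relative to children of teenage mothers.
    Assumes no mother in the sample is older than last_age.
    """
    y = np.asarray(y, dtype=float)
    age = np.asarray(age)
    ages = list(range(first_age, last_age + 1))
    # Design matrix: intercept plus one indicator per age first_age..last_age.
    dummies = [(age == a).astype(float) for a in ages]
    X = np.column_stack([np.ones(len(y))] + dummies)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return dict(zip(ages, beta[1:]))
```

Without control variables, each coefficient equals the difference in mean outcomes between children of mothers of that age and children of teenage mothers; adding controls to the design matrix would give the adjusted gaps.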

Fig. 2 Child outcomes by maternal age at childbirth. Note. Dots show the coefficient from a regression of the outcome on indicators for age at first childbirth (reference group is age <20 years). Blue and red dots refer to separate regressions, red dots reflecting coefficients from a regression including control variables as listed in Table A.1 and indicators for missing control variables. The full sample includes singleton, firstborn children born from 2000 to 2017 with a mother born 1974–2005, who was no more than 26 years old at childbirth. The teenage pregnant sample includes mothers with their first pregnancy as a teenager ending in either birth, miscarriage, or abortion. Vertical lines show 95% confidence bands.

In the figure for all mothers, the plots for the estimations without covariates (left-hand side, blue dots) recover the pattern from Fig. 1 for the sample used in our analyses. Test scores show a gap of around 0.5 SD between the test score of a child with a teenage mother and a child with a 26-year-old mother. As maternal age approaches teenagehood, the test score gap is gradually reduced and is zero at maternal age 20. Birth weight shows a gap of 100 grams for a 26-year-old mother compared to a teenage mother. Again, the gap is reduced as maternal age approaches teenagehood. This pattern is consistent with Fig. 1, despite showing birth weight and not the probability of a low birth weight. A similar pattern is seen for experiencing an injury by age 5. When we include covariates in the estimations (left-hand side, red dots; see Tables 1 and A.1), we eliminate around two-thirds of the negative association. However, a significant gap remains, and for test scores it is substantial. The right-hand figures address selection for teenage pregnant mothers; here, non-teenage mothers are those who did not have a child in their first pregnancy because of either a miscarriage or an abortion. Comparing the raw correlations without covariates (right-hand side, blue dots) to the raw correlations for all mothers (left-hand side, blue dots) shows that narrowing the sample of mothers reduces the negative association by about a third. One exception is injury by age 5, where there is no clear association with mother’s age at childbirth in the sample of teenage pregnant mothers. For the remaining outcomes, adding covariates reduces the associations to about the same level as in the full sample with covariates. Thus, combining the sample restriction with covariates does not account for any more of the association between the timing of motherhood and child outcomes than the covariates already do in the full sample.

The estimations behind the red dots in Fig. 2 include all observable characteristics. However, Table A.4 in Online Appendix A addresses the question of which variables matter for selection into maternal age at childbirth. The table shows that socio-economic characteristics—as captured by whether the maternal grandmother and grandfather are out of the labor market and have an education above high school—are important for maternal age at childbirth. This is also the case for individual mother characteristics, such as whether a mother had started high school by the age of 17, is a non-western immigrant or descendant, whether her parents lived together, as well as the mother’s and the father’s behavioral characteristics.

To sum up, Fig. 2 shows that observable characteristics eliminate part of the association between child outcomes and maternal age. However, a substantial (for reading and math) and significant (for reading, math, and birth weight) association remains. One explanation for the remaining association may be a negative causal effect on child outcomes of being born to a younger mother. Another explanation may be that unobservable characteristics still confound the effect. Examples of such unobservable characteristics are the norms and values present during a mother’s upbringing or a mother’s risky behavior during adolescence. Our empirical strategy addresses these confounders.

3 Empirical strategy

We use two empirical strategies to address the effect of young maternal age on child outcomes. These strategies are commonly used in the literature on the mothers’ own outcomes. The first strategy compares the outcomes of maternal cousins whose mothers timed their first childbirth to different ages. This strategy eliminates differences between mothers due to their own upbringing and family environment. The second strategy exploits a mother’s miscarriage in her first pregnancy as a natural experiment that mechanically increases a mother’s age at first childbirth. This strategy eliminates differences between mothers due to adolescent risky health behavior and attitude towards teenage motherhood. Below we present the two strategies formally.

3.1 Cousin strategy

This strategy restricts the sample to children with a maternal cousin who also fulfills the sample selection criteria (“cousin sample”)Footnote 10 and runs a regression with maternal grandmother fixed effects:

$$Y_{ij} = \alpha + \beta \cdot Teenmom_{ij} + \gamma \cdot X_{ij} + \lambda_j + \epsilon_{ij}$$
(1)

Yij is the outcome of child i in family j (as identified by the maternal grandmother) and Teenmomij is an indicator for whether the mother of child i in family j became pregnant with child i while still a teenager (maximum age 19). Thus, mothers aged 20 at childbirth may be either teenage mothers or non-teenage mothers, while ages below 20 uniquely identify teenage mothers and ages above 20 uniquely identify non-teenage mothers. The coefficient of main interest is β, which is identified from pairs of cousins where one has a mother who became a teenage mother while the other has a mother who had a child between age 20 (+nine months) and 26. The cousins born to a non-teenage mother are the counterfactuals for the children of teenage mothers. Xij includes individual characteristics of child i that vary within family j and are thought to affect both selection into teenage motherhood and child outcomes. λj is maternal grandmother fixed effects and ϵij is an idiosyncratic error term.
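A regression with maternal grandmother fixed effects can be illustrated with the within transformation. The sketch below is a minimal example under stated assumptions, not the estimation code: the function and argument names are hypothetical, the first column of X is taken to be the teenage-mother indicator, and standard errors (which would need to account for clustering within families) are not computed.

```python
import numpy as np


def within_family_ols(y, X, family):
    """Estimate slope coefficients with family fixed effects.

    Demeaning the outcome and the regressors within family (the within
    transformation) removes lambda_j; OLS on the demeaned data then gives
    the same slopes as including one indicator per family.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(y), -1)
    _, idx = np.unique(np.asarray(family).ravel(), return_inverse=True)
    size = np.bincount(idx)

    def demean(v):
        # Subtract each family's mean from its members.
        return v - (np.bincount(idx, weights=v) / size)[idx]

    y_w = demean(y)
    X_w = np.column_stack([demean(X[:, k]) for k in range(X.shape[1])])
    beta, *_ = np.linalg.lstsq(X_w, y_w, rcond=None)
    return beta
```

Families in which all cousins have the same treatment status have a demeaned indicator of zero and therefore do not contribute to the estimate of β, in line with identification coming from cousin pairs that differ in teenage motherhood.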

The individual characteristics include child gender and birth year as well as individual characteristics of the mother and the father. The individual controls include socio-demographic background characteristics of the maternal and paternal grandparents (age at first childbirth, an indicator for being unemployed or out of the labor market, and an indicator for whether they have education beyond high school) and behavioral characteristics of the mother and the father. The behavioral characteristics include indicators for treatment for chlamydia, indicators for mental health status and treatment (receiving a psychiatric diagnosis, attending a psychologist or attending a psychiatrist), and specifically for the mother, indicators for use of contraceptives (the pill or LARC) and attending OB/GYN. For both the mother and the father, we measure behavioral characteristics by age 18 in order to have comparable measures for teenage mothers as well as the comparison group. Note that the characteristics of the father are generally pre-determined at the birth of the child since fathers are only rarely below 19 years old. We gradually add the characteristics when we present the overall results. Table 2 reports means of all background characteristics for the identifying cousin sample.

Table 2 Background characteristics for the identifying samples

The strength of the cousin strategy is that it eliminates variation due to family characteristics shared by the children’s mothers, such as variation in the mothers’ upbringing and family environment, including norms and values. However, for the cousin strategy to provide credibly causal estimates, Xij should capture all selection into teenage motherhood between sisters that also affects how well their children fare. This is a strong assumption and one weakness of the strategy is that sisters may vary along dimensions not captured by observable variables. For example, Houmark et al. (2022) show that there is substantial variation in genetic potential for educational attainment within families. Another weakness of the strategy is that sisters may influence one another directly through their everyday interactions. One sister’s decision to become a teenage mother may influence the other sister’s behavior or fertility decision above and beyond what is already accounted for by conditioning on that sister’s behavior (see Holmlund, 2005). While it is unlikely that Xij captures all selection into teenage motherhood, we believe that the availability of behavioral characteristics (e.g., use of contraceptives, treatment for chlamydia, mental health) and own birthweight renders the assumption relatively plausible. We include the rich controls gradually in order to understand how they account for selection.

3.2 Miscarriage strategy

This strategy restricts the sample to mothers who became pregnant as teenagers and where the first pregnancy ended with a childbirth or a miscarriage (excluding abortions) (“miscarriage sample”). The strategy compares a child with a mother who gave birth in her first pregnancy to a child with a mother who miscarried in her first pregnancy. The miscarriage mechanically increases the mother’s age at first childbirth (to no more than 26 years in our analysis).

The estimating equation is:

$$Y_i = \alpha + \beta \cdot Birth_i + \gamma \cdot X_i + \epsilon_i$$
(2)

Yi is the outcome of child i and Birthi is an indicator that takes the value 1 if child i’s mother gave birth in her first pregnancy and 0 if she had a miscarriage. The coefficient of main interest is β, which is identified from variation between children whose mothers gave birth in their first pregnancy and children whose mothers did not. Teenage mothers give birth up to age 20 (but became pregnant with the child at a maximum age of 19), whereas “non-teenage mothers” may give birth at all ages at or below 26 (including below 20). Thus, this strategy does not examine a teenage/non-teenage cutoff. Instead, it examines whether mothers give birth or not at the early age that would be implied if the pregnancy were carried to term. Xi is a vector of covariates that affect both whether a mother gives birth in her first pregnancy and child outcomes. We include the same characteristics as for the cousin strategy as well as maternal grandmother and grandfather characteristics. However, we measure a mother’s behavioral characteristics prior to the first pregnancy and not at age 18. ϵi is an idiosyncratic error term. Means of background characteristics are included in Table 2.

The strength of the miscarriage strategy is that it eliminates bias from selection into teenage pregnancy as we compare mothers who became pregnant as teenagers only. To interpret the estimates causally, the strategy relies on two assumptions: (i) miscarriages are random conditional on covariates and (ii) all women who miscarried wanted the child (i.e., they did not intend to have an abortion but to give birth).

Regarding assumption (i), observed miscarriages may be non-random due to medical reasons or due to reasons related to registration of miscarriages. From a medical point of view, the cause of most miscarriages is unknown, and medical studies suggest that miscarriages result from a complex interplay between parental age and genetic, hormonal, immunological, and environmental factors. More than half of all miscarriages may be linked to chromosomal defects. High maternal age is a risk factor, but this is not a concern in our setting as all pregnancies occur in the teenage years. Other risk factors include infectious diseases, anatomic abnormalities, diabetes, and use of drugs, alcohol, and tobacco (Magnus et al., 2019; Andersen et al., 2000).

Miscarriages may also be non-random due to misreporting or imperfect registration. Misreporting is an inherent challenge for the strategy, but the universal health care coverage combined with registry data make information on miscarriages and abortions highly reliable, which avoids problems with self-reporting and imperfect recall.Footnote 11 Furthermore, home pregnancy tests, which are key in detecting a pregnancy, were common during the entire study period. We find it likely that the importance of home pregnancy tests carries over to observing a miscarriage in registries, because in the Danish context we expect that women will see a doctor when they realize they have had a miscarriage. Even if this is not the case, teenage pregnancies are rarely intended in the first placeFootnote 12 and therefore unrecognized early miscarriages are not so different from no conception at all.Footnote 13,Footnote 14

Assumption (ii) requires that all women who miscarried actually wanted a child, i.e., that they did not want an abortion. Essentially, a woman has two choices when she realizes she is pregnant: give birth to a child or have an abortion. Online Appendix Table A.2 shows that mothers who choose to give birth are different from mothers who choose an abortion in their first teenage pregnancy; i.e., women choosing an abortion are positively selected. However, we do not observe the choice between birth and abortion for mothers who miscarry. Nevertheless, we see that teenagers who miscarry most often become pregnant soon after the miscarriage, which suggests that a dominant fraction actually wanted a child. In our sample, 935 (corresponding to 38%) become teenage mothers in a later pregnancy and 1770 (corresponding to 72%) have a child by age 22 (see Online Appendix Table A.2 and Table 2). There is, however, still room for a substantial fraction who might have wanted an abortion.

Ashcraft et al. (2013) propose using miscarriages to create bounds on the true consistent estimate. The OLS strategy (the one we use) provides a lower bound on the estimated effect, and an IV strategy provides an upper bound on the estimated effect. The interpretation of the OLS strategy as providing a lower bound on the estimate comes from the fact that the OLS strategy uses all mothers who miscarry as a counterfactual for mothers who give birth. If some mothers who miscarry wanted an abortion—instead of giving birth—we would assume them (and their child, in this setting) to have better outcomes than teenage mothers (and their children). Thus, they do not constitute a suitable comparison group and would bias the estimate downwards. We focus on how much of the large negative association between teenage mothers and child outcomes can be attributed to selection. Thus, if we can eliminate a significant negative effect via a lower-bound estimation this suggests that the negative association is due to selection.Footnote 15
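The direction of this bias can be illustrated with a stylized calculation. The notation is introduced here for illustration only and abstracts from covariates. Suppose that miscarriages are random (assumption (i)), that a share p of the women who miscarry would otherwise have chosen an abortion, and that the children of these women would on average have outcomes that are δ > 0 higher than those of children whose mothers intended to give birth, irrespective of the timing of motherhood. Let μ denote the mean outcome without teenage motherhood for children of women who intended to give birth, and let β be the true effect. The raw comparison is then:

$$E[Y_i \mid Birth_i = 1] - E[Y_i \mid Birth_i = 0] = (\mu + \beta) - (\mu + p\delta) = \beta - p\delta \le \beta$$

Under these assumptions, the OLS comparison understates β by pδ and coincides with it only when p = 0, which is the sense in which it serves as a lower bound.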

3.3 Comparison of strategies

In order to further examine the identifying assumptions for the two strategies, we compare background characteristics across treatment status for the two samples. Table 2 shows background characteristics for the identifying cousin sample and for the miscarriage sample.

For the cousin sample, we see that 16% of teenage mothers had started high school by age 17 compared to 25% of non-teenage mothers. More strikingly, teenage mothers had a systematically different behavior in their teenage years. They more often used the pill (81.5% compared to 77.0%), used Long-Acting Reversible Contraception (LARC) (8.5% compared to 4.7%), were treated for chlamydia (19.2% compared to 16.6%), and had a psychiatric diagnosis (9.9% compared to 6.7%). In addition, the father of their first child was less likely to have started high school by age 17 (8.8% compared to 13.9%) and the paternal grandfather less often had an education above high school (6.1% compared to 8.6%). In our regression analysis, we control for all the background variables listed in Table 2.

For the miscarriage sample,Footnote 16 we find a more balanced sample along most dimensions. The teenager who gave birth after the first pregnancy and the teenager who experienced a miscarriage have very similar high school enrollment at age 17 (19.3% and 19.8%), the same probability of living in an unbroken family (48.7% and 48.6%), and similar grandparental background characteristics. Their use of contraception before their first pregnancy was similar (77.2% and 77.8% for the pill, and 3.9% and 4.5% used LARC). However, teenagers who gave birth were less likely to have been treated for chlamydia (17.7% compared to 20.9%).Footnote 17 In addition, 9.1% of maternal grandmothers had an education above high school in both cases. Hence, it is evident that the teenagers who experienced a miscarriage are very similar to those who gave birth after the first pregnancy, and both of these groups are vastly different from those choosing an abortion. In the latter case, 27.0% were enrolled in high school at age 17, and 13.1% of maternal grandmothers had an education above high school (see Online Appendix Table A.2). These observations speak in favor of the miscarriage approach.

Online Appendix Table A.4 inspects the balance formally in regressions of the treatment indicator on all the background characteristics. For the miscarriage sample, the characteristics are typically well balanced. For the cousin sample, however, the regressions show poor balance on the individual characteristics: the mothers who became teenage mothers are more likely to have used the pill and LARC, been treated for chlamydia, and received a psychiatric diagnosis. This holds to a much smaller extent for the miscarriage sample, where only the coefficients for the pill and psychologist/psychiatrist visits are significant, and these are much smaller in magnitude.

Non-teenage mothers vary greatly in terms of age at childbirth, and the length of childbirth delay is important for the interpretation of the results. In the third and sixth columns of Table 2, we report selected background characteristics for the non-teenage mothers who give birth no later than at age 22 and mothers who miscarried but still had a child no later than at age 22. These mothers have more unfavorable background characteristics than the full sample of counterfactual women, and for the cousin sample they appear to be more comparable to teenage mothers as regards the mothers’ and fathers’ individual characteristics. Note, however, that the grandparental characteristics are different compared to those of the teenage mothers because the families are different.

3.4 Length of delay

To investigate how results vary by the length of delay, we split the counterfactual (non-teenage mothers) into indicator variables for maternal age at childbirth.

For the cousin strategy, the estimating equation becomes:

$$\begin{array}{l}Y_{ij} = \alpha + \beta_1 Momage20_{ij} + \beta_2 Momage21_{ij} + \cdots \\ \qquad\quad\, +\, \beta_7 Momage26_{ij} + \gamma X_{ij} + \lambda_j + \epsilon_{ij}\end{array}$$
(3)

For the miscarriage strategy, the estimating equation becomes:

$$\begin{array}{l}Y_i = \alpha + \beta_1 Momage20_i + \beta_2 Momage21_i + \cdots \\ \qquad\quad\, +\, \beta_7 Momage26_i + \gamma X_i + \epsilon_i\end{array}$$
(4)

In both strategies, Y, X, and ϵ are defined as for models (1) and (2). MomageZ, for Z = 20, …, 26, is an indicator taking the value 1 if the mother gave birth at age Z. The omitted category is teenage mothers, and identification comes from comparing the outcomes of children of teenage mothers with those of children whose mothers gave birth at age Z. This part of the empirical analysis is merely descriptive because of potential selection into length of delay.
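To illustrate how the age indicators are read, consider the special case of Eq. (4) without the covariates X. Because the indicators are mutually exclusive and, together with the omitted teenage category, exhaustive in a sample capped at age 26, OLS then reduces to group means: α is the mean outcome for children of teenage mothers, and each β is the difference between the mean for age Z and that baseline. The sketch below is a minimal illustration under that simplification. The function names and the toy data are hypothetical, and teenage motherhood is assumed to mean a first birth at age 19 or younger.

```python
from collections import defaultdict


def age_category(age_at_birth):
    """Name of the regressor a mother falls into; teenagers are the omitted category."""
    if age_at_birth <= 19:  # assumption: teenage mother = first birth at 19 or younger
        return "teen"
    return f"Momage{age_at_birth}"


def saturated_coefficients(records):
    """OLS on a constant and age dummies only, computed via group means.

    records: iterable of (maternal_age, outcome) pairs.
    Returns (alpha, betas) with beta_Z = mean(Y | age Z) - mean(Y | teen).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for age, y in records:
        category = age_category(age)
        sums[category] += y
        counts[category] += 1
    means = {category: sums[category] / counts[category] for category in sums}
    alpha = means["teen"]
    betas = {category: mean - alpha
             for category, mean in means.items() if category != "teen"}
    return alpha, betas


# Made-up toy outcomes (e.g., standardized test scores), for illustration only.
toy = [(17, 0.0), (19, 0.2), (20, 0.1), (20, 0.3), (23, 0.4), (26, 0.6)]
alpha, betas = saturated_coefficients(toy)
print(alpha, betas)
```

Once covariates or the grandmother fixed effects of Eq. (3) enter, the coefficients no longer equal raw mean differences, so this shortcut only conveys how the omitted category anchors the comparison.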

4 Results

First, we present the overall results from the two strategies. Then, we present figures showing the variation by length of delay.

4.1 Overall results

Table 3 shows the overall results for the cousin strategy (columns 1–4) and the miscarriage strategy (columns 5–8). We include control variables gradually. The first column shows the raw correlations, adjusting only for child gender and birth year. In the estimations in the second column, controls for characteristics of the maternal grandparents are added (for the cousin strategy, this corresponds to including the maternal grandmother fixed effects). In the third and fourth columns, we have controlled for individual characteristics of the mother and father, respectively.

Table 3 Overall results

The raw OLS estimates for the cousin sample in column (1) largely mimic Fig. 2, with a teenage mother indicator instead of heterogeneous age effects. In column (2), fixed effects are added to the regression, which reduces the association by around three-quarters for all outcomes except the probability of an injury before age 5, which is only halved. Note that the fixed effects absorb the same amount of the association as the observable characteristics in Fig. 2. Columns (3) and (4) include individual characteristics of the mother and father, which reduces the association further. The father’s characteristics lead to larger reductions.Footnote 18 For test scores, the reduction primarily stems from the indicators for parents’ high school enrollment by age 17, whereas for birth weight and injuries the reduction also stems from the parents’ birth weight and behavioral characteristics (see the full set of results in Table A.5). For the final results, all point estimates are small in magnitude and statistically insignificant, with the exception of math scores and injuries. Column (4) indicates that math scores are 8.5% of a SD lower for children of teenage mothers, whereas the probability of injuries is 3.0 percentage points higher for children of teenage mothers. Both estimates are significant at the 10% level.

Column (5) presents raw OLS estimates for the miscarriage strategy. All the effects are small in magnitude and insignificant even without control variables. In general, none of the control variables change the estimates, in terms of either size or standard errors. Thus, for the miscarriage strategy, narrowing the sample to women at risk of teenage motherhood is the primary driver of the reduction of the negative association between teenage motherhood and child outcomes.Footnote 19

To sum up, we find that the overall conclusions from the cousin strategy and the miscarriage strategy are largely similar. There is no large systematic effect on child outcomes from being born to a teenage mother. The negative association between teenage motherhood and child outcomes stems from negative selection into teenage motherhood. For the miscarriage strategy, narrowing the sample to teenage pregnancies is sufficient to reach this conclusion. For the cousin strategy, individual control variables of the mother and father are important, and only scant significant effects remain. Interestingly, the similarity of results across the two empirical strategies speaks to the external validity of the results, as we reach similar overall conclusions for two different negatively selected control groups. If less detailed control variables were available (for example, in a country with less data than Denmark), one might come to a different conclusion.

4.2 Length of delay

Next, we investigate the role of the length of delay of motherhood to follow up on the initial, gradually increasing negative association between child outcomes and a younger mother that was shown in Fig. 1 and the selection into length of delay shown in Table 2. We use Eqs. (3) and (4), where age-specific motherhood indicators replace the dummy indicators for the cousin and miscarriage strategies. Note that this analysis is merely descriptive due to the selection into length of delay.

Figure 3 shows the distribution of the timing of motherhood in the samples. The left-hand histogram for the (full) cousin sample (the gray shaded area) mimics the general fertility pattern in the population; the majority of women give birth after their teenage years and age 26 is still well below the mean age at first childbirth in the general population, which is 29.5 years.Footnote 20 The black lines show the timing of motherhood for the identifying cousin sample, where at least one cousin had a teenage mother and at least one cousin did not. This histogram peaks at ages 19 and 20, which reflects the fact that most teenage mothers have a child close to the teenage cutoff and few have a child at age 17 or younger. For non-teenage mothers, the distribution of delay is relatively flat because sisters of teenage mothers are almost as likely to have a child at age 21 as at age 26. This pattern deviates from the overall increasing fertility pattern in the population.
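The restriction to the identifying cousin sample can be sketched as a simple group-and-filter step: with maternal grandmother fixed effects, only families containing both a teenage and a non-teenage mother contribute to the comparison around the teenage cutoff. The field names (`grandmother_id`, `mother_age`) and the toy records below are hypothetical, and the cutoff of age 19 is an assumption consistent with the omitted category in Eq. (3).

```python
from collections import defaultdict


def identifying_sample(children):
    """Keep children from families with variation around the teenage cutoff.

    children: list of dicts with hypothetical keys 'grandmother_id' and
    'mother_age' (mother's age at the birth of this firstborn child).
    """
    by_family = defaultdict(list)
    for child in children:
        by_family[child["grandmother_id"]].append(child)

    kept = []
    for family in by_family.values():
        # Assumption: teenage mother = gave birth at age 19 or younger.
        teen_flags = {child["mother_age"] <= 19 for child in family}
        if teen_flags == {True, False}:  # at least one teen and one non-teen mother
            kept.extend(family)
    return kept


# Toy cousins: family A has variation, families B and C do not.
cousins = [
    {"grandmother_id": "A", "mother_age": 18},
    {"grandmother_id": "A", "mother_age": 24},
    {"grandmother_id": "B", "mother_age": 22},
    {"grandmother_id": "B", "mother_age": 25},
    {"grandmother_id": "C", "mother_age": 17},
    {"grandmother_id": "C", "mother_age": 19},
]
identified = identifying_sample(cousins)
print(identified)
```

Families B and C still appear in the full cousin sample, but they carry no within-family variation in teenage motherhood, which is why the identifying sample has a different age distribution from the full one.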

Fig. 3

Distribution of age at childbirth in the samples. The sample for all includes singleton, firstborn children born from 2000 to 2017 with a mother born 1974–2005 who was no more than 26 years old. The sample for cousins includes children with a cousin who fulfills the sample criteria through the maternal grandmother, and the identifying sample narrows the cousin sample to cousins with variation around the teenage mother cutoff between the cousins. The sample for teenage pregnant mothers narrows the full sample to mothers who were pregnant as teenagers and then further splits the sample by the way this pregnancy ends: birth, miscarriage, or abortion

The right-hand histogram in Fig. 3 shows the age distribution for teenage pregnant mothers. The gray-shaded area shows the distribution for all teenage pregnant mothers (irrespective of how the pregnancy ends), which peaks at ages 19 and 20. This is because most teenage pregnant mothers become pregnant at age 18 or 19 and have a child in their first pregnancy. When we separate the teenage pregnant sample according to the way the first pregnancy ends, births (up to age 20) and abortions (after age 20) drive the overall pattern. For mothers who miscarry in their first pregnancy the distribution for age at first childbirth peaks at ages 19–22, which reflects that, on average, having a miscarriage causes only a short delay in childbearing. Furthermore, the distribution for mothers who miscarry highlights how the counterfactual group includes mothers who still become teenage mothers.

There is a potential bias due to the censoring at age 26. However, both the identifying cousin sample (left-hand side of Fig. 3) and the miscarriage sample (right-hand side of Fig. 3) show a declining age distribution as age approaches 26 years, which suggests that the bias is small.Footnote 21,Footnote 22

Figure 4 shows the results from using age-specific motherhood indicators to examine how effects vary with the counterfactual maternal age group. As there is selection into length of delay, a causal interpretation is not warranted.

Fig. 4

Effect of length of delay. Note. Dots show the coefficient from an equation similar to Eq. (3) for the cousin comparison and (4) for the miscarriage strategy. Blue and red dots refer to separate regressions. The full sample includes singleton, firstborn children born from 2000 to 2017 with a mother born 1974–2005 who was no more than 26 years old. The sample for miscarriages narrows this sample to mothers with their first pregnancy between age 12 and 19 ending with a childbirth or miscarriage. The sample for cousins includes children with a cousin who fulfills the sample criteria through the maternal grandmother. Control variables are the variables listed in Table 2 and indicators for missing variables

There are some general observations to be made based on Fig. 4. Firstly, the estimates with and without covariates are never significantly different. The largest differences in the estimates are for the miscarriage strategy for longer delays. Furthermore, the confidence bands for the cousin strategy are narrow compared to those for the miscarriage strategy. In particular, for longer delays the confidence bands are wide for the miscarriage strategy because relatively few women have long delays of childbirth, e.g., only 5% of 2460 women delay childbirth till age 26. The confidence bands for the cousin strategy for age 20 are wide because this bin includes only mothers who became pregnant at age 20 and had the child at age 20. The covariates do not affect the standard errors.
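A back-of-envelope calculation shows why few observations in a bin translate into wide confidence bands. The sketch below takes the quoted figures at face value (5% of 2460 women) and uses the textbook result that the standard error of a mean scales with one over the square root of the sample size. This is only a heuristic: the standard error of a regression coefficient also depends on the size of the omitted group and on the covariates, and the helper name `se_of_mean` is illustrative.

```python
import math

n_total = 2460        # number of women quoted in the text
share_age26 = 0.05    # share delaying childbirth until age 26, as quoted
n_age26 = round(n_total * share_age26)


def se_of_mean(sd, n):
    """Standard error of a sample mean with standard deviation sd and n observations."""
    return sd / math.sqrt(n)


# Ratio of standard errors for the age-26 bin versus a mean over all women,
# holding the outcome's standard deviation fixed at 1 (an arbitrary normalization).
ratio = se_of_mean(1.0, n_age26) / se_of_mean(1.0, n_total)
print(n_age26, round(ratio, 2))
```

Under these assumptions, a mean computed on the age-26 bin is several times noisier than one computed on all 2460 women, in line with the wide bands for long delays in the miscarriage strategy.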

For birth weight, there is not much of an effect for the cousin strategy of any length of delay, while for the miscarriage strategy the estimates are jumpy around zero and noisy with large standard errors. For injuries, the cousin strategy shows a significant, negative effect without covariates, but the covariates eliminate the effect. The effect size does not vary much with length of delay beyond 20. For the miscarriage strategy, the estimates are again jumpy around zero and noisy with no significant effect for any age, neither with nor without covariates.

For reading, the cousin strategy shows an upward trend and significant positive effects at ages 23–26, but the inclusion of covariates eliminates this effect. For the miscarriage strategy, the estimates are positive from age 22 and show a similar upward trend with larger point estimates. Even with covariates, this effect remains at ages 24 and 26.

For math, the results for the cousin strategy again show an upward trend, and coefficients are now significantly positive at ages 24–26, even when covariates are included. The same upward pattern is apparent for the miscarriage strategy until age 24 and coefficients are of a similar size, but not significant.

In the overall results in Table 3, the primary specification includes controls for the father’s characteristics. In this case, the father is considered a pre-determined choice at the time teenage motherhood status is determined. However, in a more dynamic setting that allows for re-optimization, the choice of father can be seen as endogenous. Thus, Online Appendix Tables A.6 and A.7 show results with and without father characteristics as controls. The tables show that it is primarily the fathers’ characteristics that drive the reduction of estimates for longer delays when covariates are added.

To sum up, we find that the estimated effects of being born to a teenage mother depend on the counterfactual, in particular the age of the mother constituting the counterfactual. For short delays (up to around age 22), there are no discernible negative effects of teenage motherhood on child outcomes. Again, the conclusions across the two strategies are surprisingly similar, despite the different assumptions, samples, and validity concerns.

5 Discussion and conclusion

This paper investigates the effects of teenage motherhood on the next generation, the children. We document a strong negative association between teenage motherhood and both health outcomes and educational outcomes of the children, as well as a strong selection into teenage motherhood. Controlling for the observable characteristics cannot completely eliminate the negative association. To address the remaining selection on unobservable characteristics, we employ two commonly used empirical strategies. One strategy eliminates differences in child outcomes due to differences in the mother’s upbringing and family environment, by comparing cousins whose mothers were sisters but timed their first childbirth differently. The other strategy addresses the mother’s selection into teenage pregnancy by comparing the firstborn children of mothers who both became pregnant as teenagers, but where one mother gave birth to a child while the other miscarried. Despite the different identifying assumptions, the strategies find strikingly similar results, namely that the large negative association between having a teenage mother and child outcomes almost disappears. This is the case for both health and educational outcomes of the children.

We show that there is selection into length of delay on observable characteristics and investigate how this factors into the result. When we use women delaying motherhood to their early twenties (up to around 22 years) as a counterfactual for teenage mothers, the effects of such delays are nil across outcomes for both strategies. Recent studies also find limited effects (Aizer et al. 2022, using the cousin strategy) or no effects (Basu and Gorry 2021, using the miscarriage strategy) for high-income countries. The fact that we find largely similar results for the two strategies when employed in the same context substantiates the overall conclusion and speaks to the external validity of the results.

For policy, our findings imply that it is important to target factors other than age itself, for instance more planning of and preparation for motherhood.