Point: Setting realistic expectations for the evaluation of intrauterine growth charts
Paediatric and Perinatal Epidemiology ( IF 2.8 ) Pub Date : 2023-11-03 , DOI: 10.1111/ppe.13017
Alice Hocquette 1 , Jennifer Zeitlin 1

The question of which growth chart should be used to identify fetuses and newborns at risk because of suboptimal growth was propelled into the scientific arena in the mid-2010s by the INTERGROWTH 21st and World Health Organization (WHO) projects to develop intrauterine growth standards (i.e. charts developed in low-risk populations with normal growth) [1, 2]. Previously, despite broad consensus in the obstetric community on the use of birthweight percentile cut-offs to screen for fetuses or newborns with growth anomalies (below the 10th percentile for small for gestational age (SGA) and above the 90th percentile for large for gestational age (LGA), with more extreme thresholds for severe cases: <3rd, <5th, >95th, >97th) [3], most recommendations did not specify which charts should be used to determine these thresholds. The publication of these new international charts, in tandem with the INTERGROWTH 21st project's stated aim to produce growth standards for global use, led to a plethora of research studies seeking to establish whether universal thresholds of SGA and LGA exist and to compare the performance of different charts. This research, which uses a variety of evaluation strategies and health outcomes to assess performance, has largely refuted the universal standard hypothesis but has also revealed the limits of growth charts for predicting health risks.

The study by John et al. [4], entitled ‘The clinical performance and population health impact of birthweight-for-gestational age indices at term gestation’ and published in this issue of Paediatric and Perinatal Epidemiology, adds to this ongoing debate about the choice and performance of intrauterine growth charts. Their study explores the ability of percentile thresholds from different intrauterine growth charts to identify term singleton liveborn infants with severe neonatal morbidity. Three different charts are compared: internal charts derived from their study population (singleton term live births in the United States from 2003 to 2017), the INTERGROWTH 21st newborn charts [1] and the WHO estimated fetal weight charts [2]. Their results corroborate previous work by showing that the percentages of SGA and LGA newborns differ depending on the chart. Furthermore, in models of the association between percentile values and risks of adverse neonatal outcomes, they find that all charts perform poorly at predicting severe neonatal morbidity at an individual and population level. However, before interpreting these results as evidence of the limits of growth charts, it is important to consider whether expectations are too high.

A first expectation is that intrauterine growth charts could be useful for predicting risks of all newborn morbidity. This study's neonatal morbidity outcome (5-minute Apgar score <4, neonatal seizures, need for assisted ventilation, and neonatal death) was chosen because of its relevance in describing morbidity among term live births. However, neonatal morbidity occurs among normally grown infants and can have no relationship with their growth, especially morbidity resulting from acute events, such as uterine rupture or cord prolapse. Including morbidities resulting from other aetiological mechanisms in the outcome will reduce estimates of a chart's predictive value, as growth percentiles are of no relevance for predicting their occurrence. Having a single composite outcome is also problematic because of differences in the causes and consequences of abnormal growth at the extremes of the percentile distribution. For instance, thresholds selected to predict morbidity resulting from growth restriction (<10th or <3rd percentiles) will not be of use in predicting morbidity from labour dystocia caused by macrosomia, typically defined as a birthweight over 4000 or 4500 g. This suggests the need for specific outcomes when evaluating the predictive value of low and high thresholds of the percentile distribution.
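
The dilution effect described above can be illustrated with a small worked example. This is only a sketch with invented counts: none of the numbers come from the study by John et al. or from any real cohort, and the function name `sensitivity` is ours. It assumes that growth-unrelated morbidity is spread across newborns in proportion to group size, so that 10% of it falls in the SGA group.

```python
def sensitivity(cases_flagged: int, cases_total: int) -> float:
    """Share of morbidity cases that fall in the flagged (here, SGA) group."""
    if cases_total <= 0:
        raise ValueError("cases_total must be positive")
    return cases_flagged / cases_total


# Hypothetical cohort: 10,000 term births, of which 10% are flagged as SGA.
# Every count below is invented for illustration only.
growth_related_cases = 50    # morbidity caused by growth restriction
growth_related_in_sga = 30   # assumed to cluster among SGA newborns
unrelated_cases = 150        # e.g. morbidity from acute intrapartum events
unrelated_in_sga = 15        # 10% of 150, i.e. independent of growth

# Sensitivity of the SGA threshold for a growth-specific outcome
sens_specific = sensitivity(growth_related_in_sga, growth_related_cases)

# Sensitivity of the same threshold for a composite outcome
sens_composite = sensitivity(
    growth_related_in_sga + unrelated_in_sga,
    growth_related_cases + unrelated_cases,
)

print(f"growth-specific outcome: {sens_specific:.3f}")
print(f"composite outcome:       {sens_composite:.3f}")
```

Under these assumed counts, the same threshold identifies 60% of growth-related cases but less than a quarter of the composite outcome, although nothing about the chart has changed.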

Selecting an outcome that is appropriate for evaluating the performance of growth charts in predicting morbidities associated with fetal growth abnormalities is complex, as most neonatal morbidities can have many causes. One solution that we used for evaluating the predictive value of estimated fetal weight charts to detect morbidity associated with fetal growth was to associate the occurrence of neonatal morbidity with low or high birthweight percentiles defined using several thresholds (3rd and 10th percentiles for SGA births and 90th and 97th percentiles for LGA births) [5]. This definition assumes that morbidities associated with restrictive and excessive growth are more likely to occur among newborns with birthweight below and above these thresholds, respectively. However, it can be criticised because appropriate-for-gestational-age newborns with growth abnormalities are excluded. Newly proposed definitions of fetal growth restriction that add clinical or biological criteria (Doppler velocimetry findings, antenatal diagnoses of fetal growth restriction, and placental pathology) and other anthropometric measurements to birthweight percentiles to better distinguish growth restriction from constitutional size could help in refining appropriate outcomes for prediction studies [6-8].

A second expectation is that observational studies of birthweight can provide robust estimates of a chart's performance. Growth monitoring is a component of antenatal care in all high-income countries and leads to interventions to prevent stillbirth and neonatal morbidity. Antenatal suspicion of growth restriction or fetal overgrowth changes obstetric care and can influence gestational age and birthweight (the exposures) and morbidity (the outcome) in prediction studies. The complexity of investigating this question is exacerbated in the sub-population of term live births, as growth screening affects selection into the sample. Effective screening for growth abnormalities can result in decisions to induce delivery before term (whereby the newborn would not be in this sample) or at early term (the newborn is in the sample, but risks of severe neonatal morbidity are mitigated). In an optimal, albeit unrealistic, scenario where all cases of suboptimal growth are detected and appropriately managed, birthweight percentiles could have no value for predicting severe neonatal morbidity at term, despite the high screening performance of growth charts. Stillbirths also create selection biases when only live births are studied: high stillbirth rates may reduce risks of adverse neonatal outcomes in the liveborn population, whereas, conversely, the indicated delivery of a fetus with severe growth abnormalities to avoid a stillbirth might increase neonatal morbidity.

Despite the difficulties in interpreting the study's results about the predictive value of growth charts, the study by John et al. provides important knowledge about risks associated with birthweight across the birthweight-for-gestational age percentile distribution in the understudied population of term newborns. The authors employed innovative methods to resolve issues of ounce and digit preferences and to model the association between birthweight percentiles and neonatal morbidity at each week of gestation using nonlinear regression models. Their models make it possible to calculate, for each gestational age, the percentile at which morbidity is lowest and the percentiles at which morbidity odds increase by 10%, 50% or 100%. They find that severe neonatal morbidity is only substantially affected by very low and very high birthweight percentiles, with about 2% of infants at each extreme having higher rates of these adverse outcomes. Across other percentiles, risks are stable. These results illustrate that severe morbidity at term is not associated with birthweight for most infants born in the US with birthweights over the 3rd percentile and less than the 97th percentile. These results can orient preventive action, including evaluating growth screening programmes, and could serve as a benchmark for international comparisons with countries that have different screening practices and different levels of term mortality and morbidity.
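
The kind of calculation described above (locating the percentile with the lowest risk and the percentiles at which odds rise by a given amount) can be sketched as follows. This is a minimal illustration, not the authors' method: `toy_log_odds` is a hypothetical, arbitrarily shaped curve standing in for a fitted model at a single gestational week, and `nadir_and_thresholds` is a name we introduce here. The search is a simple scan over whole percentiles 1 to 99.

```python
import math


def toy_log_odds(percentile: int) -> float:
    """Hypothetical U-shaped log-odds of morbidity (NOT the study's model)."""
    return -5.0 + 2e-7 * (percentile - 50) ** 4


def nadir_and_thresholds(log_odds, increases=(0.10, 0.50, 1.00)):
    """Find the lowest-risk percentile and where odds rise by each increase.

    Returns (nadir, {increase: (low_percentile, high_percentile)}), where
    low/high are the percentiles closest to the nadir on each side at which
    the odds are at least (1 + increase) times the odds at the nadir.
    """
    grid = range(1, 100)
    nadir = min(grid, key=log_odds)
    base = log_odds(nadir)
    thresholds = {}
    for inc in increases:
        # An odds ratio of (1 + inc) is a difference of log(1 + inc) in log-odds
        cutoff = base + math.log(1.0 + inc)
        low = max((p for p in grid if p < nadir and log_odds(p) >= cutoff),
                  default=None)
        high = min((p for p in grid if p > nadir and log_odds(p) >= cutoff),
                   default=None)
        thresholds[inc] = (low, high)
    return nadir, thresholds


nadir, thresholds = nadir_and_thresholds(toy_log_odds)
print("lowest-risk percentile:", nadir)
for inc, (low, high) in thresholds.items():
    print(f"odds +{inc:.0%}: below P{low} or above P{high}")
```

A value of `None` for either side would mean that the odds never reach the cutoff within the 1st to 99th percentile range. With a real fitted model, the same scan would be repeated for each week of gestation.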

Another contribution of this study is to show that the percentages of SGA and LGA fetuses differ depending on the chart used, but that risk patterns remain the same. This finding of variation in SGA/LGA percentages between charts has led professional and scientific societies to urge that charts be validated to ensure that they ‘fit’ the population before local use [9, 10]. Local validation studies were also recommended by the WHO group when they published their estimated fetal weight charts [2]. In this more straightforward approach to evaluating performance (previously, the main one employed by researchers developing charts), the aim is to assess whether the chart accurately describes the birthweight (or estimated fetal weight) distribution of the population in which it will be applied. In other words, are 10% of fetuses/newborns below the 10th percentile and 10% above the 90th percentile?
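
A chart's 'fit' in this sense can be checked with a simple tally. The sketch below is illustrative only: the function name `flagged_shares`, the threshold values in `toy_chart` and the records in `toy_births` are all hypothetical and are not taken from INTERGROWTH 21st, the WHO charts or any real dataset. It assumes the chart is available as 10th and 90th percentile weights per completed week of gestation.

```python
def flagged_shares(births, chart):
    """Return the shares of births below P10 and above P90 of a chart.

    births: list of (gestational_week, birthweight_g) tuples
    chart:  dict mapping gestational_week -> (p10_g, p90_g)
    """
    if not births:
        raise ValueError("no births supplied")
    n_sga = n_lga = 0
    for week, weight_g in births:
        p10_g, p90_g = chart[week]
        if weight_g < p10_g:
            n_sga += 1
        elif weight_g > p90_g:
            n_lga += 1
    return n_sga / len(births), n_lga / len(births)


# Hypothetical thresholds in grams; not from any published chart.
toy_chart = {39: (2900, 3900), 40: (3000, 4000)}

# Hypothetical births as (week, weight in grams).
toy_births = [
    (39, 2800), (39, 3300), (39, 3950), (40, 3500), (40, 2950),
    (40, 3600), (39, 3400), (40, 3700), (39, 3100), (40, 4100),
]

sga_share, lga_share = flagged_shares(toy_births, toy_chart)
print(f"below P10: {sga_share:.0%}, above P90: {lga_share:.0%}")
```

Shares close to 10% at each tail would suggest that the chart describes the population well. In this toy sample, 20% fall in each tail, which would mean that the chart flags twice as many newborns as its percentiles imply.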

These evaluations do not provide any information about health risks, but they make it possible to quantify the number of fetuses or infants who would be flagged for closer surveillance in screening programmes using different thresholds. Furthermore, charts that accurately describe the population provide valuable information about risks associated with percentile thresholds for clinical care and research. This knowledge can inform screening decisions and serve as a basis for developing more complex screening algorithms using other clinical or biological parameters or adapted to selected populations, for instance, high-risk versus low-risk populations. This approach turns the attention of evaluation studies away from generic measures of chart performance (selecting the ‘best’ chart) to a focus on the use of charts for improving obstetric and neonatal care and ultimately child health and development. While designing research to answer these questions is challenging, it provides a more realistic framework for selecting the outcomes and populations for studies to evaluate the clinical performance and health impact of growth charts.



Updated: 2023-11-07