Development of machine learning-based models to predict 10-year risk of cardiovascular disease: a prospective cohort study
Stroke and Vascular Neurology (IF 5.9), Pub Date: 2023-12-01, DOI: 10.1136/svn-2023-002332
Jia You, Yu Guo, Ju-Jiao Kang, Hui-Fu Wang, Ming Yang, Jian-Feng Feng, Jin-Tai Yu, Wei Cheng

Background Previous prediction algorithms for cardiovascular diseases (CVD) were established using risk factors selected largely on the basis of empirical clinical knowledge. This study sought to identify predictors from a comprehensive variable space and then employ machine learning (ML) algorithms to develop a novel CVD risk prediction model.

Methods From the UK Biobank, a longitudinal population-based cohort, this study included 473 611 CVD-free participants aged between 37 and 73 years. We implemented an ML-based, data-driven pipeline to identify predictors from 645 candidate variables covering a comprehensive range of health-related factors, and assessed multiple ML classifiers to establish a risk prediction model for 10-year incident CVD. The model was validated through leave-one-center-out cross-validation.

Results During a median follow-up of 12.2 years, 31 466 participants developed CVD within 10 years after baseline visits. A novel UK Biobank CVD risk prediction (UKCRP) model was established that comprised 10 predictors: age, sex, medication for cholesterol and blood pressure, cholesterol ratio (total/high-density lipoprotein), systolic blood pressure, previous angina or heart disease, number of medications taken, cystatin C, chest pain and pack-years of smoking. Our model achieved satisfactory discriminative performance, with an area under the receiver operating characteristic curve (AUC) of 0.762±0.010 that outperformed multiple existing clinical models, and it was well calibrated, with a Brier score of 0.057±0.006. Further, the UKCRP achieved comparable performance for myocardial infarction (AUC 0.774±0.011) and ischaemic stroke (AUC 0.730±0.020), but inferior performance for haemorrhagic stroke (AUC 0.644±0.026).

Conclusion ML-based classification models can learn expressive representations from potentially high-risk CVD participants who may benefit from earlier clinical decisions.

Data are available in a public, open access repository.
All data used in this study were accessed from the publicly available UK Biobank Resource under application number 19542. These data cannot be shared with other investigators.
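
The validation scheme named in the abstract, leave-one-center-out cross-validation scored by AUC and Brier score, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' pipeline: the data are synthetic, the single feature `x` and the four center labels are hypothetical, a one-feature logistic regression stands in for the ML classifiers, and the "mean ± SD across held-out centers" summary is only one plausible reading of the ± values reported above.

```python
import math
import random
from statistics import mean, stdev


def auc(y_true, y_score):
    """AUC as the probability that a random case outranks a random non-case (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier(y_true, y_prob):
    """Mean squared difference between predicted probability and observed outcome."""
    return mean((p - y) ** 2 for y, p in zip(y_true, y_prob))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=200):
    """One-feature logistic regression fitted by batch gradient descent (stand-in model)."""
    w, b, n = 0.0, 0.0, len(xs)
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y
            gw += err * x
            gb += err
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b


def leave_one_center_out(rows):
    """Yield (held_out_center, train_rows, test_rows), holding out each center once."""
    for c in sorted({r["center"] for r in rows}):
        train = [r for r in rows if r["center"] != c]
        test = [r for r in rows if r["center"] == c]
        yield c, train, test


# Synthetic cohort (hypothetical): 4 centers, 200 participants each,
# one standardized risk factor x, and a binary 10-year outcome y.
random.seed(0)
rows = []
for center in ["A", "B", "C", "D"]:
    for _ in range(200):
        x = random.gauss(0.0, 1.0)
        y = 1 if random.random() < sigmoid(1.2 * x - 2.5) else 0
        rows.append({"center": center, "x": x, "y": y})

per_center = []
for center, train, test in leave_one_center_out(rows):
    w, b = fit_logistic([r["x"] for r in train], [r["y"] for r in train])
    probs = [sigmoid(w * r["x"] + b) for r in test]
    ys = [r["y"] for r in test]
    per_center.append((center, auc(ys, probs), brier(ys, probs)))

aucs = [a for _, a, _ in per_center]
briers = [s for _, _, s in per_center]
print(f"AUC   {mean(aucs):.3f} ± {stdev(aucs):.3f}")
print(f"Brier {mean(briers):.3f} ± {stdev(briers):.3f}")
```

Holding out a whole assessment center at a time means the model is always scored on participants from a site it never saw during fitting, which is a stricter check than a random split. In practice the same loop is usually written with scikit-learn's `LeaveOneGroupOut`, `roc_auc_score` and `brier_score_loss`; the pure-Python versions here only keep the sketch dependency-free.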

Updated: 2023-12-01