Radiomics analysis to predict pulmonary nodule malignancy using machine learning approaches
Thorax (IF 10), Pub Date: 2024-04-01, DOI: 10.1136/thorax-2023-220226
Matthew T Warkentin, Hamad Al-Sawaihey, Stephen Lam, Geoffrey Liu, Brenda Diergaarde, Jian-Min Yuan, David O Wilson, Sukhinder Atkar-Khattra, Benjamin Grant, Yonathan Brhane, Elham Khodayari-Moez, Kiera R Murison, Martin C Tammemagi, Kieran R Campbell, Rayjean J Hung

Background Low-dose CT screening can reduce lung cancer-related mortality. However, most screen-detected pulmonary abnormalities do not develop into cancer and it often remains challenging to identify malignant nodules, particularly among indeterminate nodules. We aimed to develop and assess prediction models based on radiological features to discriminate between benign and malignant pulmonary lesions detected on a baseline screen.

Methods Using four international lung cancer screening studies, we extracted 2060 radiomic features for each of 16 797 nodules (513 malignant) among 6865 participants. After filtering out low-quality radiomic features, 642 radiomic and 9 epidemiological features remained for model development. We used cross-validation and grid search to assess three machine learning (ML) models (eXtreme Gradient Boosted Trees, random forest, least absolute shrinkage and selection operator (LASSO)) for their ability to accurately predict risk of malignancy for pulmonary nodules. We report model performance based on the area under the curve (AUC) and calibration metrics in the held-out test set.

Results The LASSO model yielded the best predictive performance in cross-validation and was fit in the full training set based on optimised hyperparameters. Our radiomics model had a test-set AUC of 0.93 (95% CI 0.90 to 0.96) and outperformed the established Pan-Canadian Early Detection of Lung Cancer model (AUC 0.87, 95% CI 0.85 to 0.89) for nodule assessment. Our model performed well among both solid (AUC 0.93, 95% CI 0.89 to 0.97) and subsolid nodules (AUC 0.91, 95% CI 0.85 to 0.95).

Conclusions We developed highly accurate ML models based on radiomic and epidemiological features from four international lung cancer screening studies that may be suitable for assessing indeterminate screen-detected pulmonary nodules for risk of malignancy.

Data are available on reasonable request.
All data used in the present study may be made available on reasonable request to the Integrative Analysis of Lung Cancer Etiology and Risk (INTEGRAL) programme on approval by the Data Access Committee. The model reported in the study and example code are publicly available on GitHub ().
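
The abstract summarises discrimination as a test-set AUC with a 95% CI. As a rough illustration of what such a figure measures, the sketch below computes the AUC as the probability that a randomly chosen malignant nodule receives a higher predicted risk than a randomly chosen benign one, and attaches a percentile bootstrap interval. This is not the authors' code: the abstract does not say how the intervals were obtained, so the bootstrap is an assumption, the function names `auc` and `bootstrap_ci` are hypothetical, and the labels and scores are made-up toy values.

```python
import random


def auc(labels, scores):
    """AUC via pairwise comparison: the fraction of (malignant, benign)
    pairs in which the malignant nodule has the higher score.
    Tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one malignant and one benign case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method,
    not taken from the paper). Resamples cases with replacement and
    skips resamples that contain only one class."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo_idx = max(0, int((alpha / 2) * n_boot))
    hi_idx = min(n_boot - 1, int((1 - alpha / 2) * n_boot) - 1)
    return stats[lo_idx], stats[hi_idx]


# Toy data only: 1 = malignant, 0 = benign; scores are invented risks.
toy_labels = [0, 0, 0, 0, 1, 0, 1, 1]
toy_scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.70, 0.90]

point = auc(toy_labels, toy_scores)
low, high = bootstrap_ci(toy_labels, toy_scores)
print(f"AUC {point:.2f} (95% CI {low:.2f} to {high:.2f})")
```

The pairwise form is quadratic in the number of nodules, which is fine for a toy example; with a test set of thousands of nodules a rank-based computation would be the more practical choice.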

Updated: 2024-03-15