Explainable and visualizable machine learning models to predict biochemical recurrence of prostate cancer
Clinical and Translational Oncology (IF 3.4), Pub Date: 2024-04-11, DOI: 10.1007/s12094-024-03480-x
Wenhao Lu, Lin Zhao, Shenfan Wang, Huiyong Zhang, Kangxian Jiang, Jin Ji, Shaohua Chen, Chengbang Wang, Chunmeng Wei, Rongbin Zhou, Zuheng Wang, Xiao Li, Fubo Wang, Xuedong Wei, Wenlei Hou

Purpose

Machine learning (ML) models have shown excellent performance in prognosis prediction. However, their black-box nature has limited their clinical application. Here, we aimed to establish explainable and visualizable ML models to predict biochemical recurrence (BCR) of prostate cancer (PCa).

Materials and methods

A total of 647 PCa patients were retrospectively evaluated. Clinical parameters were selected using LASSO regression. The cohort was then split into training and validation datasets at a ratio of 0.75:0.25, and the BCR-related features were entered into Cox regression and five ML algorithms to construct BCR prediction models. The clinical utility of each model was evaluated using concordance index (C-index) values and decision curve analysis (DCA). In addition, SHapley Additive exPlanations (SHAP) values were used to explain the features in the models.
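To make the C-index mentioned above concrete, the following is a minimal sketch of Harrell's concordance index for right-censored data in plain Python. It is an illustration only, not the authors' code: the function name `concordance_index`, the toy follow-up data, and the choice to skip pairs with tied follow-up times are all assumptions made for this example.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (simplified sketch).

    times       -- follow-up time for each patient
    events      -- 1 if recurrence was observed, 0 if censored
    risk_scores -- model output, higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient `a` has the shorter follow-up.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplification (assumption): skip tied times. A pair is only
        # comparable when the shorter follow-up ended in an observed event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier event had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.4, 0.2]
print(concordance_index(toy_times, toy_events, toy_risk))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by risk, which is the scale on which the models in this study are compared.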

Results

We identified 11 BCR-related features using LASSO regression and then established five ML-based models, namely random survival forest (RSF), survival support vector machine (SSVM), survival tree (sTree), gradient boosting decision tree (GBDT), and extreme gradient boosting (XGBoost), together with a Cox regression model. The C-index values were 0.846 (95% CI 0.796–0.894) for RSF, 0.774 (95% CI 0.712–0.834) for SSVM, 0.757 (95% CI 0.694–0.818) for sTree, 0.820 (95% CI 0.765–0.869) for GBDT, 0.793 (95% CI 0.735–0.852) for XGBoost, and 0.807 (95% CI 0.753–0.858) for the Cox model. DCA showed that the RSF model had significant advantages over all other models. Regarding the interpretability of the ML models, SHAP values showed the tangible contribution of each feature in the RSF model.
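Decision curve analysis compares models by their net benefit across threshold probabilities. A minimal sketch of that calculation for a binary outcome at a fixed time horizon is shown here. It assumes no patient is censored before the horizon, which is a simplification (DCA for survival data usually relies on Kaplan-Meier estimates), and the function names and toy data are hypothetical rather than taken from the study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold.

    NB = TP/n - (FP/n) * threshold / (1 - threshold)
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    # Patients flagged as high risk who did / did not have the event.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat every patient regardless of the model."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data: observed recurrence (1/0) and predicted risk.
toy_outcome = [1, 1, 0, 0]
toy_risk = [0.9, 0.6, 0.4, 0.1]
for pt in (0.2, 0.5):
    print(pt, net_benefit(toy_outcome, toy_risk, pt),
          net_benefit_treat_all(toy_outcome, pt))
```

A model is useful at a given threshold when its net benefit exceeds both the treat-all curve and zero (the treat-none strategy). Plotting net benefit against the threshold for each model gives the decision curve.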

Conclusions

Our scoring system provides a reference for identifying BCR and for building a framework for personalized therapeutic decision-making in PCa.




Updated: 2024-04-11