Explicable prioritization of genetic variants by integration of rule-based and machine learning algorithms for diagnosis of rare Mendelian disorders
Human Genomics (IF 4.5), Pub Date: 2024-03-21, DOI: 10.1186/s40246-024-00595-8
Ho Heon Kim, Dong-Wook Kim, Junwoo Woo, Kyoungyeul Lee

Accurate assessment and prioritization of genetic variants is essential for finding the causative variant of a rare disease. Previous variant prioritization tools depend mainly on in-silico prediction of variant pathogenicity, which results in low sensitivity and makes the prioritization result difficult to interpret. In this study, we propose an explainable variant prioritization algorithm, named 3ASC, that offers higher sensitivity and annotates the evidence used for prioritization. 3ASC annotates each variant with the 28 criteria defined by the ACMG/AMP genome interpretation guidelines and with features related to the clinical interpretation of the variants. The system can explain its result based on the annotated evidence and on feature contributions. We trained various machine learning algorithms using in-house patient data. Variant ranking performance was assessed as the recall rate of identifying causative variants among the top-ranked variants. The best-performing model was a random forest classifier, which showed a top-1 recall of 85.6% and a top-3 recall of 94.4%. 3ASC annotates the ACMG/AMP criteria for each genetic variant of a patient so that clinical geneticists can interpret the result, as in the CAGI6 SickKids challenge. In that challenge, 3ASC identified causal genes for 10 out of 14 patient cases, with evidence of decreased gene expression for 6 cases. Among them, two genes (HDAC8 and CASK) had decreased gene expression profiles confirmed by transcriptome data. 3ASC can prioritize genetic variants with higher sensitivity than previous methods by integrating various features related to clinical interpretation, including features related to false positive risk such as quality control and disease inheritance pattern. The system allows each variant to be interpreted based on the ACMG/AMP criteria and on feature contributions assessed using explainable AI techniques.
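
The top-k recall metric mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes recall is computed per patient (a patient counts as a hit when at least one causative variant appears among that patient's k highest-scored variants), and the function names, variant IDs, and scores below are hypothetical toy values rather than data from the paper.

```python
from typing import Dict, List, Set


def rank_variants(scores: Dict[str, float]) -> List[str]:
    """Return variant IDs ordered from highest to lowest model score."""
    return sorted(scores, key=lambda variant: scores[variant], reverse=True)


def top_k_recall(
    rankings: Dict[str, List[str]],
    causal: Dict[str, Set[str]],
    k: int,
) -> float:
    """Fraction of patients with at least one causal variant in the top k.

    Assumption: recall is defined per patient, not per variant.
    """
    if not rankings:
        raise ValueError("no patients to evaluate")
    hits = sum(
        1
        for patient, ranked in rankings.items()
        if any(variant in causal[patient] for variant in ranked[:k])
    )
    return hits / len(rankings)


# Hypothetical classifier scores per patient (toy values, not from the paper).
scores_by_patient = {
    "patient_A": {"varA1": 0.9, "varA2": 0.4, "varA3": 0.1},
    "patient_B": {"varB1": 0.3, "varB2": 0.8, "varB3": 0.6},
    "patient_C": {"varC1": 0.7, "varC2": 0.2, "varC3": 0.5, "varC4": 0.1},
}

# Hypothetical ground-truth causative variants per patient.
causal = {
    "patient_A": {"varA1"},
    "patient_B": {"varB3"},
    "patient_C": {"varC4"},
}

rankings = {
    patient: rank_variants(scores)
    for patient, scores in scores_by_patient.items()
}

for k in (1, 3):
    print(f"top-{k} recall: {top_k_recall(rankings, causal, k):.3f}")
```

In this toy set only patient_A has the causative variant ranked first, while patient_B's causative variant is ranked second, so top-3 recall is higher than top-1 recall. The same pattern underlies the reported gap between 85.6% and 94.4%.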

Updated: 2024-03-21