Optimising machine learning prediction of minimum inhibitory concentrations in Klebsiella pneumoniae
Microbial Genomics (IF 3.9), Pub Date: 2024-03-26
Gherard Batisti Biffignandi, Leonid Chindelevitch, Marta Corbella, Edward J. Feil, Davide Sassera and John A. Lees

Minimum inhibitory concentrations (MICs) are the gold standard for quantitatively measuring antibiotic resistance. However, lab-based MIC determination can be time-consuming and suffers from low reproducibility, and interpreting a result as sensitive or resistant relies on guidelines that change over time. Genome sequencing and machine learning promise to allow in silico MIC prediction as an alternative approach that overcomes some of these difficulties, although the predicted MIC still needs to be interpreted. Nevertheless, precisely how MIC data should be handled in predictive models remains unclear, since MICs are measured semi-quantitatively, with varying resolution, and are typically also left- and right-censored within varying ranges. We therefore investigated genome-based prediction of MICs in the pathogen Klebsiella pneumoniae using 4367 genomes with both simulated semi-quantitative traits and real MICs. As we were focused on clinical interpretation, we used interpretable rather than black-box machine learning models, namely Elastic Net, Random Forests, and linear mixed models. Simulated traits were generated accounting for oligogenic, polygenic, and homoplastic genetic effects with different levels of heritability. We then assessed how model prediction accuracy was affected when MIC prediction was framed as a regression problem or as a classification problem. Our results showed that treating the MICs differently depending on the number of antibiotic concentration levels available was the most promising learning strategy. Specifically, to optimise both prediction accuracy and inference of the correct causal variants, we recommend treating the MICs as continuous and framing the learning problem as a regression when the number of observed antibiotic concentration levels is large, whereas with a smaller number of concentration levels they should be treated as a categorical variable and the learning problem should be framed as a classification.
Our findings also underline how predictive models can be improved when prior biological knowledge is taken into account, because the genetic architecture differs for each antibiotic resistance trait. Finally, we emphasise that expanding the population database is pivotal for the future clinical implementation of these models to support routine machine-learning-based diagnostics.
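
The recommendation above (regression when many concentration levels are observed, classification when few) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the string format for censored values (for example "<=0.5" or ">32"), and the cut-off of 6 levels are all assumptions made for illustration, since the abstract does not state a numeric threshold.

```python
import math

# Hypothetical cut-off: the abstract gives no number for a "large" set of levels.
MIN_LEVELS_FOR_REGRESSION = 6


def parse_mic(value):
    """Split a reported MIC such as '<=0.5', '>32' or '4' into
    (log2 of the concentration, censoring flag)."""
    text = value.strip()
    censoring = "none"
    # Check two-character prefixes first so '<=' is not read as '<'.
    for prefix, flag in (("<=", "left"), (">=", "right"),
                         ("<", "left"), (">", "right")):
        if text.startswith(prefix):
            censoring = flag
            text = text[len(prefix):]
            break
    return math.log2(float(text)), censoring


def choose_framing(mic_values, min_levels=MIN_LEVELS_FOR_REGRESSION):
    """Count distinct observed log2 concentration levels and pick a framing.

    Simplification: a censored value and an exact value at the same
    concentration are counted as one level.
    """
    levels = sorted({parse_mic(v)[0] for v in mic_values})
    framing = "regression" if len(levels) >= min_levels else "classification"
    return framing, levels


if __name__ == "__main__":
    wide_panel = ["<=0.5", "1", "2", "4", "8", "16", ">32"]
    narrow_panel = ["<=1", "2", ">4", "2"]
    print(choose_framing(wide_panel))
    print(choose_framing(narrow_panel))
```

In practice the returned levels would become either a continuous target (log2 MIC) for a regressor or class labels for a classifier; how censored values are modelled is left open here, as the abstract does not specify it.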

Updated: 2024-03-27