SARS-CoV-2 lineage assignments using phylogenetic placement/UShER are superior to pangoLEARN machine learning method
Virus Evolution (IF 5.3), Pub Date: 2024-01-13, DOI: 10.1093/ve/vead085
Adriano de Bernardi Schneider 1,2, Michelle Su 3, Angie S Hinrichs 1, Jade Wang 3, Helly Amin 3, John Bell 4, Debra A Wadford 4, Áine O’Toole 5, Emily Scher 5, Marc D Perry 1, Yatish Turakhia 6, Nicola De Maio 7, Scott Hughes 3, Russ Corbett-Detig 1,2

With the rapid spread and evolution of SARS-CoV-2, the ability to monitor its transmission and distinguish among viral lineages is critical for pandemic response efforts. The most commonly used software for the lineage assignment of newly isolated SARS-CoV-2 genomes is pangolin, which offers two methods of assignment, pangoLEARN and pUShER. PangoLEARN rapidly assigns lineages using a machine learning algorithm, while pUShER performs a phylogenetic placement to identify the lineage corresponding to a newly sequenced genome. In a preliminary study, we observed that pangoLEARN (decision tree model), while substantially faster than pUShER, offered less consistency across different versions of pangolin v3. Here, we expand upon this analysis to include v3 and v4 of pangolin, which moved the default algorithm for lineage assignment from pangoLEARN in v3 to pUShER in v4, and perform a thorough analysis confirming that pUShER is not only more stable across versions but also more accurate. Our findings suggest that future lineage assignment algorithms for various pathogens should consider the value of phylogenetic placement.
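
The cross-version consistency described above can be illustrated with a small sketch. This is not the authors' analysis pipeline: it only shows one simple way to measure how often two lineage-assignment runs agree on the same sequences. It assumes each run is a CSV report with `taxon` and `lineage` columns; the function names `read_assignments` and `consistency` are hypothetical, and the sequence names and lineage calls in the toy data are invented for illustration.

```python
import csv
import io


def read_assignments(handle):
    """Map each sequence name to its assigned lineage.

    Assumes a CSV with 'taxon' and 'lineage' columns (an assumption
    about the report layout, not something stated in the abstract).
    """
    return {row["taxon"]: row["lineage"] for row in csv.DictReader(handle)}


def consistency(a, b):
    """Fraction of shared sequences given the same lineage in both runs."""
    shared = a.keys() & b.keys()
    if not shared:
        return 0.0
    same = sum(1 for name in shared if a[name] == b[name])
    return same / len(shared)


# Toy reports standing in for two software versions (invented data).
run_old = io.StringIO(
    "taxon,lineage\nseq1,B.1.1.7\nseq2,B.1.617.2\nseq3,BA.1\nseq4,B.1\n"
)
run_new = io.StringIO(
    "taxon,lineage\nseq1,B.1.1.7\nseq2,AY.4\nseq3,BA.1\nseq4,B.1\n"
)

old = read_assignments(run_old)
new = read_assignments(run_new)
score = consistency(old, new)
print(f"Agreement across versions: {score:.2f}")
```

Exact string matching is a deliberately strict choice here. A real comparison might also want to treat a reassignment to a descendant lineage differently from a move to an unrelated one, which this sketch does not attempt.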

Updated: 2024-01-13