Using Chaos-Game-Representation for Analysing the SARS-CoV-2 Lineages, Newly Emerging Strains and Recombinants
Current Genomics (IF 2.6), Pub Date: 2023-10-24, DOI: 10.2174/0113892029264990231013112156
Amarinder Singh Thind 1, 2 , Somdatta Sinha 1

Background: Viruses have high mutation rates, facilitating rapid evolution and the emergence of new species, subspecies, strains and recombinant forms. Accurate classification of these forms is crucial for understanding viral evolution and developing therapeutic applications. Phylogenetic classification is typically performed by analyzing molecular differences at the genomic and sub-genomic levels. This involves aligning homologous proteins or genes. However, there is growing interest in developing alignment-free methods for whole-genome comparisons that are computationally efficient.

Methods: Here we elaborate on the Chaos Game Representation (CGR) method, based on concepts of statistical physics and free of sequence alignment assumptions. We adopt the CGR method for classification of the closely related clades/lineages A and B of the SARS-Corona virus 2019 (SARS-CoV-2), which is one of the fastest evolving viruses.

Results: Our study shows that the CGR approach can easily yield the SARS-CoV-2 phylogeny from the available whole genomes of lineage A and lineage B sequences. It also shows an accurate classification of eight different strains and the newly evolved XBB variant from its parental strains. Compared to alignment-based methods (Neighbour-Joining and Maximum Likelihood), the CGR method requires low computational resources, is fast and accurate for long sequences, and, being a K-mer based approach, allows simultaneous comparison of a large number of closely-related sequences of different sizes. Further, we developed an R pipeline CGRphylo, available on GitHub, which integrates the CGR module with various other R packages to create phylogenetic trees and visualize them.

Conclusion: Our findings demonstrate the efficacy of the CGR method for accurate classification and tracking of rapidly evolving viruses, offering valuable insights into the evolution and emergence of new SARS-CoV-2 strains and recombinants.
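
To make the alignment-free idea concrete, the following is a minimal Python sketch of a chaos game representation and a k-mer frequency comparison. It is not the CGRphylo pipeline, which is written in R and whose internals are not described in this abstract. The corner assignment (A, C, G, T on the unit square), the function names `cgr_points`, `kmer_frequencies` and `euclidean_distance`, and the choice of Euclidean distance are all assumptions made for illustration. The sketch relies on one standard property of CGR: on a 2^k by 2^k grid over the unit square, each cell corresponds to exactly one k-mer, so counting k-mers is equivalent to counting CGR points per cell.

```python
from collections import Counter
from itertools import product
from math import sqrt

# Assumed corner layout (a common convention, not taken from the paper).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Return the CGR trajectory of a DNA sequence as a list of (x, y) points.

    Start at the centre of the unit square and, for each base, move halfway
    toward that base's corner. Characters other than A/C/G/T are skipped.
    """
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # ignore ambiguous symbols such as N
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def kmer_frequencies(seq, k):
    """Return relative frequencies of all 4**k k-mers in seq.

    K-mers containing a non-ACGT character are not counted. The result has
    a fixed set of keys, so sequences of different lengths are comparable.
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if all(base in CORNERS for base in kmer):
            counts[kmer] += 1
    total = sum(counts.values())
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {km: (counts[km] / total if total else 0.0) for km in all_kmers}


def euclidean_distance(freq_a, freq_b):
    """Euclidean distance between two k-mer frequency profiles (same k)."""
    return sqrt(sum((freq_a[km] - freq_b[km]) ** 2 for km in freq_a))


if __name__ == "__main__":
    # Toy sequences of different lengths; real genomes would be read from FASTA.
    s1 = "ACGTACGTAGCTAGCTAGGA"
    s2 = "ACGTACGTAGCTAGCTAGGATTT"
    d = euclidean_distance(kmer_frequencies(s1, 3), kmer_frequencies(s2, 3))
    print(f"k=3 profile distance: {d:.4f}")
```

Pairwise distances computed this way for a set of genomes form a distance matrix, which a tree-building method such as Neighbour-Joining could then take as input. Which distance measure and k the authors actually use is not stated in the abstract.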

Updated: 2023-10-24