Microbiome depiction through user-adapted bioinformatic pipelines and parameters
Journal of Medical Microbiology (IF 3), Pub Date: 2023-10-12, DOI: 10.1099/jmm.0.001756
Eric I Nayman 1,2; Brooke A Schwartz 1,2; Fantaysia C Polanco 2; Alexandra K Firek 3; Alayna C Gumabong 1; Nolan J Hofstee 1; Giri Narasimhan 2,4; Trevor Cickovski 2; Kalai Mathee 1,4

Introduction. The role of the microbiome in health and disease is increasingly recognized. However, the bioinformatic protocols used to analyse genomic data vary considerably. This variability has, in part, impeded the incorporation of microbiomics into the clinical setting and has challenged interstudy reproducibility. In microbial compositional analysis, there is growing recognition of the need to move away from a one-size-fits-all approach to data processing.

Gap Statement. Few evidence-based recommendations exist for setting the parameters of programs that infer microbiota community profiles, even though these parameters significantly affect the accuracy of taxonomic inference.

Aim. To compare three commonly used programs (DADA2, QIIME2, and mothur) and optimize them into four user-adapted pipelines for processing paired-end amplicon reads. We aim to increase the accuracy of compositional inference and help standardize microbiomic protocols.

Methods. Two key parameters were isolated across the four pipelines: filtering sequence reads based on a whole-number error threshold (maxEE) and truncating read ends based on a quality score threshold (QTrim). Closeness of sample inference was then evaluated using a mock community of known composition.

Results. The amount of raw genomic data lost was proportional to how stringently the parameters were set, and exactly how much was lost varied by pipeline. Accuracy of sample inference correlated with increased sequence read retention. Falsely detected taxa and unaccounted-for microbial constituents were unique to each pipeline and parameter setting. Implementing optimized parameter values led to a better approximation of the known mock community.

Conclusions. Microbial compositions generated from the 16S rRNA marker gene should be interpreted with caution. To improve microbial community profiling, bioinformatic protocols must be user-adapted. Analysis should be performed with consideration for the selected target amplicon, the pipelines and parameters used, and the taxa of interest.
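
To make the two parameters in the Methods concrete, here is a minimal Python sketch of how an expected-error filter and a quality-based truncation are commonly defined. It assumes DADA2-style conventions: expected errors are the sum of 10^(-Q/10) over the Phred scores of a read, and a read is cut at the first base whose quality is at or below the threshold. The abstract does not state how each of the four pipelines implements maxEE or QTrim, so the function names, the toy read, and the order of operations (truncate first, then filter) are illustrative assumptions, not the authors' code.

```python
from typing import List, Optional, Tuple


def expected_errors(quals: List[int]) -> float:
    """Sum of per-base error probabilities implied by Phred quality scores."""
    return sum(10 ** (-q / 10) for q in quals)


def truncate_at_quality(
    seq: str, quals: List[int], q_trim: int
) -> Tuple[str, List[int]]:
    """Cut the read at the first base whose quality is <= q_trim (assumed convention)."""
    for i, q in enumerate(quals):
        if q <= q_trim:
            return seq[:i], quals[:i]
    return seq, quals


def filter_read(
    seq: str, quals: List[int], q_trim: int, max_ee: float
) -> Optional[Tuple[str, List[int]]]:
    """Truncate by quality, then discard the read if its expected errors exceed max_ee."""
    seq, quals = truncate_at_quality(seq, quals, q_trim)
    if not seq or expected_errors(quals) > max_ee:
        return None  # read is lost from downstream inference
    return seq, quals


# Hypothetical 8-base read whose quality decays toward the 3' end.
read_seq = "ACGTACGT"
read_quals = [30, 30, 30, 30, 20, 20, 10, 2]

lenient = filter_read(read_seq, read_quals, q_trim=2, max_ee=1)
stricter_trim = filter_read(read_seq, read_quals, q_trim=10, max_ee=1)
strictest = filter_read(read_seq, read_quals, q_trim=10, max_ee=0)
```

With these toy values, the lenient setting keeps seven bases, the stricter truncation threshold keeps six, and setting the error threshold to zero discards the read entirely. This mirrors, in miniature, the reported pattern that data loss scales with parameter stringency.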

Updated: 2023-10-13