Choice of Metric Divergence in Genome Sequence Comparison
The Protein Journal (IF 3), Pub Date: 2024-03-16, DOI: 10.1007/s10930-024-10189-x
Soumen Ghosh, Jayanta Pal, Bansibadan Maji, Carlo Cattani, Dilip Kumar Bhattacharya

The paper introduces a novel probability descriptor for genome sequence comparison and pairs it with a generalized form of the Jensen-Shannon divergence. The divergence belongs to a one-parameter family of metrics whose parameter is a fraction no greater than one half. Using this metric as the distance measure, a distance matrix is computed over the new probability descriptor, and phylogenetic trees are built from it with the neighbor-joining method. The parameter is first set to one half for various species. To assess the effect of varying the parameter, trees are then drawn at three values (one half, one fourth, one eighth). The measurement scale decreases as the parameter increases, and a lower scale value corresponds to higher similarity accuracy, so the highest accuracy is obtained at the maximum parameter value of one half. Comparisons with previous methods, evaluated via Symmetric Distance (SD) values and rationalized perception, consistently favor the present approach, with the results at the maximum parameter value being the most accurate.
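
The abstract gives neither the formula for the one-parameter family nor the definition of the probability descriptor. As a minimal sketch, assume the family is the classical Jensen-Shannon divergence (in bits) raised to a power `alpha` in (0, 1/2]. This reading fits the "fractions up to a maximum value of half" wording, because the square root of the Jensen-Shannon divergence is a known metric and raising a metric to a power in (0, 1] keeps it a metric, but it is an assumption and may not be the authors' exact definition. The k-mer frequency vector is a hypothetical stand-in for the paper's descriptor, and all function names are illustrative.

```python
import math
from collections import Counter
from itertools import product


def kmer_distribution(seq, k=2):
    """Relative frequencies of all 4**k DNA k-mers.

    Hypothetical stand-in for the paper's probability descriptor.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    if total == 0:
        raise ValueError("sequence contains no valid k-mers")
    return [counts[m] / total for m in kmers]


def jsd(p, q):
    """Classical Jensen-Shannon divergence in bits, so 0 <= JSD <= 1."""
    def kl(a, b):
        # Terms with a_i == 0 contribute nothing to the KL sum.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def jsd_power_distance(p, q, alpha=0.5):
    """Assumed one-parameter family: JSD ** alpha with 0 < alpha <= 1/2."""
    if not 0 < alpha <= 0.5:
        raise ValueError("alpha must lie in (0, 1/2]")
    # Clamp tiny negative values caused by floating-point rounding.
    return max(jsd(p, q), 0.0) ** alpha


def distance_matrix(seqs, alpha=0.5, k=2):
    """Pairwise distances between sequences, ready for tree building."""
    dists = [kmer_distribution(s, k) for s in seqs]
    return [[jsd_power_distance(a, b, alpha) for b in dists] for a in dists]


if __name__ == "__main__":
    toy = ["ACGTACGTAA", "AAAAACCCGT", "GGGTTTACGT"]  # made-up sequences
    for alpha in (0.5, 0.25, 0.125):
        print(alpha, distance_matrix(toy, alpha=alpha, k=1))
```

Under this assumption every distance lies in [0, 1], and for a fixed pair of sequences the distance shrinks as `alpha` grows, which matches the abstract's remark that the measurement scale decreases as the parameter increases. The resulting matrix could then be passed to any neighbor-joining implementation to draw the tree.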




Updated: 2024-03-16