Signed rearrangement distances considering repeated genes, intergenic regions, and indels

Journal of Combinatorial Optimization

Abstract

Genome rearrangement distance problems make it possible to estimate the evolutionary distance between genomes. These problems aim to compute the minimum number of mutations, called rearrangement events, necessary to transform one genome into another. Two commonly studied rearrangements are the reversal, which inverts a sequence of genes, and the transposition, which exchanges two consecutive sequences of genes. Seminal works on this topic focused on the sequence of genes and assumed that each gene occurs exactly once in each genome. More realistic models assume that a gene may have multiple copies or may appear in only one of the genomes. Other models also take into account the nucleotides between consecutive pairs of genes, which are called intergenic regions. This work combines all these generalizations, defining the signed intergenic reversal distance (SIRD), the signed intergenic reversal and transposition distance (SIRTD), the signed intergenic reversal and indels distance (SIRID), and the signed intergenic reversal, transposition, and indels distance (SIRTID) problems. We show a relation between these problems and the signed minimum common intergenic string partition (SMCISP) problem. From this relation, we derive \(\varTheta (k)\)-approximation algorithms for the SIRD and the SIRTD problems, where k is the maximum number of copies of a gene in the genomes. These algorithms also work as heuristics for the SIRID and SIRTID problems. Additionally, we present some parameterized algorithms for SMCISP that ensure constant approximation factors for the distance problems. Our experimental tests on simulated genomes show an improvement in the rearrangement distances when the partition algorithms are used.
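
To make the two rearrangement events concrete, the following is a minimal sketch that is not taken from the article. It assumes a genome is represented as a Python list of signed integers, where repeated absolute values stand for gene copies, and it leaves out intergenic region sizes and indels entirely. The function names `reversal` and `transposition` and the index conventions are illustrative assumptions.

```python
from typing import List


def reversal(genome: List[int], i: int, j: int) -> List[int]:
    """Invert the segment genome[i..j] (inclusive) and flip each gene's sign.

    Assumption: genes are signed integers; the sign encodes orientation.
    """
    if not 0 <= i <= j < len(genome):
        raise ValueError("need 0 <= i <= j < len(genome)")
    inverted = [-gene for gene in reversed(genome[i:j + 1])]
    return genome[:i] + inverted + genome[j + 1:]


def transposition(genome: List[int], i: int, j: int, k: int) -> List[int]:
    """Exchange the adjacent segments genome[i:j] and genome[j:k].

    Orientations are unchanged; only the order of the two blocks swaps.
    """
    if not 0 <= i < j < k <= len(genome):
        raise ValueError("need 0 <= i < j < k <= len(genome)")
    return genome[:i] + genome[j:k] + genome[i:j] + genome[k:]


# Hypothetical genome in which gene 2 has two copies.
genome = [1, 2, -3, 2, 4]

after_reversal = reversal(genome, 1, 2)
print(after_reversal)       # [1, 3, -2, 2, 4]

after_transposition = transposition(genome, 0, 2, 4)
print(after_transposition)  # [-3, 2, 1, 2, 4]
```

Both functions return a new list rather than modifying the input, which keeps intermediate genomes available when exploring sequences of events. A model in the spirit of the abstract would additionally track the number of nucleotides in each intergenic region, which this sketch does not attempt.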

Data Availability

Enquiries about data availability should be directed to the authors.

Notes

  1. https://github.com/compbiogroup/Signed-Rearrangement-Distances-Considering-Repeated-Genes-Intergenic-Regions-and-Indels.

Acknowledgements

This work was supported by the National Council of Technological and Scientific Development, CNPq (grant 202292/2020-7), the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior—Brasil (CAPES)—Finance Code 001, and the São Paulo Research Foundation, FAPESP (grants 2013/08293-7, 2015/11937-9, 2017/12646-3, and 2021/13824-8).

Funding

The authors have not disclosed any funding.

Author information

Corresponding author

Correspondence to Gabriel Siqueira.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

A preliminary version of this work appeared in the Proceedings of the 14th International Conference on Bioinformatics and Computational Biology (BICoB 2022).

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Siqueira, G., Alexandrino, A.O. & Dias, Z. Signed rearrangement distances considering repeated genes, intergenic regions, and indels. J Comb Optim 46, 16 (2023). https://doi.org/10.1007/s10878-023-01083-w

  • DOI: https://doi.org/10.1007/s10878-023-01083-w
