Biological computation and computational biology: survey, challenges, and discussion

  • Published in: Artificial Intelligence Review

Abstract

Biological computation involves the design and development of computational techniques inspired by natural biota. Computational biology, in contrast, involves the development and application of computational techniques to study biological systems. We present a comprehensive review showcasing how biology and computer science can guide and benefit each other, resulting in an improved understanding of biological processes and, at the same time, advances in the design of algorithms. Unfortunately, integration between biology and computer science is often challenging, especially because of the cultural idiosyncrasies of the two communities. In this study, we aim to highlight how nature has inspired the development of various algorithms and techniques in computer science, and how computational techniques and mathematical modeling have helped to better understand various fields in biology. We identify existing gaps between biological computation and computational biology and advocate for bridging these gaps between “wet” and “dry” research.

Notes

  1. https://www.asimovinstitute.org/blog/.

  2. https://www.asimovinstitute.org/author/fjodorvanveen/.

  3. A jumping library is a set of mate-pair reads derived from long fragments of DNA. A mate-pair read is a pair of sequence reads from a single fragment of DNA (the distance between the two reads is often approximately known).

  4. An allele is a variant form of a given gene. Individuals who are heterozygous for a certain gene carry two different alleles.

  5. Mechanisms of breakpoint origination remain a mystery, but the idea of genes playing the role of “solid” regions can be explained by the fact that their functionality, if lost, may result in the death of the cell; a cell whose genome has such “broken” genes would therefore most likely not reproduce.

  6. Most of the figures and some of the materials presented in this section are taken from Bayzid (2016).

  7. https://hadoop.apache.org/.

  8. https://storm.apache.org/.

  9. https://samza.apache.org/.

  10. https://spark.apache.org/.

  11. https://flink.apache.org/.

  12. https://blast.ncbi.nlm.nih.gov/Blast.cgi.

  13. https://beast.community/.

  14. https://www.megasoftware.net/.

  15. https://cme.h-its.org/exelixis/software.html/.

  16. https://wolbachia.biology.virginia.edu/WuLab/Software.html.

  17. https://www.biomolecular-modeling.com/Ascalaph/.

  18. https://www.dnastar.com/.

  19. https://www2.decipher.codes/.

  20. https://meme-suite.org/.

  21. https://kappalanguage.org/.

  22. https://cellsignaling.lanl.gov/bionetgen/.

  23. https://www.biospice.org/.

  24. https://www.insilico-biotechnology.com/discovery_en.html.

  25. https://www.biorxiv.org/.

  26. https://arxiv.org.

  27. https://ncatlab.org/nlab/show/HomePage.

  28. https://oeis.org/.

  29. https://git-scm.com/.

  30. https://fossil-scm.org/.

  31. https://veracity-scm.com/.

  32. https://www.mercurial-scm.org/.

  33. https://www.monotone.ca/.

  34. https://bioperl.org/.

  35. https://biopython.org/.

  36. https://www.open-bio.org/wiki/BOSC.

  37. https://www.open-bio.org/.

  38. https://bioconda.github.io/.

  39. https://snakemake.readthedocs.io.

References

  • Aganezov S, Sitdykova N, Alekseyev MA, Consortium A et al (2015) Scaffold assembly based on genome rearrangement analysis. Comput Biol Chem 57:46–53

  • Aganezov S, Sitdykova N, Alekseyev MA (2015) Scaffold assembly based on genome rearrangement analysis. Computational Biology and Chemistry 57:46–53. https://doi.org/10.1016/j.compbiolchem.2015.02.005. https://www.sciencedirect.com/science/article/pii/S1476927115000225. 13th Asia Pacific bioinformatics conference, HsinChu, Taiwan, 21-23 January 2015

  • Aickelin U, Dasgupta D (2005) Artificial immune systems. In: search methodologies, pp. 375–399. Springer. https://link.springer.com/chapter/10.1007/0-387-28356-0_13

  • Alba E (2006) Parallel evolutionary computations. Springer, Berlin

  • Alberts B, Johnson A, Lewis J, Raff M, Roberts K, Walter P (2002) Integrins. In: Molecular biology of the cell. 4th edn. Garland Science. https://www.ncbi.nlm.nih.gov/books/NBK26867/

  • Alekseyev MA, Pevzner PA (2007) Whole genome duplications, multi-break rearrangements, and genome halving problem. In: Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2007), pp. 665–679. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA

  • Alekseyev MA, Pevzner PA (2008) Multi-break rearrangements and chromosomal evolution. Theor Comput Sci 395(2):193–202. https://doi.org/10.1016/j.tcs.2008.01.013

  • Alekseyev MA, Pevzner PA (2009) Breakpoint graphs and ancestral genome reconstructions. Genome Res 19(5):943–957

  • Alexeev N, Alekseyev MA (2017) Estimation of the true evolutionary distance under the fragile breakage model. BMC Genomics 18(4):356. https://doi.org/10.1186/s12864-017-3733-3

  • Alic AS, Ruzafa D, Dopazo J, Blanquer I (2016) Objective review of de novo stand-alone error correction methods for NGS data. Wiley Interdiscip Rev Comput Mol Sci 6(2):111–146

  • Ané C, Larget B, Baum DA, Smith SD, Rokas A (2007) Bayesian estimation of concordance among gene trees. Mol Biol Evol 24:412–426

  • Angermueller C, Pärnamaa T, Parts L, Stegle O (2016) Deep learning for computational biology. Mol Syst Biol 12(7):878

  • Angermueller C, Pärnamaa T, Parts L, Stegle O (2016a) Deep learning for computational biology. Mol Syst Biol 12(7):878

  • Anselmetti Y, Luhmann N, Bérard S, Tannier E, Chauve C (2018) Comparative methods for reconstructing ancient genome organization. Springer, New York, pp 343–362. https://doi.org/10.1007/978-1-4939-7463-4_13

  • Avdeyev P, Jiang S, Aganezov S, Hu F, Alekseyev MA (2016) Reconstruction of ancestral genomes in presence of gene gain and loss. J Comput Biol 23(3):150–164. https://doi.org/10.1089/cmb.2015.0160

  • Avdeyev P, Alexeev N, Rong Y, Alekseyev MA (2017) A unified ILP framework for genome median, halving, and aliquoting problems under DCJ. In: Meidanis J, Nakhleh L (eds.) Proceedings of 15th international workshop on comparative genomics (RECOMB-CG), lecture notes in computer science, vol. 10562, pp 156–178

  • Bafna V, Pevzner P (1996) Genome rearrangements and sorting by reversals. SIAM J Comput 25(2):272–289. https://doi.org/10.1137/S0097539793250627

  • Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, Lesin VM, Nikolenko SI, Pham S, Prjibelski AD et al (2012) SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol 19(5):455–477

  • Bao E, Jiang T, Girke T (2014) AlignGraph: algorithm for secondary de novo genome assembly guided by closely related references. Bioinformatics 30(12):i319–i328. https://doi.org/10.1093/bioinformatics/btu291

  • Bartels D, Kespohl S, Albaum S, Drüke T, Goesmann A, Herold J, Kaiser O, Pühler A, Pfeiffer F, Raddatz G et al (2004) BACCardI: a tool for the validation of genomic assemblies, assisting genome finishing and intergenome comparison. Bioinformatics 21(7):853–859

  • Bartocci E, Lió P (2016) Computational modeling, formal analysis, and tools for systems biology. PLoS Comput Biol 12(1):e1004591

  • Bashir A, Klammer AA, Robins WP, Chin CS, Webster D, Paxinos E, Hsu D, Ashby M, Wang S, Peluso P et al (2012) A hybrid approach for the automated finishing of bacterial genomes. Nat Biotechnol 30(7):701–707

  • Bayzid MS (2016) Estimating species trees from gene trees despite gene tree incongruence under realistic model conditions. Ph.D. thesis

  • Bayzid MS, Warnow T (2013) Naive binning improves phylogenomic analyses. Bioinformatics 29(18):2277–2284

  • Bayzid MS, Warnow T (2018) Gene tree parsimony for incomplete gene trees: addressing true biological loss. Algorithms Mol Biol 13:1

  • Bayzid MS, Mirarab S, Warnow T (2013) Inferring optimal species trees under gene duplication and loss. Proc Pac Symp Biocomput 18:250–261

  • Beller T, Ohlebusch E (2015) Efficient construction of a compressed de Bruijn graph for pan-genome analysis. In: Annual symposium on combinatorial pattern matching. Springer, pp 40–51

  • Aarts E, Korst J (1989) Simulated annealing and Boltzmann machines: a stochastic approach to combinatorial optimization and neural computing. John Wiley & Sons, Inc. https://dl.acm.org/doi/abs/10.5555/61990

  • Ben-Bassat I, Chor B (2014) String graph construction using incremental hashing. Bioinformatics 30(24):3515–3523

  • Bergeron A, Mixtacki J, Stoye J (2006) A unifying view of genome rearrangements. In: International Workshop on Algorithms in Bioinformatics. Springer, pp 163–173

  • Berlin K, Koren S, Chin CS, Drake JP, Landolin JM, Phillippy AM (2015) Assembling large genomes with single-molecule sequencing and locality-sensitive hashing. Nat Biotechnol 33(6):623

  • Bickhart DM, Rosen BD, Koren S, Sayre BL, Hastie AR, Chan S, Lee J, Lam ET, Liachko I, Sullivan ST et al (2017) Single-molecule sequencing and chromatin conformation capture enable de novo reference assembly of the domestic goat genome. Nat Genet 49(4):643

  • Biller P, Guéguen L, Knibbe C, Tannier E (2016) Breaking good: accounting for fragility of genomic regions in rearrangement distance estimation. Genome Biol Evol 8(5):1427–1439. https://doi.org/10.1093/gbe/evw083

  • Bitam S, Batouche M, Talbi EG (2010) A survey on bee colony algorithms. In: Parallel & Distributed Processing, Workshops and PhD Forum (IPDPSW), 2010 IEEE international symposium on. IEEE, pp 1–8

  • Boetzer M, Pirovano W (2014) SSPACE-LongRead: scaffolding bacterial draft genomes using long read sequence information. BMC Bioinform 15:211

  • Boetzer M, Henkel CV, Jansen HJ, Butler D, Pirovano W (2011) Scaffolding pre-assembled contigs using SSPACE. Bioinformatics 27(4):578–579

  • Bonabeau E, Dorigo M, Theraulaz G (1999) Swarm intelligence: from natural to artificial systems. Oxford University Press, Oxford

  • Bosi E, Donati B, Galardini M, Brunetti S, Sagot MF, Lió P, Crescenzi P, Fani R, Fondi M (2015) Medusa: a multi-draft based scaffolder. Bioinformatics 31(15):2443–2451

  • Bourlard H, Kamp Y (1988) Auto-association by multilayer perceptrons and singular value decomposition. Biol Cybern 59(4–5):291–294

  • Bourque G, Pevzner PA (2002) Genome-scale evolution: reconstructing gene orders in the ancestral species. Genome Res 12(1):26–36

  • Boussau B, Szöllősi GJ, Duret L, Gouy M, Tannier E, Daubin V (2013) Genome-scale coestimation of species and gene trees. Genome Res 23(2):323–330

  • Boutillier P, Maasha M, Li X, Medina-Abarca HF, Krivine J, Feret J, Cristescu I, Forbes AG, Fontana W (2018) The kappa platform for rule-based modeling. Bioinformatics 34(13):i583–i592

  • Braga MD, Stoye J (2010) The solution space of sorting by DCJ. J Comput Biol 17(9):1145–1165

  • Broomhead DS, Lowe D (1988) Radial basis functions, multi-variable functional interpolation and adaptive networks. Tech. rep, Royal Signals and Radar Establishment Malvern (United Kingdom)

  • Buniello A, MacArthur JAL, Cerezo M, Harris LW, Hayhurst J, Malangone C, McMahon A, Morales J, Mountjoy E, Sollis E et al (2019) The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucl Acids Res 47(D1):D1005–D1012

  • Burnet FM (1959) The clonal selection theory of acquired immunity. Vanderbilt University Press, Nashville

  • Burton JN, Adey A, Patwardhan RP, Qiu R, Kitzman JO, Shendure J (2013) Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions. Nat Biotechnol 31(12):1119–1125

  • Bush WS, Moore JH (2012) Genome-wide association studies. PLoS Comput Biol 8(12):e1002822

  • Bush RM, Bender CA, Subbarao K, Cox NJ, Fitch WM (1999) Predicting the evolution of human influenza A. Science 286(5446):1921–1925

  • Butler J, MacCallum I, Kleber M, Shlyakhter IA, Belmonte MK, Lander ES, Nusbaum C, Jaffe DB (2008) ALLPATHS: de novo assembly of whole-genome shotgun microreads. Genome Res 18(5):810–820

  • Cao X, Qiao H, Xu Y (2007) Negative selection based immune optimization. Adv Eng Softw 38(10):649–656

  • Cazaux B, Lecroq T, Rivals E (2014) From indexing data structures to de Bruijn graphs. In: Symposium on combinatorial pattern matching, pp. 89–99. Springer

  • Chaisson MJ, Pevzner PA (2007) Short read fragment assembly of bacterial genomes. Genome Res 18(2):324–330

  • Chambers LD (2000) The practical handbook of genetic algorithms: applications. Chapman and Hall/CRC, Boca Raton

  • Chaudhary R, Bansal MS, Wehe A, Fernández-Baca D, Eulenstein O (2010) iGTP: a software package for large-scale gene tree parsimony analysis. BMC Bioinform 11(1):574

  • Chauve C, Tannier E (2008) A methodological framework for the reconstruction of contiguous regions of ancestral genomes and its application to mammalian genomes. PLoS Comput Biol 4(11):e1000234

  • Chauve C, Gavranovic H, Ouangraoua A, Tannier E (2010) Yeast ancestral genome reconstructions: the possibilities of computational methods II. J Comput Biol 17(9):1097–1112

  • Chauve C, Ponty Y, Zanetti JPP (2015) Evolution of genes neighborhood within reconciled phylogenies: an ensemble approach. BMC Bioinform 16(19):S6

  • Chelly Z, Elouedi Z (2016) A survey of the dendritic cell algorithm. Knowl Inf Syst 48(3):505–535

  • Ching T, Himmelstein DS, Beaulieu-Jones BK, Kalinin AA, Do BT, Way GP, Ferrero E, Agapow PM, Zietz M, Hoffman MM et al (2018) Opportunities and obstacles for deep learning in biology and medicine. J R Soc Interface 15(141):20170387

  • Chung J, Gulcehre C, Cho K, Bengio Y (2014) Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555

  • Clerc M (2010) Particle swarm optimization. Wiley, New Jersey

  • Coello CAC, Lamont GB (2004) Applications of multi-objective evolutionary algorithms. World Scientific, Chennai

  • Collins FS, Varmus H (2015) A new initiative on precision medicine. N Engl J Med 372(9):793–795

  • Compeau P, Pevzner P (2018) Bioinformatics algorithms: an active learning approach. Active Learning Publishers, La Jolla

  • International Human Genome Sequencing Consortium (2001) Initial sequencing and analysis of the human genome. Nature 409(6822):860

  • International HapMap Consortium (2003) The International HapMap Project. Nature 426(6968):789

  • International HapMap Consortium (2005) A haplotype map of the human genome. Nature 437(7063):1299

  • Wellcome Trust Case Control Consortium (2007) Genome-wide association study of 14,000 cases of seven common diseases and 3000 shared controls. Nature 447(7145):661

  • 1000 Genomes Project Consortium (2010) A map of human genome variation from population-scale sequencing. Nature 467(7319):1061

  • 1000 Genomes Project Consortium (2012) An integrated map of genetic variation from 1092 human genomes. Nature 491(7422):56

  • 1000 Genomes Project Consortium (2015) A global reference for human genetic variation. Nature 526(7571):68

  • Conway TC, Bromage AJ (2011) Succinct data structures for assembling large genomes. Bioinformatics 27(4):479–486

  • Crisp MD, Trewick SA, Cook LG (2011) Hypothesis testing in biogeography. Trends Ecol Evol 26(2):66–72

  • Dagdia ZC (2018) A distributed dendritic cell algorithm for big data. In: Proceedings of the genetic and evolutionary computation conference companion, pp. 103–104

  • Dagdia ZC (2019) A scalable and distributed dendritic cell algorithm for big data classification. Swarm Evolut Comput 50:100432

  • Dalke K (2003) In court, scientists map a murder weapon. Genome News Network. https://www.genomenewsnetwork.org/articles/01_03/hiv.shtml

  • Darwin C (2004) On the origin of species, 1859. Routledge, Abingdon

  • Dasgupta D, Michalewicz Z (2013) Evolutionary algorithms in engineering applications. Springer Science & Business Media, Heidelberg

  • Dayarian A, Michael TP, Sengupta AM (2010) SOPRA: scaffolding algorithm for paired reads via statistical optimization. BMC Bioinform 11:345

  • De Castro LN, Timmis J (2002) Artificial immune systems: a new computational intelligence approach. Springer Science & Business Media, Berlin

  • De Jong K (2005) Genetic algorithms: a 30 year perspective. In: Perspectives on adaptation in natural and artificial systems, vol 11. https://books.google.fr/books?hl=en&lr=&id=Ipqoj6mUDnQC&oi=fnd&pg=PA11&dq=Genetic+algorithms:+a+30+year+perspective&ots=F2aEsfUKXR&sig=q6G5hak0kUFBQpx_D8HTqmYLW0&redir_esc=y

  • De Jong KA, Spears WM (1990) An analysis of the interacting roles of population size and crossover in genetic algorithms. In: International conference on parallel problem solving from nature. Springer, pp. 38–47

  • Deb K (2001) Multi-objective optimization using evolutionary algorithms. Wiley, New Jersey

  • DeGiorgio M, Degnan JH (2010) Fast and consistent estimation of species trees using supermatrix rooted triples. Mol Biol Evol 27(3):552–569

  • Degnan JH, Salter LA (2005) Gene tree distributions under the coalescent process. Evolution: International Journal of Organic Evolution 59(1):24–37. https://view.ncbi.nlm.nih.gov/pubmed/15792224

  • Degnan JH, Rosenberg NA (2006) Discordance of species trees with their most likely gene trees. PLoS Genet 2:762–768

  • Dinh H, Rajasekaran S (2011) A memory-efficient data structure representing exact-match overlap graphs with application for next-generation DNA assembly. Bioinformatics 27(14):1901–1907

  • Dobzhansky T (2013) Nothing in biology makes sense except in the light of evolution. Am Biol Teach 75(2):87–91

  • Dobzhansky T, Sturtevant AH (1938) Inversions in the chromosomes of Drosophila pseudoobscura. Genetics 23(1):28

  • Dole M, Mack LL, Hines RL, Mobley RC, Ferguson LD, Alice MB (1968) Molecular beams of macroions. J Chem Phys 49(5):2240–2249. https://doi.org/10.1063/1.1670391

  • Dorigo M, Di Caro G (1999) Ant colony optimization: a new meta-heuristic. In: Evolutionary computation, 1999. CEC 99. Proceedings of the 1999 congress on, IEEE. vol. 2, pp. 1470–1477

  • Dorigo M, Stützle T (2003) The ant colony optimization metaheuristic: algorithms, applications, and advances. In: Handbook of metaheuristics, pp. 250–285. Springer

  • Drummond AJ, Rambaut A (2007) BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evolut Biol 7(1):214

  • Eberbach E (2005) Toward a theory of evolutionary computation. BioSystems 82(1):1–19

  • Edgar RC (2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucl Acids Res 32(5):1792–1797

  • Edman P, Begg G (1967) A protein sequenator. Eur J Biochem 1(1):80–91

  • Edwards SV, Liu L, Pearl DK (2007) High-resolution species trees without concatenation. Proc Natl Acad Sci 104(14):5936–5941

  • Eiben AE, Smith JE et al (2003) Introduction to evolutionary computing. Springer, Berlin

  • Ellis LL, Huang W, Quinn AM, Ahuja A, Alfrejd B, Gomez FE, Hjelmen CE, Moore KL, Mackay TF, Johnston JS et al (2014) Intrapopulation genome size variation in D. melanogaster reflects life history variation and plasticity. PLoS Genet 10(7):e1004522

  • Elman JL (1990) Finding structure in time. Cognit Sci 14(2):179–211

  • El-Metwally S, Hamza T, Zakaria M, Helmy M (2013) Next-generation sequence assembly: four stages of data processing and computational challenges. PLoS Comput Biol 9(12):e1003345

  • Eusuff M, Lansey K, Pasha F (2006) Shuffled frog-leaping algorithm: a memetic meta-heuristic for discrete optimization. Eng Optim 38(2):129–154

  • Fadista J, Manning AK, Florez JC, Groop L (2016) The (in)famous GWAS p-value threshold revisited and updated for low-frequency variants. Eur J Human Genet 24(8):1202–1205

  • Fang C, Shang Y, Xu D (2018) MUFOLD-SS: new deep inception-inside-inception networks for protein secondary structure prediction. Proteins Struct Funct Bioinform 86(5):592–598

  • Feijão P (2015) Reconstruction of ancestral gene orders using intermediate genomes. BMC Bioinform 16(Suppl 14):S3

  • Feijão P, Araujo E (2016) Fast ancestral gene order reconstruction of genomes with unequal gene content. BMC Bioinform 17(14):413

  • Feijão P, Meidanis J (2009) SCJ: a variant of breakpoint distance for which sorting, genome median and genome halving problems are easy. In: Salzberg SL, Warnow T (eds) Algorithms in bioinformatics. Springer, Heidelberg, pp 85–96

  • Feijão P, Meidanis J (2011) SCJ: a breakpoint-like distance that simplifies several rearrangement problems. IEEE/ACM Trans Comput Biol Bioinform 8(5):1318–1329. https://doi.org/10.1109/TCBB.2011.34

  • Feng B, Lin Y, Zhou L, Guo Y, Friedman R, Xia R, Hu F, Liu C, Tang J (2017) Reconstructing yeasts phylogenies and ancestors from whole genome data. Sci Rep 7(1):15209

  • Fenn JB, Mann M, Meng CK, Wong SF, Whitehouse CM (1989) Electrospray ionization for mass spectrometry of large biomolecules. Science 246(4926):64–71

  • Fertin G, Labarre A, Rusu I, Vialette S, Tannier E (2009) Combinatorics of genome rearrangements. MIT press, Cambridge

  • Fisher J, Henzinger TA (2007) Executable cell biology. Nat Biotechnol 25(11):1239

  • Fogel LJ, Owens AJ, Walsh MJ (1966) Artificial intelligence through simulated evolution. Wiley. https://cds.cern.ch/record/107769

  • Freedman ML, Reich D, Penney KL, McDonald GJ, Mignault AA, Patterson N, Gabriel SB, Topol EJ, Smoller JW, Pato CN et al (2004) Assessing the impact of population stratification on genetic association studies. Nat Genet 36(4):388–393

  • Gagnon Y, Blanchette M, El-Mabrouk N (2012) A flexible ancestral genome reconstruction method based on gapped adjacencies. In: BMC bioinformatics, vol 13. Springer, p S4. https://link.springer.com/article/10.1186/1471-2105-13-S19-S4

  • Galdzicki M, Clancy KP, Oberortner E, Pocock M, Quinn JY, Rodriguez CA, Roehner N, Wilson ML, Adam L, Anderson JC et al (2014) The synthetic biology open language (SBOL) provides a community standard for communicating designs in synthetic biology. Nat Biotechnol 32(6):545

  • Gandomi AH, Yang XS, Alavi AH (2013) Cuckoo search algorithm: a metaheuristic approach to solve structural optimization problems. Eng Comput 29(1):17–35

  • Gaul É, Blanchette M (2006) Ordering partially assembled genomes using gene arrangements. In: RECOMB workshop on comparative genomics, pp. 113–128. Springer

  • Gavranović H, Chauve C, Salse J, Tannier E (2011) Mapping ancestral genomes with massive gene loss: a matrix sandwich problem. Bioinformatics 27(13):i257–i265

  • Gawehn E, Hiss JA, Schneider G (2016) Deep learning in drug discovery. Mol Inf 35(1):3–14

  • Ghurye J, Pop M, Koren S, Bickhart D, Chin CS (2017) Scaffolding of long read assemblies using long range contact information. BMC Genom 18(1):527

  • Gibbs RA (2020) The human genome project changed everything. Nat Rev Genet 21(10):575–576

  • Gogarten SM, Bhangale T, Conomos MP, Laurie CA, McHugh CP, Painter I, Zheng X, Crosslin DR, Levine D, Lumley T et al (2012) GWASTools: an R/Bioconductor package for quality control and analysis of genome-wide association studies. Bioinformatics 28(24):3329–3331

  • Gonnella G, Kurtz S (2012) Readjoiner: a fast and memory efficient string graph-based sequence assembler. BMC Bioinform 13(1):82

  • González FA, Dasgupta D (2003) Anomaly detection using real-valued negative selection. Genet Progr Evolv Mach 4(4):383–403

  • Goodfellow I, Bengio Y, Courville A (2016) Deep learning. MIT Press, Cambridge

  • Goodman M, Czelusniak J, Moore G, Romero-Herrera E, Matsuda G (1979) Fitting the gene lineage into its species lineage: a parsimony strategy illustrated by cladograms constructed from globin sequences. Systematic Zool 28(2):132–163

  • Goodwin BC (1982) Development and evolution. J Theor Biol 97(1):43–55

  • Goodwin S, Gurtowski J, Ethe-Sayers S, Deshpande P, Schatz MC, McCombie WR (2015) Oxford nanopore sequencing, hybrid error correction, and de novo assembly of a eukaryotic genome. Genome Res 25(11):1750–1756

  • Goodwin S, McPherson JD, McCombie WR (2016) Coming of age: ten years of next-generation sequencing technologies. Nat Rev Genet 17(6):333

  • Górecki P (2004) Reconciliation problems for duplication, loss and horizontal gene transfer. In: Proceedings of the 8th annual international conference on computational molecular biology, pp. 316–325

  • Green P (1997) Against a whole-genome shotgun. Genome Res 7(5):410–417

  • Greensmith J, Aickelin U, Twycross J (2006) Articulation and clarification of the dendritic cell algorithm. In: International conference on artificial immune systems, pp. 404–417. Springer

  • Gritsenko AA, Nijkamp JF, Reinders MJ, de Ridder D (2012) GRASS: a generic algorithm for scaffolding next-generation sequencing assemblies. Bioinformatics 28(11):1429–1437

  • Guigo R, Muchnik I, Smith T (1996) Reconstruction of ancient molecular phylogeny. Mol Phylogenet Evol 6(2):189–213

  • Hackl T, Hedrich R, Schultz J, Förster F (2014) proovread: large-scale high-accuracy PacBio correction through iterative short read consensus. Bioinformatics 30(21):3004–3011

  • Hajela P, Yoo JS (1999) Immune network modelling in design optimization. New ideas in optimization. McGraw-Hill Ltd., New York, pp 203–216

  • Halanych KM, Goertzen LR (2009) Grand challenges in organismal biology: the need to develop both theory and resources. Integr Comp Biol 49(5):475–479

  • Hamer DH (2000) Beware the chopsticks gene. Mol Psychiatry 5(1):11–13

  • Hannenhalli S, Pevzner PA (1995) Towards a computational theory of genome rearrangements. Springer, Heidelberg, pp 184–202. https://doi.org/10.1007/BFb0015244

  • Hannenhalli S, Pevzner PA (1999) Transforming cabbage into turnip: Polynomial algorithm for sorting signed permutations by reversals. J ACM 46(1):1–27. https://doi.org/10.1145/300515.300516

  • Hartmann T, Middendorf M, Bernt M (2018) Genome rearrangement analysis: cut and join genome rearrangements and gene cluster preserving approaches. Springer, New York, pp 261–289. https://doi.org/10.1007/978-1-4939-7463-4_9

  • Heled J, Drummond AJ (2010) Bayesian inference of species trees from multilocus data. Mol Biol Evol 27(3):570–580

  • Helgason A, Yngvadottir B, Hrafnkelsson B, Gulcher J, Stefánsson K (2005) An Icelandic example of the impact of population structure on association studies. Nat Genet 37(1):90–95

  • Hirschhorn JN, Daly MJ (2005) Genome-wide association studies for common diseases and complex traits. Nat Rev Genet 6(2):95–108

  • Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780

  • Hofmeyr SA, Forrest S (2000) Architecture for an artificial immune system. Evolut Comput 8(4):443–473

  • Holland JH (1992) Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control, and artificial intelligence. MIT press, Cambridge

  • Holland J, Goldberg D (1989) Genetic algorithms in search, optimization and machine learning. Addison-Wesley, Massachusetts

  • Hu F, Zhou J, Zhou L, Tang J (2014) Probabilistic reconstruction of ancestral gene orders with insertions and deletions. IEEE/ACM Trans Comput Biol Bioinform 11(4):667–672

  • Hucka M, Finney A, Sauro HM, Bolouri H, Doyle JC, Kitano H, Arkin AP, Bornstein BJ, Bray D, Cornish-Bowden A et al (2003) The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics 19(4):524–531

  • Hucka M, Bergmann FT, Hoops S, Keating SM, Sahle S, Schaff JC, Smith LP, Wilkinson DJ (2015) The systems biology markup language (SBML): language specification for level 3 version 1 core. J Integr Bioinform 12(2):382–549

    Article  Google Scholar 

  • Hudson RR (1983) Testing the constant-rate neutral allele model with protein sequence data. Evolution 37(1):203–217

    Article  Google Scholar 

  • Huelsenbeck JP, Ronquist F (2001) Mrbayes: Bayesian inference of phylogenetic trees. Bioinformatics 17(8):754–755

    Article  Google Scholar 

  • Hunt M, Newbold C, Berriman M, Otto TD (2014) A comprehensive evaluation of assembly scaffolding tools. Genome Biol 15(3):1–15

    Article  Google Scholar 

  • Idury RM, Waterman MS (1995) A new algorithm for DNA sequence assembly. J Comput Biol 2(2):291–306

    Article  Google Scholar 

  • Islam M, Sarker K, Das T, Reaz R, Bayzid MS (2020) Stelar: a statistically consistent coalescent-based species tree estimation method by maximizing triplet consistency. BMC Genomics 21(1):1–13

    Article  Google Scholar 

  • Jain M, Fiddes IT, Miga KH, Olsen HE, Paten B, Akeson M (2015) Improved data analysis for the minion nanopore sequencer. Nat Methods 12(4):351

    Article  Google Scholar 

  • Jain M, Olsen HE, Paten B, Akeson M (2016) The oxford nanopore minion: delivery of nanopore sequencing to the genomics community. Genome Biol 17(1):239. https://doi.org/10.1186/s13059-016-1103-0

    Article  Google Scholar 

  • Janeway CA Jr (1992) The immune system evolved to discriminate infectious nonself from noninfectious self. Immunol Today 13(1):11–16

    Article  Google Scholar 

  • Ji Z, Dasgupta D (2007) Revisiting negative selection algorithms. Evolut Comput 15(2):223–251

    Article  Google Scholar 

  • Jiménez J, Skalic M, Martinez-Rosell G, De Fabritiis G (2018) K deep: protein-ligand absolute binding affinity prediction via 3d-convolutional neural networks. J Chem Inf Modeling 58(2):287–296

    Article  Google Scholar 

  • Jo T, Hou J, Eickholt J, Cheng J (2015) Improving protein fold recognition by deep learning networks. Sci Rep 5:17573

    Article  Google Scholar 

  • Jones NC, Pevzner PA, Pevzner P (2004) An introduction to bioinformatics algorithms. MIT press, Cambridge

    Google Scholar 

  • Jones BR, Rajaraman A, Tannier E, Chauve C (2012) Anges: reconstructing ancestral genomes maps. Bioinformatics 28(18):2388–2390

    Article  Google Scholar 

  • Kamath GM, Shomorony I, Xia F, Courtade TA, David NT (2017) HINGE: long-read assembly achieves optimal repeat resolution. Genome Res 27(5):747–756

    Article  Google Scholar 

  • Kang HM, Sul JH, Service SK, Zaitlen NA, Kong SY, Freimer NB, Sabatti C, Eskin E, et al. (2010) Variance component model to account for sample structure in genome-wide association studies. Nature genetics, 42(4): 348-354

  • Karaboga D, Basturk B (2008) On the performance of artificial bee colony (abc) algorithm. Appl Soft Comput 8(1):687–697

    Article  Google Scholar 

  • Karas M, Hillenkamp F (1988) Laser desorption ionization of proteins with molecular masses exceeding 10,000 daltons. Anal Chem 60(20):2299–2301

    Article  Google Scholar 

  • Karas M, Bachmann D, Hillenkamp F (1985) Influence of the wavelength in high-irradiance ultraviolet laser desorption mass spectrometry of organic molecules. Anal Chem 57(14):2935–2939

    Article  Google Scholar 

  • Karas M, Bachmann D, Bahr U, Hillenkamp F (1987) Matrix-assisted ultraviolet laser desorption of non-volatile compounds. Int J Mass Spectrom Ion Process 78:53–68

    Article  Google Scholar 

  • Katoh K, Misawa K, Kuma KI, Miyata T (2002) Mafft: a novel method for rapid multiple sequence alignment based on fast fourier transform. Nucl Acids Res 30(14):3059–3066

    Article  Google Scholar 

  • Kececioglu JD, Myers EW (1995) Combinatorial algorithms for dna sequence assembly. Algorithmica 13(1–2):7

    Article  MathSciNet  MATH  Google Scholar 

  • Khan WA, Hamadneh NN, Tilahun SL, Ngnotchouye J (2016) A review and comparative study of firefly algorithm and its modified versions. Optimization Algorithms-Methods and Applications pp. 281–313

  • Kim J, Larkin DM, Cai Q, Zhang Y, Ge RL, Auvil L, Capitanu B, Zhang G, Lewin HA, Ma J et al (2013) Reference-assisted chromosome assembly. Proc Natl Acad Sci 110(5):1785–1790

    Article  MathSciNet  Google Scholar 

  • Kingma DP, Welling M (2013) Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114

  • Kircher M, Kelso J (2010) High-throughput DNA sequencing—concepts and limitations. BioEssays 32(6):524–536. https://onlinelibrary.wiley.com/doi/abs/10.1002/bies.200900181

    Article  Google Scholar 

  • Kohn M, Högel J, Vogel W, Minich P, Kehrer-Sawatzki H, Graves JA, Hameister H (2006) Reconstruction of a 450-my-old ancestral vertebrate protokaryotype. TRENDS Genet 22(4):203–210

    Article  Google Scholar 

  • Kolmogorov M, Raney B, Paten B, Pham S (2014) Ragout-a reference-assisted assembly tool for bacterial genomes. Bioinformatics 30(12):i302–i309. https://doi.org/10.1093/bioinformatics/btu280

    Article  Google Scholar 

  • Koren S, Treangen TJ, Pop M (2011) Bambus 2: scaffolding metagenomes. Bioinformatics 27(21):2964–2971

    Article  Google Scholar 

  • Koren S, Walenz BP, Berlin K, Miller JR, Bergman NH, Phillippy AM (2017) Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res 27(5):722–736

    Article  Google Scholar 

  • Kosakovsky Pond SL, Posada D, Gravenor MB, Woelk CH, Frost SD (2006) Gard: a genetic algorithm for recombination detection. Bioinformatics 22(24):3096–3098

    Article  Google Scholar 

  • Koza JR (1992) Genetic programming II. Automatic discovery of reusable subprograms. MIT Press, Cambridge

    Google Scholar 

  • Krause J, Cordeiro J, Parpinelli RS, Lopes HS (2013) A survey of swarm algorithms applied to discrete optimization problems. In: Swarm intelligence and bio-inspired computation. Elsevier, pp 169–191. https://www.sciencedirect.com/science/article/pii/B9780124051638000077

  • Kubatko LS, Carstens BC, Knowles LL (2009) STEM: species tree estimation using maximum likelihood for gene trees under coalescence. Bioinformatics 25(7):971–973. https://www.ncbi.nlm.nih.gov/pubmed/19211573

    Article  Google Scholar 

  • Kuleshov V, Snyder MP, Batzoglou S (2016) Genome assembly from synthetic long read clouds. Bioinformatics 32(12):i216–i224

    Article  Google Scholar 

  • Kulkarni TD, Whitney WF, Kohli P, Tenenbaum J (2015) Deep convolutional inverse graphics network. In: Advances in neural information processing systems, vol 28. pp. 2539–2547. https://papers.nips.cc/paper/2015/hash/ced556cd9f9c0c8315cfbe0744a3baf0-Abstract.html

  • Kumar S, Tamura K, Nei M (1994) Mega: molecular evolutionary genetics analysis software for microcomputers. Bioinformatics 10(2):189–191

    Article  Google Scholar 

  • Lam KK, LaButti K, Khalak A, Tse D (2015) FinisherSC: a repeat-aware tool for upgrading de novo assembly using long reads. Bioinformatics 31(19):3207–3209

    Article  Google Scholar 

  • Lander ES, Schork NJ (1994) Genetic dissection of complex traits. Science 265(5181):2037–2048

    Article  Google Scholar 

  • Lander ES, Waterman MS (1988) Genomic mapping by fingerprinting random clones: a mathematical analysis. Genomics 2(3):231–239

    Article  Google Scholar 

  • Larget B, Kotha SK, Dewey CN, Ané C (2010) BUCKy: Gene tree/species tree reconciliation with the Bayesian concordance analysis. Bioinformatics 26(22):2910–2911

    Article  Google Scholar 

  • Lassmann T, Frings O, Sonnhammer EL (2008) Kalign2: high-performance multiple alignment of protein and nucleotide sequences allowing external features. Nucl Acids Res 37(3):858–865

    Article  Google Scholar 

  • Laver T, Harrison J, O’neill, P., Moore, K., Farbos, A., Paszkiewicz, K., Studholme, D.J. (2015) Assessing the performance of the oxford nanopore technologies minion. Biomol Detect Quantif 3:1–8

    Article  Google Scholar 

  • Leaché AD, Rannala B (2011) The accuracy of species tree estimation under simulation: a comparison of methods. Systematic Biol 60(2):126–137

    Article  Google Scholar 

  • LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324

    Article  Google Scholar 

  • Lee H, Gurtowski J, Yoo S, Marcus S, McCombie, WR, Schatz M (2014) Error correction and assembly complexity of single molecule sequencing reads. BioRxiv, 006395

  • Lewis PO (1998) A genetic algorithm for maximum-likelihood phylogeny inference using nucleotide sequence data. Mol Biol Evol 15(3):277–283

    Article  Google Scholar 

  • Li H (2012) Exploring single-sample snp and indel calling with whole-genome de novo assembly. Bioinformatics 28(14):1838–1844

    Article  Google Scholar 

  • Li H (2016) Minimap and miniasm: fast mapping and de novo assembly for noisy long sequences. Bioinformatics 32(14):2103–2110

    Article  Google Scholar 

  • Lin Y, Moret BM (2008) Estimating true evolutionary distances under the DCJ model. Bioinformatics 24(13):i114–i122. https://doi.org/10.1093/bioinformatics/btn148

    Article  Google Scholar 

  • Lin DY, Tao R, Kalsbeek WD, Zeng D, Gonzalez F II, Fernández-Rhodes L, Graff M, Koch GG, North KE, Heiss G (2014) Genetic association analysis under complex survey sampling: the hispanic community health study/study of latinos. Am J Human Genet 95(6):675–688

    Article  Google Scholar 

  • Lin Y, Nurk S, Pevzner PA (2014) What is the difference between the breakpoint graph and the de Bruijn graph? BMC Genom 15(6):S6

    Article  Google Scholar 

  • Lin Y, Yuan J, Kolmogorov M, Shen MW, Chaisson M, Pevzner PA (2016) Assembly of long error-prone reads using de Bruijn graphs. Proc Natl Acad Sci 113(52):E8396–E8405

    Article  Google Scholar 

  • Linder CR, Warnow T (2001) An overview of phylogeny reconstruction. Citeseer. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.721.9318

  • Liu L (2008) BEST: Bayesian estimation of species trees under the coalescent model. Bioinformatics 24(21):2542–2543

    Article  Google Scholar 

  • Liu L, Yu L (2011) Estimating species trees from unrooted gene trees. Systematic Biol 60(5):661–667. https://doi.org/10.1093/sysbio/syr027

    Article  Google Scholar 

  • Liu K, Raghavan S, Nelesen S, Linder CR, Warnow T (2009) Rapid and accurate large-scale coestimation of sequence alignments and phylogenetic trees. Science 324(5934):1561–1564

    Article  Google Scholar 

  • Liu L, Yu L, Pearl DK, Edwards SV (2009) Estimating species phylogenies using coalescence times among sequences. Systematic Biol 58(5):468–477

    Article  Google Scholar 

  • Liu L, Yu L, Edwards SV (2010) A maximum pseudo-likelihood approach for estimating species trees under the coalescent model. BMC Evolut Biol 10(1):302

    Article  Google Scholar 

  • Liu Y, Ye Q, Wang L, Peng J (2018) Learning structural motif representations for efficient protein structure search. Bioinformatics 34(17):i773–i780

    Article  Google Scholar 

  • Liu L, Li Y, Li S, Hu N, He Y, Pong R, Lin D, Lu L, Law M (2012) Comparison of next-generation sequencing systems. BioMed Res Int 2012:251364. https://doi.org/10.1155/2012/251364

  • Lohmueller KE, Pearce CL, Pike M, Lander ES, Hirschhorn JN (2003) Meta-analysis of genetic association studies supports a contribution of common variants to susceptibility to common disease. Nature Genet 33(2):177–182

    Article  Google Scholar 

  • Loman NJ, Quick J, Simpson JT (2015) A complete bacterial genome assembled de novo using only nanopore sequencing data. Nature Methods 12(8):733

    Article  Google Scholar 

  • Lones MA (2014) Metaheuristics in nature-inspired algorithms. In: Proceedings of the companion publication of the 2014 annual conference on genetic and evolutionary computation. ACM, pp 1419–1422

  • Löytynoja A, Goldman N (2005) An algorithm for progressive multiple alignment of sequences with insertions. Proc Natl Acad Sci USA 102(30):10557–10562

    Article  Google Scholar 

  • Luo R, Liu B, Xie Y, Li Z, Huang W, Yuan J, He G, Chen Y, Pan Q, Liu Y, Tang J, Wu G, Zhang H, Shi Y, Liu Y, Yu C, Wang B, Lu Y, Han C, Cheung DW, Yiu SM, Peng S, Xiaoqian Z, Liu G, Liao X, Li Y, Yang H, Wang J, Lam TW, Wang J (2012) Soapdenovo2: an empirically improved memory-efficient short-read de novo assembler. GigaScience 1(1):18. https://doi.org/10.1186/2047-217X-1-18

    Article  Google Scholar 

  • Ma J (2010) A probabilistic framework for inferring ancestral genomic orders. In: Bioinformatics and Biomedicine (BIBM). In: 2010 IEEE international conference on, pp 179–184. IEEE

  • Ma J, Zhang L, Suh BB, Raney BJ, Burhans RC, Kent WJ, Blanchette M, Haussler D, Miller W (2006) Reconstructing contiguous regions of an ancestral genome. Genome Res 16(11):1557–1565

    Article  Google Scholar 

  • Maddison WP (1997) Gene trees in species trees. Systematic Biol 46(3):523–536

    Article  Google Scholar 

  • Madoui MA, Engelen S, Cruaud C, Belser C, Bertrand L, Alberti A, Lemainque A, Wincker P, Aury JM (2015) Genome assembly using nanopore-guided long and error-free DNA reads. BMC Genom 16(1):327

    Article  Google Scholar 

  • Madoui MA, Dossat C, d’Agata L, van Oeveren J, van der Vossen E, Aury JM (2016) MaGuS: a tool for quality assessment and scaffolding of genome assemblies with Whole Genome \(\text{ Profiling}^{{\rm TM}}\) Data. BMC Bioinform 17:115

    Article  Google Scholar 

  • Mägi R, Morris AP (2010) Gwama: software for genome-wide association meta-analysis. BMC Bioinform 11(1):288

    Article  Google Scholar 

  • Maier D (1978) The complexity of some problems on subsequences and supersequences. JACM 25(2):322–336

    Article  MathSciNet  MATH  Google Scholar 

  • Makarenkov V, Kevorkov D, Legendre P (2006) Phylogenetic network construction approaches. In: Applied mycology and biotechnology, vol. 6. Elsevier, pp 61–97. https://www.sciencedirect.com/science/article/abs/pii/S1874533406800067

  • Marchini J, Cardon LR, Phillips MS, Donnelly P (2004) The effects of human population structure on large genetic association studies. Nat Genet 36(5):512–517

    Article  Google Scholar 

  • Mardis ER (2013) Next-generation sequencing platforms. Annu Rev Anal Chem 6:287–303

    Article  Google Scholar 

  • Mardis ER (2017) DNA sequencing technologies: 2006–2016. Nat Protoc 12(2):213

    Article  Google Scholar 

  • Marees AT, de Kluiver H, Stringer S, Vorspan F, Curis E, Marie-Claire C, Derks EM (2018) A tutorial on conducting genome-wide association studies: Quality control and statistical analysis. Int J Methods Psychiatr Res 27(2):e1608

    Article  Google Scholar 

  • Matzinger P (2001) Essay 1: the danger model in its historical context. Scand J Immunol 54(1–2):4–9

    Article  Google Scholar 

  • McCarthy MI, Abecasis GR, Cardon LR, Goldstein DB, Little J, Ioannidis JP, Hirschhorn JN (2008) Genome-wide association studies for complex traits: consensus, uncertainty and challenges. Nat Rev Genet 9(5):356–369

    Article  Google Scholar 

  • Medvedev P (2019) Modeling biological problems in computer science: a case study in genome assembly. Brief Bioinform 20(4):1376–1383

    Article  MathSciNet  Google Scholar 

  • Medvedev P, Georgiou K, Myers G, Brudno M (2007) Computability of models for sequence assembly. In: International workshop on algorithms in bioinformatics, pp. 289–301. Springer

  • Melsted P, Pritchard JK (2011) Efficient counting of k-mers in DNA sequences using a bloom filter. BMC Bioinform 12(1):333. https://doi.org/10.1186/1471-2105-12-333

    Article  Google Scholar 

  • Mendelowitz L, Pop M (2014) Computational methods for optical mapping. GigaScience 3(1):33

    Article  Google Scholar 

  • Metzker ML (2010) Sequencing technologies-the next generation. Nat Rev Genet 11(1):31

    Article  Google Scholar 

  • Metzker ML, Mindell DP, Liu XM, Ptak RG, Gibbs RA, Hillis DM (2002) Molecular evidence of HIV-1 transmission in a criminal case. Proc Natl Acad Sci 99(22):14292–14297

    Article  Google Scholar 

  • Meyer-Nieberg S, Beyer HG (2007) Self-adaptation in evolutionary algorithms. In: Parameter setting in evolutionary algorithms. Springer, pp. 47–75. https://homepages.fhv.at/hgb/New-Papers/self-adaptation.pdf

  • Miclotte G, Heydari M, Demeester P, Rombauts S, Van de Peer Y, Audenaert P, Fostier J (2016) Jabba: hybrid error correction for long sequencing reads. Algorithms Mol Biol 11(1):10

    Article  Google Scholar 

  • Miller JR, Koren S, Sutton G (2010) Assembly algorithms for next-generation sequencing data. Genomics 95(6):315–327

    Article  Google Scholar 

  • Minkin I, Pham S, Medvedev P (2016) Twopaco: an efficient algorithm to build the compacted de bruijn graph from many complete genomes. Bioinformatics 33(24):4024–4032

    Google Scholar 

  • Minkin I, Patel A, Kolmogorov M, Vyahhi N, Pham S (2013) Sibelia: a scalable and comprehensive synteny block generation tool for closely related microbial genomes. In: International workshop on algorithms in bioinformatics. Springer, pp. 215–229

  • Mirarab S, Reaz R, Bayzid MS, Zimmermann T, Swenson MS, Warnow T (2014) ASTRAL: genome-scale coalescent-based species tree estimation. Bioinformatics 30(17):i541–i548

    Article  Google Scholar 

  • Mirarab S, Nguyen N, Guo S, Wang LS, Kim J, Warnow T (2015) Pasta: ultra-large multiple sequence alignment for nucleotide and amino-acid sequences. J Comput Biol 22(5):377–386

    Article  Google Scholar 

  • Mirkin B, Muchnik I, Smith T (1995) A biologically consistent model for comparing molecular phylogenies. J Comput Biol 2(4):493–507

    Article  Google Scholar 

  • Mittal S, Nirwal N, Sardana H (2014) Enhanced artificial bees colony algorithm for traveling salesman problem. J Adv Comput Commun Technol 2(2):2347–2804

    Google Scholar 

  • Mossel E, Roch S (2011) Incomplete lineage sorting: consistent phylogeny estimation from multiple loci. IEEE/ACM Trans Comput Biol Bioinform 7(1):166–171

    Article  Google Scholar 

  • Muñoz A, Zheng C, Zhu Q, Albert VA, Rounsley S, Sankoff D (2010) Scaffold filling, contig fusion and comparative gene order inference. BMC Bioinform 11(1):304

    Article  Google Scholar 

  • Myers EW (1995) Toward simplifying and accurately formulating fragment assembly. J Comput Biol 2(2):275–290

    Article  Google Scholar 

  • Myers EW (2005) The fragment assembly string graph. Bioinformatics 21(suppl-2):ii79–ii85

    Google Scholar 

  • Myers EW, Sutton GG, Delcher AL, Dew IM, Fasulo DP, Flanigan MJ, Kravitz SA, Mobarry CM, Reinert KH, Remington KA et al (2000) A whole-genome assembly of drosophila. Science 287(5461):2196–2204

    Article  Google Scholar 

  • Nagarajan N, Pop M (2009) Parametric complexity of sequence assembly: theory and applications to next generation sequencing. J Comput Biol 16(7):897–908

    Article  MathSciNet  Google Scholar 

  • Nagarajan N, Pop M (2013) Sequence assembly demystified. Nat Rev Genet 14(3):157

    Article  Google Scholar 

  • Nagarajan N, Read TD, Pop M (2008) Scaffolding and validation of bacterial genome assemblies using optical restriction maps. Bioinformatics 24(10):1229–1235

    Article  Google Scholar 

  • Nakatani Y, Takeda H, Kohara Y, Morishita S (2007) Reconstruction of the vertebrate ancestral genome reveals dynamic genome reorganization in early vertebrates. Genome Res 17(9):1254–1265

    Article  Google Scholar 

  • Nakhleh L, Sun J, Warnow T, Linder CR, Moret BM, Tholse A (2002) Towards the development of computational tools for evaluating phylogenetic network reconstruction methods. In: Biocomputing 2003. World Scientific, pp. 315–326

  • Navlakha S, Bar-Joseph Z (2011) Algorithms in nature: the convergence of systems biology and computational thinking. Mol Syst Biol 7(1):546

    Article  Google Scholar 

  • Nayeem MA, Bayzid MS, Rahman AH, Shahriyar R, Rahman MS (2019) A’phylogeny-aware’multi-objective optimization approach for computing msa. In: Proceedings of the genetic and evolutionary computation conference. ACM, pp 577–585

  • Neafsey DE, Waterhouse RM, Abai MR, Aganezov SS, Alekseyev MA, Allen JE, Amon J, Arcà B, Arensburger P, Artemov G et al (2015) Highly evolvable malaria vectors: the genomes of 16 anopheles mosquitoes. Science 347(6217):1258522

    Article  Google Scholar 

  • Nei M (1986) Stochastic errors in DNA evolution and molecular phylogeny. Prog Clin Biol Res 218:133–147

    Google Scholar 

  • Nei M (1987) Molecular evolutionary genetics. Columbia University Press, New York

    Book  Google Scholar 

  • Nguyen N, Mirarab S, Warnow T (2012) MRL and SuperFine+MRL: new supertree methods. Algorithms Mol Biol 7:3

    Article  Google Scholar 

  • Ning Z, Cox AJ, Mullikin JC (2001) SSAHA: a fast search method for large DNA databases. Genome Res 11(10):1725–1729

    Article  Google Scholar 

  • Notredame C, Higgins DG (1996) SAGA: sequence alignment by genetic algorithm. Nucl Acids Res 24(8):1515–1524

    Article  Google Scholar 

  • Notredame C, O’Brien EA, Higgins DG (1997) RAGA: RNA sequence alignment by genetic algorithm. Nucl Acids Res 25(22):4570–4580

    Article  Google Scholar 

  • Notredame C, Higgins DG, Heringa J (2000) T-coffee: a novel method for fast and accurate multiple sequence alignment1. J Mol Biol 302(1):205–217

    Article  Google Scholar 

  • Nurse P (2008) Life, logic and information. Nature 454(7203):424

    Article  Google Scholar 

  • O’Connor RE, Romanov MN, Kiazim LG, Barrett PM, Farré M, Damas J, Ferguson-Smith M, Valenzuela N, Larkin DM, Griffin DK (2018) Reconstruction of the diapsid ancestral genome permits chromosome evolution tracing in avian and non-avian dinosaurs. Nat Commun 9(1):1883

    Article  Google Scholar 

  • Page RD (1993) Genes, organisms, and areas: the problem of multiple lineages. Systematic Biol 42(1):77–84

    Article  Google Scholar 

  • Page R (1998) Genetree: comparing gene and species phylogenies using reconciled trees. Bioinformatics 14(9):819–820

    Article  Google Scholar 

  • Page R, Charleston M (1997) From gene to organismal phylogeny: reconciled trees and the gene tree/species tree problem. Mol Phylogentics Evol 7(2):231–240

    Article  Google Scholar 

  • Page R, Charleston M (1997) Reconciled trees and incongruent gene and species trees. Math Hierarchies Biol 37:57–70

    Article  MathSciNet  MATH  Google Scholar 

  • Palmer JD, Herbon LA (1988) Plant mitochondrial DNA evolved rapidly in structure, but slowly in sequence. J Mol Evol 28(1):87–97. https://doi.org/10.1007/BF02143500

    Article  Google Scholar 

  • Park Y, Kellis M (2015) Deep learning for regulatory genomics. Nat Biotechnol 33(8):825

    Article  Google Scholar 

  • Patané JSL, Martins J, Setubal JC (2018) Phylogenomics. Springer, New York, pp 103–187. https://doi.org/10.1007/978-1-4939-7463-4_5

    Book  Google Scholar 

  • Patterson N, Price AL, Reich D (2006) Population structure and eigenanalysis. PLoS Genet 2(12):e190

    Article  Google Scholar 

  • Patterson M, Szöllősi G, Daubin V, Tannier E (2013) Lateral gene transfer, rearrangement, reconciliation. In: BMC bioinformatics, vol. 14. BioMed Central, p S4. https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-14-S15-S4

  • Pavlidis P, Alachiotis N (2017) A survey of methods and tools to detect recent and strong positive selection. J Biol Res Thessalon 24(1):1–17

    Article  Google Scholar 

  • Pe’er I, Yelensky R, Altshuler D, Daly MJ (2008) Estimation of the multiple testing burden for genomewide association studies of nearly all common variants. Genet Epidemiol Off Publ Int Genet Epidemiol Soc 32(4):381–385

    Google Scholar 

  • Perrin A, Varré JS, Blanquart S, Ouangraoua A (2015) Procars: progressive reconstruction of ancestral gene orders. BMC Genomics 16(5):S6

    Article  Google Scholar 

  • Pevzner PA (1989) 1-tuple DNA sequencing: computer analysis. J Biomol Struct Dyn 7(1):63–73

    Article  Google Scholar 

  • Pevzner PA, Tang H (2001) Fragment assembly with double-barreled data. Bioinformatics 17(suppl-1):S225–S233

    Article  Google Scholar 

  • Pevzner PA, Tang H, Waterman MS (2001) An Eulerian path approach to DNA fragment assembly. Proc Natl Acad Sci 98(17):9748–9753

    Article  MathSciNet  MATH  Google Scholar 

  • Philippe N, Legendre M, Doutre G, Couté Y, Poirot O, Lescot M, Arslan D, Seltzer V, Bertaux L, Bruley C et al (2013) Pandoraviruses: amoeba viruses with genomes up to 2.5 mb reaching that of parasitic eukaryotes. Science 341(6143):281–286

    Article  Google Scholar 

  • Pop M, Kosack DS, Salzberg SL (2004) Hierarchical scaffolding with Bambus. Genome Res 14(1):149–159

    Article  Google Scholar 

  • Popescu P, Hayes H (2000) Techniques in animal cytogenetics. Springer Science & Business Media, Berlin

    Book  Google Scholar 

  • Poplin R, Chang PC, Alexander D, Schwartz S, Colthurst T, Ku A, Newburger D, Dijamco J, Nguyen N, Afshar PT et al (2018) A universal SNP and small-indel variant caller using deep neural networks. Nat Biotechnol 36(10):983–987

    Article  Google Scholar 

  • Poultney C, Chopra S, Cun YL et al (2007) Efficient learning of sparse representations with an energy-based model. In: Advances in neural information processing systems. pp 1137–1144. https://papers.nips.cc/paper/2006/file/87f4d79e36d68c3031ccf6c55e9bbd39-Paper.pdf

  • Priami C (2009) Algorithmic systems biology. Commun ACM 52(5):80–88

    Article  Google Scholar 

  • Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D (2006) Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet 38(8):904–909

    Article  Google Scholar 

  • Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, Maller J, Sklar P, De Bakker PI, Daly MJ et al (2007) Plink: a tool set for whole-genome association and population-based linkage analyses. Am J Human Genet 81(3):559–575

    Article  Google Scholar 

  • Putnam NH, Butts T, Ferrier DE, Furlong RF, Hellsten U, Kawashima T, Robinson-Rechavi M, Shoguchi E, Terry A, Yu JK et al (2008) The amphioxus genome and the evolution of the chordate karyotype. Nature 453(7198):1064

    Article  Google Scholar 

  • Putnam NH, O’Connell BL, Stites JC, Rice BJ, Blanchette M, Calef R, Troll CJ, Fields A, Hartley PD, Sugnet CW et al (2016) Chromosome-scale shotgun assembly using an in vitro method for long-range linkage. Genome Res 26(3):342–350

    Article  Google Scholar 

  • Quijano N, Passino KM (2007) Honey bee social foraging algorithms for resource allocation, part I: Algorithm and theory. In: American control conference, 2007. ACC’07, IEEE, pp. 3383–3388

  • Räihä KJ, Ukkonen E (1981) The shortest common supersequence problem over binary alphabet is np-complete. Theor Comput Sci 16(2):187–198

    Article  MathSciNet  MATH  Google Scholar 

  • Rechenberg I (1981) Evolutionsstrategie-optimierung technischer systems nach prinzipien der biologischen evolution, stuttgart: frommannholzboog, 1973. Wiley, New York

    Google Scholar 

  • Richter DC, Schuster SC, Huson DH (2007) Oslay: optimal syntenic layout of unfinished assemblies. Bioinformatics 23(13):1573–1579

    Article  Google Scholar 

  • Rissman AI, Mau B, Biehl BS, Darling AE, Glasner JD, Perna NT (2009) Reordering contigs of draft genomes using the mauve aligner. Bioinformatics 25(16):2071–2073

    Article  Google Scholar 

  • Roberts RJ, Carneiro MO, Schatz MC (2013) The advantages of SMRT sequencing. Genome Biol 14(6):405. https://doi.org/10.1186/gb-2013-14-6-405

    Article  Google Scholar 

  • Rosen CB, Rodriguez-Larrea D, Bayley H (2014) Single-molecule site-specific detection of protein phosphorylation with a nanopore. Nat Biotechnol 32(2):179

    Article  Google Scholar 

  • Rosenberg N (2002) The probability of topological concordance of gene trees and species trees. Theor Popul Biol 61(2):225–247. https://doi.org/10.1006/tpbi.2001.1568

    Article  MATH  Google Scholar 

  • Rosenblatt F (1958) The perceptron: a probabilistic model for information storage and organization in the brain. Psychol Rev 65(6):386

    Article  Google Scholar 

  • Rubinstein A, Chor B (2014) Computational thinking in life science education. PLoS Comput Biol 10(11):e1003897

    Article  Google Scholar 

  • Salmela L, Rivals E (2014) Lordec: accurate and efficient long read error correction. Bioinformatics 30(24):3506–3514

    Article  Google Scholar 

  • Salse J (2016) Ancestors of modern plant crops. Current Opinion in Plant Biology 30:134 – 142. https://doi.org/10.1016/j.pbi.2016.02.005. https://www.sciencedirect.com/science/article/pii/S136952661630022X. SI: 30: Genome studies and molecular genetics

  • Sanger F, Nicklen S, Coulson AR (1977) DNA sequencing with chain-terminating inhibitors. Proc Natl Acad Sci 74(12):5463–5467

    Article  Google Scholar 

  • Sankoff D, Nadeau JH (2000) Comparative genomics. Springer, Dordrecht, pp 3–7. https://doi.org/10.1007/978-94-011-4309-7_1

    Book  Google Scholar 

  • Schalkoff RJ (1997) Artificial neural networks. McGraw-Hill, New York

    MATH  Google Scholar 

  • Schatz MC, Delcher AL, Salzberg SL (2010) Assembly of large genomes using second-generation sequencing. Genome Res 20(9):1165–1173

    Article  Google Scholar 

  • Schirmer M, Ijaz UZ, D’Amore R, Hall N, Sloan WT, Quince C (2015) Insight into biases and sequencing errors for amplicon sequencing with the Illumina MiSeq platform. Nucl Acids Res 43(6):e37–e37

    Article  Google Scholar 

  • Secker A, Freitas AA, Timmis J (2003) A danger theory inspired approach to web mining. In: International conference on artificial immune systems. Springer, pp. 156–167

  • Sedlazeck FJ, Lee H, Darby CA, Schatz MC (2018) Piercing the dark matter: bioinformatics of long-range sequencing and mapping. Nat Rev Genet 19(6):329–346

    Article  Google Scholar 

  • Seeley TD, Visscher PK, Passino KM (2006) Group decision making in honey bee swarms: when 10,000 bees go house hunting, how do they cooperatively choose their new nesting site? Am Sci 94(3):220–229

    Article  Google Scholar 

  • Sievers F, Wilm A, Dineen D, Gibson TJ, Karplus K, Li W, Lopez R, McWilliam H, Remmert M, Söding J et al (2011) Fast, scalable generation of high-quality protein multiple sequence alignments using clustal omega. Mol Syst Biol 7(1):539

    Article  Google Scholar 

  • Simpson PK (1997) Neural networks applications. IEEE Press, New Jersey

    Google Scholar 

  • Simpson JT (2014) Exploring genome characteristics and sequence quality without a reference. Bioinformatics 30(9):1228–1235

    Article  Google Scholar 

  • Simpson JT, Durbin R (2010) Efficient construction of an assembly string graph using the FM-index. Bioinformatics 26(12):i367–i373

    Article  Google Scholar 

  • Simpson JT, Durbin R (2012) Efficient de novo assembly of large genomes using compressed data structures. Genome Res 22(3):549–556

    Article  Google Scholar 

  • Simpson JT, Pop M (2015) The theory and practice of genome sequence assembly. Annu Rev Genomics Human Genet 16:153–172

    Article  Google Scholar 

  • Simpson JT, Wong K, Jackman SD, Schein JE, Jones SJ, Birol I (2009) ABySS: a parallel assembler for short read sequence data. Genome Res 19(6):1117–1123

    Article  Google Scholar 

  • Simpson JT, Workman RE, Zuzarte P, David M, Dursi L, Timp W (2017) Detecting DNA cytosine methylation using nanopore sequencing. Nat Methods 14(4):407–410. https://doi.org/10.1038/nmeth.4184

    Article  Google Scholar 

  • Simpson JT, Workman RE, Zuzarte P, David M, Dursi L, Timp W (2017) Detecting DNA cytosine methylation using nanopore sequencing. Nat Methods 14(4):407

    Article  Google Scholar 

  • Slatkin M (2008) Linkage disequilibrium-understanding the evolutionary past and mapping the medical future. Nat Rev Genet 9(6):477–485

    Article  Google Scholar 

  • Soderlund C, Bomhoff M, Nelson WM (2011) Symap v3. 4: a turnkey synteny system with application to plant genomes. Nucl Acids Res 39(10):e68–e68

    Article  Google Scholar 

  • Sohn JI, Nam JW (2016) The present and future of de novo whole-genome assembly. Brief Bioinform 19(1):23–40

    Google Scholar 

  • Spencer M, Eickholt J, Cheng J (2015) A deep learning network approach to ab initio protein secondary structure prediction. IEEE/ACM Trans Comput Biol Bioinform 12(1):103–112

    Article  Google Scholar 

  • Stamatakis A (2005) An efficient program for phylogenetic inference using simulated annealing. In: 19th IEEE international parallel and distributed processing symposium. IEEE, pp. 8–pp

  • Stamatakis A (2006) RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22(21):2688–2690

    Article  Google Scholar 

  • Stamatakis A, Ludwig T, Meier H (2005) RAxML-III: a fast program for maximum likelihood-based inference of large phylogenetic trees. Bioinformatics 21(4):456–463

    Article  Google Scholar 

  • Stephens PJ, Greenman CD, Fu B, Yang F, Bignell GR, Mudie LJ, Pleasance ED, Lau KW, Beare D, Stebbings LA, McLaren S, Lin ML, McBride DJ, Varela I, Nik-Zainal S, Leroy C, Jia M, Menzies A, Butler AP, Teague JW, Quail MA, Burton J, Swerdlow H, Carter NP, Morsberger LA, Iacobuzio-Donahue C, Follows GA, Green AR, Flanagan AM, Stratton MR, Futreal PA, Campbell PJ (2011) Massive genomic rearrangement acquired in a single catastrophic event during cancer development. Cell 144(1):27–40. https://www.sciencedirect.com/science/article/pii/S0092867410013772

    Article  Google Scholar 

  • Stoye J, Wittler R (2009) A unified approach for reconstructing ancient gene clusters. IEEE/ACM Trans Comput Biol Bioinform 6(3):387–400

    Article  Google Scholar 

  • Sturtevant AH, Dobzhansky T (1936) Inversions in the third chromosome of wild races of drosophila pseudoobscura, and their use in the study of the history of the species. Proc Natl Acad Sci 22(7):448–450. https://www.pnas.org/content/22/7/448

    Article  Google Scholar 

  • Sturtevant AH, Novitski E (1941) The homologies of the chromosome elements in the genus drosophila. Genetics 26(5):517

    Article  Google Scholar 

  • Swenson KM, Blanchette M (2015) Models and algorithms for genome rearrangement with positional constraints. In: Pop M, Touzet H (eds) Algorithms Bioinform. Springer, Heidelberg, pp 243–256

    Chapter  MATH  Google Scholar 

  • Szabó A, Novák Á, Miklós I, Hein J (2010) Reticular alignment: a progressive corner-cutting method for multiple sequence alignment. BMC Bioinform 11(1):570

    Article  Google Scholar 

  • Tajima F (1983) Evolutionary relationship of DNA sequences in finite populations. Genetics 105(2):437–460. https://www.genetics.org/cgi/content/abstract/105/2/437

  • Takahata N (1989) Gene geneaology in three related populations: consistency probability between gene and population trees. Genetics 122(4):957–966

    Article  Google Scholar 

  • Talbi EG (2009) Metaheuristics: from design to implementation. Wiley, New York

    Book  MATH  Google Scholar 

  • Tamazian G, Dobrynin P, Krasheninnikova K, Komissarov A, Koepfli KP, O’Brien SJ (2016) Chromosomer: a reference-based genome arrangement tool for producing draft chromosome sequences. GigaScience 5(1):38. https://doi.org/10.1186/s13742-016-0141-6

    Article  Google Scholar 

  • Tamura K, Dudley J, Nei M, Kumar S (2007) Mega4: molecular evolutionary genetics analysis (mega) software version 4.0. Mol Biol Evolution 24(8):1596–1599

    Article  Google Scholar 

  • Tang H, Zhang X, Miao C, Zhang J, Ming R, Schnable JC, Schnable PS, Lyons E, Lu J (2015) ALLMAPS: robust scaffold ordering based on multiple maps. Genome Biol 16:3

    Article  Google Scholar 

  • Tannier E, Zheng C, Sankoff D (2009) Multichromosomal median and halving problems under different genomic distances. BMC Bioinform 10:120

    Article  Google Scholar 

  • Tarhio J, Ukkonen E (1988) A greedy approximation algorithm for constructing shortest common superstrings. Theor Comput Sci 57(1):131–145

    Article  MathSciNet  MATH  Google Scholar 

  • Than CV, Nakhleh L (2009) Species tree inference by minimizing deep coalescences. PLoS Comput Biol 5(9):e1000501

    Article  MathSciNet  Google Scholar 

  • The cancer genome atlas program: https://www.cancer.gov/about-nci/organization/ccg/research/structural-genomics/tcga. Accessed: September 10, 2020

  • the human genome project: https://www.genome.gov/human-genome-project. Accessed: September 10, 2020

  • Thompson JD, Higgins DG, Gibson TJ (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucl Acids Res 22(22):4673–4680

    Article  Google Scholar 

  • Tilahun SL, Ngnotchouye JMT, Hamadneh NN (2019) Continuous versions of firefly algorithm: a review. Artif Intell Rev 51(3):445–492

    Article  Google Scholar 

  • Timp W, Nice AM, Nelson EM, Kurz V, McKelvey K, Timp G (2014) Think small: nanopores for sensing and synthesis. IEEE Access 2:1396–1408

    Article  Google Scholar 

  • Torrisi M, Pollastri G, Le Q (2020) Deep learning methods in protein structure prediction. Comput Struct Biotech J 18:1301–1310

  • Uddin MR, Mahbub S, Rahman MS, Bayzid MS (2020) SAINT: self-attention augmented inception-inside-inception network improves protein secondary structure prediction. Bioinformatics p. btaa531. https://doi.org/10.1093/bioinformatics/btaa531

  • Ulutas BH, Kulturel-Konak S (2011) A review of clonal selection algorithm and its applications. Artif Intell Rev 36(2):117–138

    Article  Google Scholar 

  • van Hijum SA, Zomer AL, Kuipers OP, Kok J (2005) Projector 2: contig mapping for efficient gap-closure of prokaryotic genome sequence assemblies. Nucl Acids Res 33(suppl–2):W560–W566

    Article  Google Scholar 

  • Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG, Smith HO, Yandell M, Evans CA, Holt RA et al (2001) The sequence of the human genome. science 291(5507):1304–1351

    Article  Google Scholar 

  • Vincent P, Larochelle H, Bengio Y, Manzagol PA (2008) Extracting and composing robust features with denoising autoencoders. In: Proceedings of the 25th international conference on Machine learning. ACM, pp 1096–1103

  • Vogel G (1998) HIV strain analysis debuts in murder trial. Science. https://www.sciencemag.org/news/1998/10/dna-strain-analysis-debuts-murder-trial

  • Voigt HM, Anheyer T (1994) Modal mutations in evolutionary algorithms. In: Evolutionary Computation, 1994. IEEE world congress on computational intelligence., proceedings of the first IEEE conference on, pp. 88–92. IEEE

  • Wajid B, Serpedin E (2012) Review of general algorithmic features for genome assemblers for next generation sequencers. Genomics Proteomics Bioinform 10(2):58–73

    Article  Google Scholar 

  • Wang WY, Barratt BJ, Clayton DG, Todd JA (2005) Genome-wide association studies: theoretical and practical concerns. Nat Rev Genet 6(2):109–118

    Article  Google Scholar 

  • Wang Y, Li W, Zhang T, Ding C, Lu Z, Long N, Rose JP, Wang BC, Lin D (2006) Reconstruction of ancient genome and gene order from complete microbial genome sequences. J Theor Biol 239(4):494–498

    Article  MathSciNet  Google Scholar 

  • Wang S, Peng J, Ma J, Xu J (2016) Protein secondary structure prediction using deep convolutional neural fields. Sci Rep 6(1):1–11

    Google Scholar 

  • Wang S, Jiang X, Tang H, Wang X, Bu D, Carey K, Dyke SO, Fox D, Jiang C, Lauter K et al (2017) A community effort to protect genomic data sharing, collaboration and outsourcing. NPJ Genomic Med 2(1):33

    Article  Google Scholar 

  • Wang GG, Gandomi AH, Alavi AH, Gong D (2019) A comprehensive review of krill herd algorithm: variants, hybrids and applications. Artif Intell Rev 51(1):119–148

    Article  Google Scholar 

  • Warren RL, Yang C, Vandervalk BP, Behsaz B, Lagman A, Jones SJ, Birol I (2015) LINKS: scalable, alignment-free scaffolding of draft genomes with long reads. GigaScience 4:35

    Article  Google Scholar 

  • Watanabe K, Taskesen E, Van Bochoven A, Posthuma D (2017) Functional mapping and annotation of genetic associations with FUMA. Nat Commun 8(1):1–11

    Article  Google Scholar 

  • Watterson G, Ewens W, Hall T, Morgan A (1982) The chromosome inversion problem. J Theor Biol 99(1):1–7. https://www.sciencedirect.com/science/article/pii/0022519382903848

    Article  Google Scholar 

  • Webb S (2018) Deep learning for biology. Nature 554(7693). https://go.gale.com/ps/anonymous?id=GALE%7CA528459891&sid=googleScholar&v=2.1&it=r&linkaccess=abs&issn=00280836&p=HRCA&sw=w

  • Weber JL, Myers EW (1997) Human whole-genome shotgun sequencing. Genome Res 7(5):401–409

    Article  Google Scholar 

  • Weinreb C, Oesper L, Raphael BJ (2014) Open adjacencies and k-breaks: detecting simultaneous rearrangements in cancer genomes. BMC Genomics 15(6):S4. https://doi.org/10.1186/1471-2164-15-S6-S4

    Article  Google Scholar 

  • Weisenfeld NI, Kumar V, Shah P, Church DM, Jaffe DB (2017) Direct determination of diploid genome sequences. Genome Res 27(5):757–767

    Article  Google Scholar 

  • Welter D, MacArthur J, Morales J, Burdett T, Hall P, Junkins H, Klemm A, Flicek P, Manolio T, Hindorff L et al (2014) The NHGRI GWAS catalog, a curated resource of SNP-trait associations. Nucl Acids Res 42(D1):D1001–D1006

    Article  Google Scholar 

  • Willer CJ, Li Y, Abecasis GR (2010) Metal: fast and efficient meta-analysis of genomewide association scans. Bioinformatics 26(17):2190–2191

    Article  Google Scholar 

  • Wu D, Bi S, Zhang L, Yang J (2014) Single-molecule study of proteins by biological nanopore sensors. Sensors 14(10):18211–18222

    Article  Google Scholar 

  • Xu AW, Moret BM (2011) Gasts: Parsimony scoring under rearrangements. In: international workshop on algorithms in bioinformatics. Springer pp 351–363

  • Yancopoulos S, Attie O, Friedberg R (2005) Efficient sorting of genomic permutations by translocation, inversion and block interchange. Bioinformatics 21(16):3340–3346

    Article  Google Scholar 

  • Yang XS (2010) A new metaheuristic bat-inspired algorithm. In: Nature inspired cooperative strategies for optimization (NICSO 2010). Springer, pp 65–74. https://link.springer.com/chapter/10.1007/978-3-642-12538-6_6

  • Yeo S, Coombe L, Warren RL, Chu J, Birol I (2017) Arcs: scaffolding genome drafts with linked reads. Bioinformatics 34(5):725–731

    Article  Google Scholar 

  • Yu Y, Warnow T, Nakhleh L (2011) Algorithms for mdc-based multi-locus phylogeny inference: beyond rooted binary gene trees on single alleles. J Comput Biol 18(11):1543–1559

    Article  MathSciNet  Google Scholar 

  • Zeiler MD, Krishnan D, Taylor GW, Fergus R (2010) Deconvolutional networks. In: Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on, pp. 2528–2535. IEEE

  • Zeira R, Shamir R (2018) Sorting cancer karyotypes using double-cut-and-joins, duplications and deletions. Bioinformatics p. bty381. https://doi.org/10.1093/bioinformatics/bty381

  • Zeira R, Shamir R (2019) Genome rearrangement problems with single and multiple gene copies: A review. In: Bioinformatics and Phylogenetics. Springer, pp 205–241. https://link.springer.com/chapter/10.1007/978-3-030-10837-3_10

  • Zerbino DR, Birney E (2008) Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res 18(5):821–829

    Article  Google Scholar 

  • Zhang L (2011) From gene trees to species trees II: species tree inference by minimizing deep coalescence events. IEEE/ACM Trans Comput Biol Bioinform 8(9):1685–1691

    Article  Google Scholar 

  • Zhang S, Zhou J, Hu H, Gong H, Chen L, Cheng C, Zeng J (2015) A deep learning framework for modeling structural features of RNA-binding protein targets. Nucl Acids Res 44(4):e32–e32

    Article  Google Scholar 

  • Zhao H, Bourque G (2007) Recovering true rearrangement events on phylogenetic trees. In: RECOMB international workshop on comparative genomics. Springer, pp. 149–161

  • Zheng C, Sankoff D (2011) On the pathgroups approach to rapid small phylogeny. BMC Bioinform 12(1):S4

    Article  Google Scholar 

  • Zheng GX, Lau BT, Schnall-Levin M, Jarosz M, Bell JM, Hindson CM, Kyriazopoulou-Panagiotopoulou S, Masquelier DA, Merrill L, Terry JM et al (2016) Haplotyping germline and cancer genomes with high-throughput linked-read sequencing. Nat Biotechnol 34(3):303–311

    Article  Google Scholar 

  • Zhou X, Stephens M (2012) Genome-wide efficient mixed-model analysis for association studies. Nat Genet 44(7):821–824

    Article  Google Scholar 

  • Zhu Y, Tan Y (2011) A danger theory inspired learning model and its application to spam detection. In: International conference in swarm intelligence. Springer, pp 382–389

  • Zwickl DJ (2006) Genetic algorithm approaches for the phylogenetic analysis of large biological sequence datasets under the maximum likelihood criterion. Ph.D. thesis

Download references

Acknowledgements

This work is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 702527. The authors would also like to thank Professor Stephen Smale, Fields Medal awardee, who mentored the workshop organized by the first author, Dr. Zaineb Chelly Dagdia, at the 5th Heidelberg Laureate Forum. This manuscript presents the outcome of an analysis carried out during that workshop. Additional thanks go to all organizers of the 5th Heidelberg Laureate Forum, to the Heidelberg Institute for Theoretical Studies, Mathematisches Forschungszentrum Oberwolfach and Schloss Dagstuhl - Leibniz Center for Informatics, and to the Heidelberg Laureate Forum Foundation’s Scientific Committee.

Author information

Corresponding author

Correspondence to Zaineb Chelly Dagdia.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The discussion in this paper of the challenges and importance of bridging the gaps between the biological computation and computational biology communities is the outcome of an analysis carried out at the 5th Heidelberg Laureate Forum, specifically during a workshop (https://scilogs.spektrum.de/hlf/experience-learn-share-heidelberg-laureate-forum/) organized by Dr. Zaineb Chelly Dagdia and mentored by Professor Stephen Smale (Fields Medal awardee). After the workshop, a collaboration was formed between Dr. Zaineb Chelly Dagdia, who works on biological computation, and two participants and contributors to the workshop, Pavel Avdeyev and Dr. Md. Shamsuzzoha Bayzid, who work on different areas of computational biology.

About this article

Cite this article

Chelly Dagdia, Z., Avdeyev, P. & Bayzid, M.S. Biological computation and computational biology: survey, challenges, and discussion. Artif Intell Rev 54, 4169–4235 (2021). https://doi.org/10.1007/s10462-020-09951-1

  • DOI: https://doi.org/10.1007/s10462-020-09951-1
