Abstract
The analysis of complex biological datasets beyond DNA is attracting growing interest in bioinformatics. Protein sequence data, in particular, add layers of complexity that pose new computational challenges. This work investigates GPU solutions to these challenges in a representative algorithm from phylogenetics: Fitch’s parsimony. The GPU strategies follow the protein-based formulation of the problem, defining an optimized kernel that exploits data parallelism in the calculations associated with different amino acids. To understand the relationship between problem size and GPU capabilities, an extensive evaluation is conducted on a wide range of GPUs, covering all the recent NVIDIA GPU architectures, from Kepler to Turing. Experimental results on five real-world datasets show the benefits of exploiting state-of-the-art GPUs, which represent a fitting approach to the increasing difficulty of protein sequence datasets.
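For readers unfamiliar with the algorithm named above, the following is a minimal CPU sketch of Fitch’s parsimony on protein data. It is not the optimized GPU kernel evaluated in the article. It assumes a rooted binary tree given as nested 2-tuples of leaf names and aligned sequences of equal length. The names `AMINO_ACIDS`, `encode`, and `fitch_score` are hypothetical and introduced only for illustration. Each site's state set is stored as a 20-bit mask, one bit per amino acid, so the intersection and union steps become bitwise operations. The per-site loop has no dependencies between sites, which is the kind of independent work a data-parallel implementation can distribute.

```python
# Illustrative sketch only: a sequential Fitch parsimony scorer for proteins.
# All names below are hypothetical, not taken from the article.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"            # the 20 standard amino acids
BIT = {aa: 1 << i for i, aa in enumerate(AMINO_ACIDS)}
ALL_STATES = (1 << len(AMINO_ACIDS)) - 1        # mask used for gaps/unknowns


def encode(seq):
    """Turn an aligned protein sequence into a list of per-site bitmasks."""
    return [BIT.get(ch, ALL_STATES) for ch in seq.upper()]


def fitch_score(tree, leaves):
    """Return the parsimony score of `tree`.

    tree   -- a leaf name (str) or a 2-tuple (left_subtree, right_subtree)
    leaves -- dict mapping leaf name to its aligned sequence
    """
    def visit(node):
        if isinstance(node, str):               # leaf: observed states, no cost
            return encode(leaves[node]), 0
        left, right = node
        left_sets, left_cost = visit(left)
        right_sets, right_cost = visit(right)
        sets, cost = [], left_cost + right_cost
        # Each site is processed independently of every other site.
        for a, b in zip(left_sets, right_sets):
            common = a & b
            if common:                          # children agree: keep intersection
                sets.append(common)
            else:                               # children disagree: union, one change
                sets.append(a | b)
                cost += 1
        return sets, cost

    return visit(tree)[1]


if __name__ == "__main__":
    example_tree = (("A", "B"), ("C", "D"))
    example_leaves = {"A": "MK", "B": "MK", "C": "ML", "D": "MR"}
    print(fitch_score(example_tree, example_leaves))
```

In this sketch a character outside the 20-letter alphabet (for example a gap) is treated as compatible with every amino acid, which is one common convention and an assumption here rather than something stated in the abstract.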
Acknowledgements
This work was partially funded by the AEI (State Research Agency, Spain) and the ERDF (European Regional Development Fund, EU), under the contract TIN2016-76259-P (PROTEIN Project), as well as Portuguese national funds through FCT (Fundação para a Ciência e a Tecnologia, Portugal) Projects UIDB/50021/2020 and LISBOA-01-0145-FEDER-031901 (PTDC/CCI-COM/31901/2017, HiPErBio). Sergio Santander-Jiménez is supported by the Post-Doctoral Fellowship from FCT under Grant SFRH/BPD/119220/2016.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Santander-Jiménez, S., Vega-Rodríguez, M.A., Zahinos-Márquez, A. et al. GPU acceleration of Fitch’s parsimony on protein data: from Kepler to Turing. J Supercomput 76, 9827–9853 (2020). https://doi.org/10.1007/s11227-020-03225-x
DOI: https://doi.org/10.1007/s11227-020-03225-x