Abstract
Bioinformatics is one of the most important areas of modern biology, and creating high-quality scientific software to support it is a core activity for many researchers. In this context, high-quality sequence datasets are needed to infer the evolution of species, genes, and gene families, or to obtain evidence for adaptive amino acid evolution, among other uses. However, sequence data are very often spread over several databases, many useful genomes and transcriptomes are not annotated, and the available annotation may not cover the desired coding sequence isoform or may be unlikely to be accurate. Moreover, although the text-based FASTA format is simple and supported by most software applications, a number of format issues may become critical depending on the software used to analyse such files. As a result, researchers without training in informatics often use only a fraction of the available data. These issues can be addressed with existing software applications, but no single easy-to-use application allows all these tasks to be performed within the same graphical interface. Here we present such an application, named BDBM (Blast DataBase Manager). BDBM can be used to efficiently retrieve gene sequences from annotated and non-annotated genomes and transcriptomes. It can also be used to look for alternatives to existing annotations and to easily create reliable custom databases, which are essential for preparing high-quality datasets. Using BDBM, we analysed the Coffea canephora genome to identify the S-locus region, which harbours the genes involved in gametophytic self-incompatibility. These analyses point to two likely regions, one on chromosome 2 (around positions 6600000–6650000) and another on chromosome 5 (around positions 15830000–15930000). These findings are discussed in the context of the evolution of gametophytic self-incompatibility in the Rubiaceae.
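To make the remark about FASTA format issues concrete, the following is a minimal Python sketch of the kind of clean-up that is often needed before a FASTA file is passed to downstream tools. It is not BDBM's implementation. The specific issues handled here (whitespace in headers, repeated identifiers, mixed case, and inconsistent line lengths) are assumptions chosen for illustration, and the function names `parse_fasta` and `normalize_fasta` are hypothetical.

```python
from typing import Dict, Iterator, List, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs; multi-line sequences are joined."""
    header = None
    chunks: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def normalize_fasta(text: str, width: int = 60) -> str:
    """Return FASTA text with whitespace-free headers and wrapped sequences.

    Assumed clean-up rules (illustrative only, not taken from BDBM):
    - whitespace runs in headers become single underscores
    - a counter is appended to identifiers that repeat
    - sequences are upper-cased and wrapped at `width` characters
    """
    seen: Dict[str, int] = {}
    out: List[str] = []
    for header, seq in parse_fasta(text):
        name = "_".join(header.split())
        seen[name] = seen.get(name, 0) + 1
        if seen[name] > 1:
            name = f"{name}_{seen[name]}"  # second occurrence gets suffix _2
        out.append(f">{name}")
        seq = seq.upper()
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + "\n"
```

Calling `normalize_fasta(open("input.fasta").read())` on a file (the file name is a placeholder) returns the cleaned text, which can then be written back out. A real pipeline would likely need further rules, for example for characters that a particular tool rejects in identifiers.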
Acknowledgements
This article is a result of the project Norte-01-0145-FEDER-000008 (Porto Neurosciences and Neurologic Disease Research Initiative at I3S), supported by the Norte Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement, through the European Regional Development Fund (FEDER). H. López-Fernández is supported by a post-doctoral fellowship from Xunta de Galicia (ED481B 2016/068-0). This work was partially funded by Consellería de Cultura, Educación e Ordenación Universitaria (Xunta de Galicia) and FEDER (European Union). The SING group thanks CITI (Centro de Investigación, Transferencia e Innovación) of the University of Vigo for hosting its IT infrastructure.
Cite this article
Vázquez, N., López-Fernández, H., Vieira, C.P. et al. BDBM 1.0: A Desktop Application for Efficient Retrieval and Processing of High-Quality Sequence Data and Application to the Identification of the Putative Coffea S-Locus. Interdiscip Sci Comput Life Sci 11, 57–67 (2019). https://doi.org/10.1007/s12539-019-00320-3
DOI: https://doi.org/10.1007/s12539-019-00320-3