
BDBM 1.0: A Desktop Application for Efficient Retrieval and Processing of High-Quality Sequence Data and Application to the Identification of the Putative Coffea S-Locus

  • Original Research Article
  • Interdisciplinary Sciences: Computational Life Sciences

Abstract

Bioinformatics is now one of the most important areas of modern biology, and creating high-quality scientific software to support it is a core activity for many researchers. In this context, high-quality sequence datasets are needed to make inferences on the evolution of species, genes, and gene families, or to obtain evidence for adaptive amino acid evolution, among other uses. However, sequence data are very often spread over several databases, many useful genomes and transcriptomes are not annotated, and the available annotation may not be for the desired coding sequence isoform or may be unlikely to be accurate. Moreover, although the text-based FASTA format is simple and usable by most software applications, it raises a number of issues that may be critical depending on the software used to analyse such files. As a result, researchers without training in informatics often use only a fraction of the available data. These issues can be addressed with existing software applications, but no single easy-to-use application has allowed all of these tasks to be performed within one graphical interface. BDBM (Blast DataBase Manager), presented here, fills this gap. BDBM can be used to efficiently retrieve gene sequences from annotated and non-annotated genomes and transcriptomes. It can also be used to look for alternatives to existing annotations and to easily create reliable custom databases, which are essential for preparing high-quality datasets. Our analyses of the Coffea canephora genome using BDBM, aimed at identifying the S-locus region (which harbours the genes involved in gametophytic self-incompatibility), led to the conclusion that there are two likely regions: one on chromosome 2 (around 6600000–6650000) and another on chromosome 5 (around 15830000–15930000). These findings are discussed in the context of the evolution of gametophytic self-incompatibility in the Rubiaceae.
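To make the kind of task described above more concrete, the following is a minimal Python sketch of reading FASTA text and pulling out a coordinate range from one sequence. It is not BDBM's implementation and does not use BDBM's interface. The function names `read_fasta` and `extract_region`, the sequence names `chr2` and `chr5`, and the toy sequences are hypothetical, and the 1-based inclusive coordinate convention is an assumption, since the abstract does not state which convention its chromosome positions follow.

```python
from typing import Dict, Iterable, List


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA text into a {identifier: sequence} dictionary.

    The identifier is taken as the first whitespace-delimited token of the
    header line, and wrapped sequence lines are joined into one string.
    """
    records: Dict[str, List[str]] = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            tokens = line[1:].split()
            current = tokens[0] if tokens else ""
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


def extract_region(records: Dict[str, str], name: str, start: int, end: int) -> str:
    """Return positions start..end of a sequence (1-based, inclusive; assumed)."""
    seq = records[name]
    if not (1 <= start <= end <= len(seq)):
        raise ValueError(f"region {start}-{end} is outside {name} (length {len(seq)})")
    return seq[start - 1:end]


# Toy input standing in for a genome file; real identifiers will differ.
toy = """>chr2 toy sequence
ACGTAC
GTTTGA
>chr5
GGGCCC
""".splitlines()

records = read_fasta(toy)
region = extract_region(records, "chr2", 5, 8)
print(region)
```

Keeping only the first token of the header as the identifier and joining wrapped lines addresses two common sources of incompatibility between FASTA files and downstream tools. Whether these correspond to the specific issues the authors have in mind is not stated in the abstract.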


Notes

  1. http://flybase.org/blast/.

  2. https://docker.com/.

  3. http://coffee-genome.org/.

  4. http://emboss.open-bio.org/.

  5. http://bedtools.readthedocs.io.

  6. https://blast.ncbi.nlm.nih.gov.

  7. http://www.ncbi.nlm.nih.gov/sutils/splign.

  8. https://www.ncbi.nlm.nih.gov/sutils/static/prosplign/prosplign.html.

  9. http://www.sing-group.org/BDBM/download.html.

  10. https://hub.docker.com/r/singgroup/bdbm/.

  11. https://www.xpra.org/.

  12. https://github.com/sing-group/BDBM.

  13. https://www.sing-group.org/BDBM/usecases.html.

  14. http://coffee-genome.org.

  15. http://www.sing-group.org/seda/.


Acknowledgements

This article is a result of the project Norte-01-0145-FEDER-000008—Porto Neurosciences and Neurologic Disease Research Initiative at I3S, supported by Norte Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement, through the European Regional Development Fund (FEDER). H. López-Fernández is supported by a post-doctoral fellowship from Xunta de Galicia (ED481B 2016/068-0). This work was partially funded by Consellería de Cultura, Educación e Ordenación Universitaria (Xunta de Galicia) and FEDER (European Union). SING group thanks CITI (Centro de Investigación, Transferencia e Innovación) from University of Vigo for hosting its IT infrastructure.

Author information

Corresponding author

Correspondence to Hugo López-Fernández.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 632 KB)


About this article


Cite this article

Vázquez, N., López-Fernández, H., Vieira, C.P. et al. BDBM 1.0: A Desktop Application for Efficient Retrieval and Processing of High-Quality Sequence Data and Application to the Identification of the Putative Coffea S-Locus. Interdiscip Sci Comput Life Sci 11, 57–67 (2019). https://doi.org/10.1007/s12539-019-00320-3


  • DOI: https://doi.org/10.1007/s12539-019-00320-3
