BDBM 1.0: A Desktop Application for Efficient Retrieval and Processing of High-Quality Sequence Data and Application to the Identification of the Putative Coffea S-Locus.
Interdisciplinary Sciences: Computational Life Sciences (IF 4.8), Pub Date: 2019-02-04, DOI: 10.1007/s12539-019-00320-3
Noé Vázquez 1,2, Hugo López-Fernández 1,2,3,4,5, Cristina P Vieira 4,5, Florentino Fdez-Riverola 1,2,3, Jorge Vieira 4,5, Miguel Reboiro-Jato 1,2,3

Bioinformatics is now one of the most important areas of modern biology, and creating high-quality scientific software to support it is a core activity for many researchers. In this context, high-quality sequence datasets are needed to make inferences about the evolution of species, genes, and gene families, or to obtain evidence for adaptive amino acid evolution, among other uses. Nevertheless, sequence data are very often spread over several databases, many useful genomes and transcriptomes are not annotated, and the available annotation may not cover the desired coding sequence isoform and/or is unlikely to be accurate. Moreover, although the text-based FASTA format is simple and usable by most software applications, a number of issues with it may be critical depending on the software used to analyse such files. As a result, researchers without training in informatics often use only a fraction of the available data. These issues can be addressed with existing software applications, but no single easy-to-use piece of software allows all of these tasks to be performed within the same graphical interface. The application presented here, BDBM (Blast DataBase Manager), does so. BDBM can be used to efficiently retrieve gene sequences from annotated and non-annotated genomes and transcriptomes. It can also be used to look for alternatives to existing annotations and to easily create reliable custom databases, which are essential for preparing high-quality datasets. Our analyses of the Coffea canephora genome using BDBM, aimed at identifying the S-locus region (which harbours the genes involved in gametophytic self-incompatibility), led to the conclusion that there are two likely regions: one on chromosome 2 (around region 6600000-6650000) and another on chromosome 5 (around 15830000-15930000). These findings are discussed in the context of the evolution of gametophytic self-incompatibility in the Rubiaceae.

Updated: 2019-11-01