gFACs: Gene Filtering, Analysis, and Conversion to Unify Genome Annotations Across Alignment and Gene Prediction Frameworks.
Genomics, Proteomics & Bioinformatics (IF 11.5), Pub Date: 2019-08-19, DOI: 10.1016/j.gpb.2019.04.002
Madison Caballero, Jill Wegrzyn

Published genomes frequently contain erroneous gene models, reflecting problems in the identification of open reading frames, start sites, splice sites, and related structural features. These inconsistencies can often be traced to the integration of text file formats designed to describe long-read alignments and predicted gene structures. In addition, most gene prediction frameworks neither provide robust downstream filtering to remove problematic gene annotations nor represent these annotations in a format consistent with current file standards. These frameworks also do not consider functional attributes, such as the presence or absence of protein domains, that can be used to validate gene models. To provide oversight for the increasing number of published genome annotations, we present gFACs (Gene Filtering, Analysis, and Conversion), a software package to filter, analyze, and convert predicted gene models and alignments. The software operates on a wide range of alignment, analysis, and gene prediction files and offers a flexible framework for defining gene models with reliable structural and functional attributes. gFACs supports common downstream applications, including genome browsers, and reports extensive details on the filtering process, including distributions that can be visualized to further assess the proposed gene space. gFACs is freely available and implemented in Perl with support from BioPerl libraries at https://gitlab.com/PlantGenomicsLab/gFACs.
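
To make the idea of structural filtering concrete, the following is a minimal Python sketch of how gene models might be screened by exon and intron properties. It is an illustration only and is not gFACs code (gFACs itself is written in Perl). The function names, the thresholds (`min_exon_len`, `min_intron_len`, `allow_mono_exonic`), and the example records are all assumptions made for this sketch; the input is a simplified GFF3-style table in which each exon row names its transcript through a `Parent` attribute.

```python
from collections import defaultdict

# Hypothetical example input: simplified GFF3-style exon records (1-based, inclusive).
EXAMPLE_GFF = "\n".join(
    "\t".join(fields)
    for fields in [
        ("chr1", "demo", "exon", "100", "200", ".", "+", ".", "Parent=t1"),
        ("chr1", "demo", "exon", "300", "400", ".", "+", ".", "Parent=t1"),
        ("chr1", "demo", "exon", "500", "900", ".", "+", ".", "Parent=t2"),
        ("chr1", "demo", "exon", "1000", "1010", ".", "+", ".", "Parent=t3"),
        ("chr1", "demo", "exon", "1100", "1200", ".", "+", ".", "Parent=t3"),
        ("chr1", "demo", "exon", "2000", "2100", ".", "+", ".", "Parent=t4"),
        ("chr1", "demo", "exon", "2105", "2200", ".", "+", ".", "Parent=t4"),
    ]
)


def parse_exons(gff_text):
    """Group exon intervals by their Parent attribute and sort them by start."""
    models = defaultdict(list)
    for line in gff_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        cols = line.split("\t")
        if len(cols) != 9 or cols[2] != "exon":
            continue  # only exon rows are used in this sketch
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        parent = attrs.get("Parent")
        if parent is not None:
            models[parent].append((int(cols[3]), int(cols[4])))
    return {name: sorted(exons) for name, exons in models.items()}


def passes(exons, min_exon_len=20, min_intron_len=10, allow_mono_exonic=False):
    """Return True if a model meets the (assumed) structural thresholds."""
    if not allow_mono_exonic and len(exons) < 2:
        return False
    if any(end - start + 1 < min_exon_len for start, end in exons):
        return False
    # Intron length is the gap between consecutive exons.
    introns = [nxt[0] - cur[1] - 1 for cur, nxt in zip(exons, exons[1:])]
    return all(length >= min_intron_len for length in introns)


def filter_models(gff_text, **thresholds):
    """Keep only the gene models that pass every structural check."""
    models = parse_exons(gff_text)
    return {name: exons for name, exons in models.items() if passes(exons, **thresholds)}


if __name__ == "__main__":
    print(filter_models(EXAMPLE_GFF))
```

With the default thresholds, only the model with two adequately sized exons and a long enough intron is retained; passing `allow_mono_exonic=True` additionally keeps the single-exon model. A real pipeline would also check start and stop codons, splice sites, and functional evidence such as protein domains, which this sketch leaves out.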

Updated: 2019-11-01