GTX.Digest.VCF: an online NGS data interpretation system based on intelligent gene ranking and large-scale text mining.
BMC Medical Genomics (IF 2.1), Pub Date: 2019-12-20, DOI: 10.1186/s12920-019-0637-x
Yanhuang Jiang 1, Chengkun Wu 2, Yanghui Zhang 3, Shaowei Zhang 1, Shuojun Yu 1, Peng Lei 1, Qin Lu 1, Yanwei Xi 4, Hua Wang 3,5, Zhuo Song 1

BACKGROUND An important task in interpreting sequencing data for Mendelian diseases is to highlight pathogenic genes (or detrimental variants). This remains challenging despite the recent rapid development of genomics and bioinformatics. A typical interpretation workflow includes annotation, filtration, manual inspection and literature review. These steps are time-consuming and error-prone in the absence of systematic support. We therefore developed GTX.Digest.VCF, an online DNA sequencing interpretation system that prioritizes genes and variants for novel disease-gene relation discovery and integrates text mining results to provide literature evidence for each discovery. Its phenotype-driven ranking and biological data mining approach significantly speeds up the whole interpretation process.

RESULTS The GTX.Digest.VCF system is freely available for academic research as a web portal at http://vcf.gtxlab.com. Evaluation on the DDD project dataset demonstrates an accuracy of 77% (235 out of 305 cases) for top-50 genes and an accuracy of 41.6% (127 out of 305 cases) for top-5 genes.

CONCLUSIONS GTX.Digest.VCF provides an intelligent web portal for genomics data interpretation by integrating bioinformatics tools, distributed parallel computing, and biomedical text mining. It can facilitate the application of genomic analytics in clinical research and practice.
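
The top-50 and top-5 figures in RESULTS read as top-k hit rates: the share of cases in which the known causal gene appears among the first k ranked genes. The abstract does not spell out the exact evaluation procedure, so the following is only a minimal sketch under that assumption. The function name `top_k_accuracy`, the toy gene labels, and the toy rankings are hypothetical illustrations, not part of GTX.Digest.VCF or the DDD data; only the counts 235, 127 and 305 come from the abstract.

```python
def top_k_accuracy(ranked_genes_per_case, causal_gene_per_case, k):
    """Fraction of cases whose causal gene is among the first k ranked genes.

    Assumed interpretation of "top-k accuracy"; not taken from the paper's code.
    """
    hits = sum(
        1
        for ranked, causal in zip(ranked_genes_per_case, causal_gene_per_case)
        if causal in ranked[:k]
    )
    return hits / len(causal_gene_per_case)


# Hypothetical toy input: one ranked gene list per case, best candidate first.
toy_rankings = [
    ["GENE_A", "GENE_B", "GENE_C"],
    ["GENE_D", "GENE_E", "GENE_F"],
    ["GENE_G", "GENE_H", "GENE_I"],
]
# Hypothetical known causal gene for each case (the third is never ranked).
toy_causal = ["GENE_B", "GENE_F", "GENE_X"]

print(top_k_accuracy(toy_rankings, toy_causal, k=2))  # 1 of 3 cases hit
print(top_k_accuracy(toy_rankings, toy_causal, k=3))  # 2 of 3 cases hit

# Checking the percentages against the counts reported in the abstract.
print(f"top-50: {235 / 305:.1%}")  # 77.0%
print(f"top-5:  {127 / 305:.1%}")  # 41.6%
```

Under this reading, widening k can only keep or raise the hit rate, which is consistent with the top-50 figure being higher than the top-5 figure.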

Updated: 2019-12-20