An embedded method for gene identification problems involving unwanted data heterogeneity.
Human Genomics ( IF 4.5 ) Pub Date : 2019-10-22 , DOI: 10.1186/s40246-019-0228-0
Meng Lu 1

BACKGROUND: Modern applications such as bioinformatics collect data in various ways, which can easily produce heterogeneous data. Traditional variable selection methods assume that samples are independent and identically distributed, an assumption that does not hold for these applications. Some existing statistical models that handle unwanted variation were developed for gene identification involving heterogeneous data, but they lack model predictability and suffer from variable redundancy.

RESULTS: By effectively accounting for the unwanted heterogeneity, our method has shown its superiority over several state-of-the-art methods, as validated by experimental results in both unsupervised and supervised gene identification problems. We also applied our method to a pan-cancer study, where it can identify the most discriminative genes for distinguishing different cancer types.

CONCLUSIONS: This article provides an alternative gene identification method that can account for unwanted data heterogeneity. It is a promising method for providing new insights into complex cancer biology and clues for understanding tumorigenesis and tumor progression.

Updated: 2020-04-22