A Novel Gene Selection Algorithm based on Sparse Representation and Minimum-redundancy Maximum-relevancy of Maximum Compatibility Center, Current Proteomics

Current Proteomics (IF 0.5), Pub Date: 2019-09-30, DOI: 10.2174/1570164616666190123144020
Min Chen, Yi Zhang, Zejun Li, Ang Li, Wenhua Liu, Liubin Liu, Zheng Chen

Background: Tumor classification is important for accurate diagnosis and personalized treatment and has recently received great attention. Analysis of gene expression profiles has shown relevant biological significance and has therefore become both a research hotspot and a new challenge for bio-data mining. Among existing methods, some algorithms identify a small number of genes but at high time complexity, while others have low time complexity but yield unsatisfactory classification accuracy. This article proposes a new feature extraction method for gene expression profiles.

Methods: We propose a classification method for tumor subtypes based on the Minimum-Redundancy Maximum-Relevancy (MRMR) of the maximum compatibility center. First, we performed fuzzy clustering of the gene expression profiles based on the compatibility relation. Next, we used the sparse representation coefficients to assess the importance of each gene for each category, extracted the top-ranked genes, and removed the uncorrelated genes. Finally, we used the MRMR search strategy to select the characteristic genes, reject the redundant genes, and obtain the final subset of characteristic genes.
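
The final MRMR step can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the common "relevance minus mean redundancy" form of MRMR, with mutual information estimated on equal-frequency discretized expression values. The names mutual_information, discretize, mrmr_select, and n_bins are hypothetical, and the fuzzy clustering and sparse representation ranking stages are not reproduced here.

```python
import numpy as np


def mutual_information(x, y):
    """Mutual information (in nats) between two discrete 1-D arrays."""
    x_vals, x_idx = np.unique(x, return_inverse=True)
    y_vals, y_idx = np.unique(y, return_inverse=True)
    # Joint frequency table of the two variables.
    joint = np.zeros((len(x_vals), len(y_vals)))
    np.add.at(joint, (x_idx, y_idx), 1)
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)  # marginal of x, shape (a, 1)
    py = joint.sum(axis=0, keepdims=True)  # marginal of y, shape (1, b)
    nz = joint > 0  # skip empty cells to avoid log(0)
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))


def discretize(X, n_bins=3):
    """Bin each gene (column of X) into n_bins equal-frequency levels.

    Assumption: three levels (low / medium / high expression).
    """
    quantiles = np.linspace(0, 1, n_bins + 1)[1:-1]
    edges = np.quantile(X, quantiles, axis=0)  # shape (n_bins - 1, n_genes)
    columns = [np.digitize(X[:, j], edges[:, j]) for j in range(X.shape[1])]
    return np.stack(columns, axis=1)


def mrmr_select(X, y, k):
    """Greedily pick k gene indices by relevance minus mean redundancy.

    X: (n_samples, n_genes) expression matrix; y: (n_samples,) class labels.
    """
    Xd = discretize(X)
    n_genes = Xd.shape[1]
    # Relevance: mutual information between each gene and the class label.
    relevance = np.array(
        [mutual_information(Xd[:, j], y) for j in range(n_genes)]
    )
    selected = [int(np.argmax(relevance))]
    redundancy_sum = np.zeros(n_genes)
    while len(selected) < k:
        last = selected[-1]
        # Add each gene's redundancy with the most recently selected gene.
        redundancy_sum += np.array(
            [mutual_information(Xd[:, j], Xd[:, last]) for j in range(n_genes)]
        )
        score = relevance - redundancy_sum / len(selected)
        score[selected] = -np.inf  # never pick the same gene twice
        selected.append(int(np.argmax(score)))
    return selected
```

Calling mrmr_select(X, y, k) on a samples-by-genes matrix returns k column indices in selection order. In this sketch, an exact copy of an already selected gene is penalized by its full entropy, so it is passed over in favor of less redundant candidates.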

Results: To verify its effectiveness, we tested our method against four others on four different datasets. The results show that our method outperforms the other methods in both classification accuracy and standard deviation.

Conclusion: Our proposed method is robust, adaptable, and superior in classification. It can help discover the susceptibility genes associated with complex diseases and clarify the interactions between these genes. Our technique provides a new way of thinking and is important for understanding the pathogenesis of complex diseases and for disease prevention, diagnosis, and treatment.

Updated: 2019-09-30
