Cancer Diagnosis and Disease Gene Identification via Statistical Machine Learning
Current Bioinformatics (IF 2.4), Pub Date: 2020-10-31, DOI: 10.2174/1574893615666200207094947
Liuyuan Chen 1 , Juntao Li 2 , Mingming Chang 2

Diagnosing cancer and identifying disease genes from DNA microarray gene expression data are hot topics in current bioinformatics. This paper is devoted to the latest developments in cancer diagnosis and gene selection via statistical machine learning. The support vector machine is first introduced for binary cancer diagnosis. Then the 1-norm support vector machine, the doubly regularized support vector machine, the adaptive huberized support vector machine, and other extensions are presented to improve gene selection performance. Lasso, elastic net, partly adaptive elastic net, group lasso, sparse group lasso, adaptive sparse group lasso, and other sparse regression methods are also introduced for simultaneous binary cancer classification and gene selection. In addition to three strategies for reducing multiclass problems to binary ones, methods that consider all classes directly in a single learning model (multi-class support vector machine, sparse multinomial regression, adaptive multinomial regression, and so on) are presented for multiclass cancer diagnosis. Limitations and promising directions are also discussed.
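To make the idea of simultaneous classification and gene selection concrete, the following is a minimal sketch of lasso-penalized logistic regression fitted by proximal gradient descent. It is an illustration only, not code from the paper: the function names, the toy data, the step size, and the omission of an intercept (which assumes centered features and balanced classes) are all assumptions made for this example.

```python
import math


def soft_threshold(z, t):
    """Proximal operator of t * |z|: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def sigmoid(a):
    """Numerically stable logistic function."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)


def fit_lasso_logistic(X, y, lam, step=0.5, n_iter=500):
    """L1-penalized logistic regression via proximal gradient (ISTA).

    X: list of samples, each a list of expression values (assumed centered).
    y: list of 0/1 labels. No intercept is fitted in this sketch.
    Returns one coefficient per gene; exact zeros mean "not selected".
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        # Gradient of the average logistic loss at the current w.
        grad = [0.0] * p
        for xi, yi in zip(X, y):
            r = sigmoid(sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(p):
                grad[j] += r * xi[j] / n
        # Gradient step followed by soft thresholding (the L1 proximal step).
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad)]
    return w


def selected_genes(w):
    """Indices of genes with nonzero coefficients."""
    return [j for j, wj in enumerate(w) if wj != 0.0]


def predict(X, w):
    """Predict class 1 when the linear score is positive."""
    return [1 if sum(wj * xj for wj, xj in zip(w, xi)) > 0 else 0 for xi in X]


# Hypothetical toy data: 4 samples x 3 genes. Gene 0 separates the classes,
# gene 1 shows no class difference, gene 2 is constant.
X = [[1.0, 0.5, 0.0], [1.0, -0.5, 0.0], [-1.0, 0.5, 0.0], [-1.0, -0.5, 0.0]]
y = [1, 1, 0, 0]
w = fit_lasso_logistic(X, y, lam=0.1)
print(selected_genes(w))  # [0]
print(predict(X, w))      # [1, 1, 0, 0]
```

The soft-thresholding step is what sets uninformative coefficients exactly to zero, so one fitted model yields both a classifier and a gene list. A larger `lam` selects fewer genes, and a sufficiently large value selects none.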




Updated: 2020-10-31