An efficient gene selection method for microarray data based on LASSO and BPSO.
BMC Bioinformatics ( IF 2.9 ) Pub Date : 2019-12-30 , DOI: 10.1186/s12859-019-3228-0
Ying Xiong 1, 2, 3 , Qing-Hua Ling 4 , Fei Han 1, 2 , Qing-Hua Liu 4

BACKGROUND: The main goal of gene selection for microarray data is to find compact, predictive gene subsets that improve classification accuracy. Although many methods are available, selecting the optimal gene subset for accurate classification remains very challenging in the diagnosis and treatment of cancer.

RESULTS: To obtain the most predictive gene subsets without filtering out critical genes, this paper proposes a gene selection method based on the least absolute shrinkage and selection operator (LASSO) and an improved binary particle swarm optimization (BPSO). To avoid overfitting of LASSO, the initial gene pool is divided into clusters based on their structure. LASSO is then employed to select highly predictive genes and to calculate a contribution value that indicates each gene's sensitivity to the samples' classes. With the second-level gene pool established by the double filter strategy, the BPSO is improved by encoding the contribution information obtained from LASSO to perform gene selection. Moreover, from the perspective of the bit change probability, a new mapping function is defined to guide particle updates toward the more predictive genes in the improved BPSO.

CONCLUSIONS: With the compact gene pool obtained by the double filter strategy, the improved BPSO can select the optimal gene subsets with high probability. Experimental results on several public microarray data sets, using an extreme learning machine as the classifier, verify the effectiveness of the proposed method compared with related methods.
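The BPSO stage can be illustrated with a minimal sketch of standard binary particle swarm optimization for feature selection. This is not the paper's improved BPSO: the abstract does not give the new mapping function, so the standard sigmoid transfer function is used instead. The `toy_fitness` function is a hypothetical stand-in for classifier accuracy, and using `contribution` values to bias the initial bit probabilities is only one assumed way of encoding LASSO-derived information. All names and parameter values here are illustrative.

```python
import math
import random


def sigmoid(v):
    """Standard BPSO transfer function: maps a velocity to P(bit = 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def bpso(fitness, n_bits, n_particles=20, n_iters=50,
         w=0.7, c1=1.5, c2=1.5, v_max=4.0, init_prob=None, seed=0):
    """Maximize `fitness` over bit vectors of length `n_bits`.

    `init_prob[j]` is the probability that bit j starts at 1. Passing
    contribution-like scores here is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    if init_prob is None:
        init_prob = [0.5] * n_bits
    pos = [[1 if rng.random() < init_prob[j] else 0 for j in range(n_bits)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_bits for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for j in range(n_bits):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][j]
                     + c1 * r1 * (pbest[i][j] - pos[i][j])
                     + c2 * r2 * (gbest[j] - pos[i][j]))
                # Clamp the velocity so the bit probability never saturates.
                vel[i][j] = max(-v_max, min(v_max, v))
                pos[i][j] = 1 if rng.random() < sigmoid(vel[i][j]) else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


# Hypothetical toy problem: genes 0, 3 and 7 are "informative".
INFORMATIVE = {0, 3, 7}


def toy_fitness(bits):
    """Reward informative genes and penalize subset size (stand-in for accuracy)."""
    hits = sum(1 for j, b in enumerate(bits) if b and j in INFORMATIVE)
    return hits - 0.1 * sum(bits)


# Hypothetical contribution scores in [0, 1], used only to bias initialization.
contribution = [0.9, 0.2, 0.1, 0.8, 0.3, 0.2, 0.1, 0.7, 0.2, 0.1]
best, score = bpso(toy_fitness, n_bits=10, init_prob=contribution)
selected = [j for j, b in enumerate(best) if b]
print("selected genes:", selected, "fitness:", score)
```

In a real pipeline the fitness would typically be a cross-validated classification score on the selected genes, which is far more expensive to evaluate than this toy function.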

Updated: 2019-12-30