A Method to Extract Feature Variables Contributed in Nonlinear Machine Learning Prediction.
Methods of Information in Medicine ( IF 1.7 ) Pub Date : 2020-05-07 , DOI: 10.1055/s-0040-1701615
Mayumi Suzuki 1 , Takuma Shibahara 1 , Yoshihiro Muragaki 2

Abstract

Background Although new machine learning methods, such as support vector machines and deep neural networks, have improved prediction accuracy, they produce nonlinear models and thus lack the ability to explain the basis of their predictions. Improving their explanatory capabilities would increase the reliability of their predictions.

Objective Our objective was to develop a factor analysis technique that enables the presentation of the feature variables used in making predictions, even in nonlinear machine learning models.

Methods The factor analysis technique consists of two techniques: a backward analysis technique and a factor extraction technique. The backward analysis technique calculates the posterior probability distribution of a machine learning model, and the factor extraction technique that we developed extracts feature variables from that distribution.

Results In an evaluation using gene expression data from prostate tumor patients and healthy subjects, the prediction accuracy of a deep neural network model was approximately 5% better than that of a support vector machine model. The rate of concordance between the feature variables extracted in an earlier report using Jensen–Shannon divergence and those extracted in this report using backward elimination with the Hilbert–Schmidt independence criterion was 40% for the top five variables, 40% for the top 10, and 49% for the top 100.
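
As an illustration of the concordance figures above, the following is a minimal sketch of one plausible way to compute a top-k concordance rate between two feature rankings. The abstract does not define the rate, so the definition used here (the share of the top k variables common to both rankings) is an assumption, and the function name and gene identifiers are hypothetical.

```python
def concordance_rate(ranking_a, ranking_b, k):
    """Share of the top-k entries that two ranked feature lists have in common.

    Assumed definition: |top_k(a) & top_k(b)| / k. The paper may define
    concordance differently.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_a = set(ranking_a[:k])
    top_b = set(ranking_b[:k])
    return len(top_a & top_b) / k


# Hypothetical rankings (most important feature first); not data from the paper.
js_ranking = ["g1", "g2", "g3", "g4", "g5", "g6"]
hsic_ranking = ["g2", "g7", "g1", "g8", "g9", "g3"]

rate_top5 = concordance_rate(js_ranking, hsic_ranking, 5)
print(f"top-5 concordance: {rate_top5:.0%}")
```

With these made-up lists, two of the top five variables coincide, so the sketch reports 40%. A low concordance under this definition means that the two extraction techniques emphasize different variables, which is consistent with the point that they evaluate a model from different viewpoints.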

Conclusion The results showed that models can be evaluated from different viewpoints by using different factor extraction techniques. In the future, we hope to use this technique to verify the characteristics of the features extracted by each factor extraction technique, and to perform clinical studies using the genes extracted in this experiment.

Authors' Contributions

All persons who meet authorship criteria are listed as authors, and all authors certify that they have participated sufficiently in the work to take public responsibility for the content, including participation in the concept, design, analysis, writing, or revision of the manuscript. M.S. and T.S., in particular, contributed equally to this work.




Updated: 2020-05-07