
Rotation forest based on multimodal genetic algorithm

Published in Journal of Central South University.

Abstract

In machine learning, randomness is a crucial ingredient in the success of ensemble learning, and it can be injected into tree-based ensembles by rotating the feature space. In common practice, however, the feature space is rotated purely at random, so a large number of trees is required to guarantee the performance of the ensemble model. This random rotation approach is theoretically sound, but its demand for massive computing resources can restrict its applications. To address this problem, a multimodal genetic algorithm based rotation forest (MGARF) algorithm is proposed in this paper. It is a tree-based ensemble learning algorithm for classification that injects randomness through feature rotation and then selects a subset of more diverse and accurate base learners using a multimodal optimization method. The classification accuracy of the proposed MGARF algorithm was evaluated against the original random forest and random rotation ensemble methods on 23 UCI classification datasets. Experimental results show that the MGARF method outperforms the other methods while using far fewer base learners.
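
The random-rotation mechanism the abstract builds on is simple to sketch: draw one random orthogonal matrix per tree, train each tree on its own rotated copy of the feature space, and combine the trees by majority vote. The Python sketch below is illustrative only and assumes integer-encoded class labels; the class name RandomRotationForest and every detail in it are invented for this example, not taken from the paper's MGARF implementation.

```python
# Minimal sketch of a random-rotation tree ensemble (illustrative only,
# not the paper's MGARF implementation).
import numpy as np
from scipy.stats import ortho_group, mode
from sklearn.tree import DecisionTreeClassifier

class RandomRotationForest:
    """Each tree trains in its own randomly rotated feature space."""

    def __init__(self, n_trees=100, seed=0):
        self.n_trees = n_trees
        self.rng = np.random.default_rng(seed)
        self.rotations = []  # one orthogonal matrix per tree
        self.trees = []

    def fit(self, X, y):
        d = X.shape[1]
        for _ in range(self.n_trees):
            # Rotation drawn uniformly (Haar measure) from the orthogonal group.
            R = ortho_group.rvs(d, random_state=int(self.rng.integers(2**31)))
            self.rotations.append(R)
            self.trees.append(DecisionTreeClassifier().fit(X @ R, y))
        return self

    def predict(self, X):
        # Majority vote over the rotated-space trees (integer labels assumed).
        votes = np.stack([tree.predict(X @ R)
                          for tree, R in zip(self.trees, self.rotations)])
        return mode(votes, axis=0, keepdims=False).mode
```

Because each rotation is drawn independently at random, many trees are needed before the rotation diversity pays off, which is exactly the computational cost the abstract says MGARF is designed to cut.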

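The second ingredient, selecting a more diverse and accurate subset of base learners with a multimodal (niching) genetic algorithm, can likewise be sketched. The snippet below evolves binary masks over a pool of pre-trained trees, scoring each mask by the majority-vote accuracy of the masked ensemble on a validation split and applying fitness sharing so the population keeps several distinct good subsets alive. All helper names are hypothetical, and the procedure is a simplified stand-in for, not a reproduction of, the MGARF selection step described in the paper.

```python
# Simplified niching-GA subset selection over a pool of trained trees
# (hypothetical helper names; a stand-in for the MGARF selection step).
import numpy as np

def ensemble_accuracy(mask, tree_preds, y_val):
    """Majority-vote accuracy of the trees selected by a binary mask."""
    sel = tree_preds[mask.astype(bool)]          # (k, n_val) selected predictions
    if sel.shape[0] == 0:
        return 0.0
    # Per-sample majority vote (labels assumed integer-encoded).
    vote = np.apply_along_axis(lambda c: np.bincount(c).argmax(), 0, sel)
    return float(np.mean(vote == y_val))

def shared_fitness(pop, raw, sigma=0.3):
    """Fitness sharing: penalise individuals crowded into the same niche."""
    d = (pop[:, None, :] != pop[None, :, :]).mean(-1)  # pairwise Hamming distance
    share = np.maximum(0.0, 1.0 - d / sigma).sum(1)    # niche counts (>= 1)
    return raw / share

def select_subset(tree_preds, y_val, pop_size=40, gens=50, rng=None):
    rng = rng or np.random.default_rng(0)
    n_trees = tree_preds.shape[0]
    pop = (rng.random((pop_size, n_trees)) < 0.5).astype(np.int8)
    for _ in range(gens):
        raw = np.array([ensemble_accuracy(m, tree_preds, y_val) for m in pop])
        fit = shared_fitness(pop, raw)
        # Tournament selection, uniform crossover, bit-flip mutation.
        idx = np.array([max(rng.choice(pop_size, 2), key=lambda i: fit[i])
                        for _ in range(pop_size)])
        parents = pop[idx]
        cross = rng.random(pop.shape) < 0.5
        children = np.where(cross, parents, parents[rng.permutation(pop_size)])
        flip = rng.random(pop.shape) < (1.0 / n_trees)
        pop = np.where(flip, 1 - children, children).astype(np.int8)
    raw = np.array([ensemble_accuracy(m, tree_preds, y_val) for m in pop])
    return pop[raw.argmax()]                     # best mask found
```

In this setting, tree_preds would hold each candidate tree's predictions on a held-out validation split, and the returned mask picks the trees kept in the final, much smaller ensemble.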



Author information

Correspondence to Zhe Xu (徐喆).

Additional information

Foundation item

Project (61603274) supported by the National Natural Science Foundation of China; Project (2017KJ249) supported by the Research Project of Tianjin Municipal Education Commission, China

Contributors

XU Zhe provided the concept, performed the experiments, and wrote the manuscript. NI Wei-chen helped with the data analysis and revision of the manuscript. JI Yue-hui contributed to manuscript preparation and the revision of the manuscript.

Conflict of interest

XU Zhe, NI Wei-chen and JI Yue-hui declare that they have no conflict of interest.

About this article

Cite this article

Xu, Z., Ni, Wc. & Ji, Yh. Rotation forest based on multimodal genetic algorithm. J. Cent. South Univ. 28, 1747–1764 (2021). https://doi.org/10.1007/s11771-021-4730-x

