
A co-training method based on entropy and multi-criteria


Abstract

Co-training is a branch of semi-supervised learning that improves classifier performance through the complementary effect of two views. Co-training algorithms typically select unlabeled data with a high-confidence strategy, on the grounds that higher confidence signifies higher prediction accuracy. Unfortunately, high-confidence selection is not always effective at improving classifier performance. In this paper, a co-training method based on entropy and multiple criteria is proposed. First, the feature set is divided by entropy into two views carrying the same amount of information. Then, a clustering criterion and a confidence criterion are used to select unlabeled data in view 1 and view 2, respectively, which addresses the cases where the high-confidence criterion fails. Using different selection criteria in the two views strengthens the complementary role of co-training, as each view supplements what the other lacks. In addition, the multi-criteria selection fully exploits the labeled data in order to choose more valuable unlabeled data. Experimental results on several UCI data sets and one artificial data set show the effectiveness of the proposed algorithm.
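
To make the pipeline described above concrete, the following is a minimal sketch of the two steps the abstract names: an entropy-balanced split of the feature set into two views, followed by a co-training loop that selects unlabeled points with a clustering criterion in view 1 and a confidence criterion in view 2. This is a sketch under stated assumptions, not the paper's exact method: the histogram entropy estimate, the k-means-based clustering criterion, GaussianNB as the base learner, and the rounds/per_round parameters are all illustrative choices.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.naive_bayes import GaussianNB

def feature_entropy(col, bins=10):
    # Shannon entropy of a single feature, estimated from a histogram.
    counts, _ = np.histogram(col, bins=bins)
    p = counts / counts.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def entropy_split(X):
    # Greedily assign features to two views with balanced total entropy.
    ent = np.array([feature_entropy(X[:, j]) for j in range(X.shape[1])])
    v1, v2, e1, e2 = [], [], 0.0, 0.0
    for j in np.argsort(ent)[::-1]:        # largest entropy first
        if e1 <= e2:                       # always fill the lighter view
            v1.append(j); e1 += ent[j]
        else:
            v2.append(j); e2 += ent[j]
    return np.array(v1), np.array(v2)

def co_train(X_l, y_l, X_u, rounds=10, per_round=5):
    v1, v2 = entropy_split(np.vstack([X_l, X_u]))
    h1 = GaussianNB().fit(X_l[:, v1], y_l)
    h2 = GaussianNB().fit(X_l[:, v2], y_l)
    for _ in range(rounds):
        if len(X_u) == 0:
            break
        # View 1, clustering criterion: take unlabeled points closest to
        # cluster centres fitted on the labeled data (illustrative choice).
        km = KMeans(n_clusters=len(np.unique(y_l)), n_init=10).fit(X_l[:, v1])
        dist = km.transform(X_u[:, v1]).min(axis=1)
        pick1 = np.argsort(dist)[:per_round]
        # View 2, confidence criterion: take the most confident predictions.
        conf = h2.predict_proba(X_u[:, v2]).max(axis=1)
        pick2 = np.argsort(conf)[::-1][:per_round]
        # Points selected by view 1 take h1's pseudo-label, the rest h2's.
        mask1 = np.isin(np.arange(len(X_u)), pick1)
        y_new = np.where(mask1, h1.predict(X_u[:, v1]), h2.predict(X_u[:, v2]))
        picked = np.unique(np.concatenate([pick1, pick2]))
        X_l = np.vstack([X_l, X_u[picked]])
        y_l = np.concatenate([y_l, y_new[picked]])
        X_u = np.delete(X_u, picked, axis=0)
        # Retrain both view classifiers on the enlarged labeled set.
        h1 = GaussianNB().fit(X_l[:, v1], y_l)
        h2 = GaussianNB().fit(X_l[:, v2], y_l)
    return h1, h2, v1, v2

A prediction on new data would then combine h1 and h2, for example by averaging their class probabilities over the respective views, which is one common way the complementary effect of the two views is realized at test time.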



Funding

This work is supported by Chongqing University Innovation Research Group Funding.

Author information


Corresponding author

Correspondence to Jia Lu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Lu, J., Gong, Y. A co-training method based on entropy and multi-criteria. Appl Intell 51, 3212–3225 (2021). https://doi.org/10.1007/s10489-020-02014-6

