A co-training method based on entropy and multi-criteria
Applied Intelligence ( IF 5.3 ) Pub Date : 2020-11-10 , DOI: 10.1007/s10489-020-02014-6
Jia Lu , Yanlu Gong

Co-training is a branch of semi-supervised learning that improves classifier performance through the complementary effect of two views. In co-training algorithms, unlabeled data are usually selected by a high-confidence strategy: higher confidence in a prediction signifies higher expected accuracy. Unfortunately, the high-confidence selection strategy is not always effective at improving classifier performance. In this paper, a co-training method based on entropy and multiple criteria is proposed. First, the data set is divided by entropy into two views carrying the same amount of information. Then, a clustering criterion and a confidence criterion are adopted to select unlabeled data in view 1 and view 2, respectively. This addresses the problem that the high-confidence criterion is not always valid, and the differing selection criteria better exploit the complementary nature of co-training, so that each view supplements what the other lacks. In addition, the multi-criteria selection makes full use of the labeled data in order to pick more valuable unlabeled data. Experimental results on several UCI data sets and one artificial data set show the effectiveness of the proposed algorithm.
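The abstract's first step, splitting the data set by entropy into two views with the same amount of information, can be sketched as follows. This is not the authors' implementation; it is a minimal illustration assuming "same amount of information" means a greedy partition of the feature set that balances total per-feature Shannon entropy (features discretized into equal-width bins). The function names and the bin count are this sketch's own choices.

```python
import math
from collections import Counter

def feature_entropy(values, bins=4):
    # Discretize one numeric feature into equal-width bins,
    # then compute the Shannon entropy of the bin frequencies.
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # constant feature -> entropy 0
    labels = [min(int((v - lo) / width), bins - 1) for v in values]
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def split_views_by_entropy(X):
    """Greedily partition feature indices into two views whose total
    entropies are approximately equal (one plausible reading of the
    paper's entropy-based view split)."""
    n_features = len(X[0])
    entropies = [feature_entropy([row[j] for row in X])
                 for j in range(n_features)]
    # Largest-entropy-first greedy balancing, as in multiway partitioning.
    order = sorted(range(n_features), key=lambda j: entropies[j], reverse=True)
    view1, view2, h1, h2 = [], [], 0.0, 0.0
    for j in order:
        if h1 <= h2:            # assign each feature to the lighter view
            view1.append(j); h1 += entropies[j]
        else:
            view2.append(j); h2 += entropies[j]
    return sorted(view1), sorted(view2)
```

Each base classifier would then be trained on its own view's feature columns, with view 1's unlabeled data chosen by the clustering criterion and view 2's by the confidence criterion, per the method described above.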




Updated: 2020-11-12