A Survey on Semi-supervised Learning for Delayed Partially Labelled Data Streams
ACM Computing Surveys (IF 16.6), Pub Date: 2022-11-21, DOI: 10.1145/3523055
Heitor Murilo Gomes 1 , Maciej Grzenda 2 , Rodrigo Mello 3 , Jesse Read 4 , Minh Huong Le Nguyen 5 , Albert Bifet 1

Unlabelled data appear in many domains and are particularly relevant to streaming applications, where data are abundant but labelled data are rare. To address the learning problems associated with such data, one can ignore the unlabelled data and focus only on the labelled data (supervised learning); use the labelled data and attempt to leverage the unlabelled data (semi-supervised learning); or assume some labels will be available on request (active learning). The first approach is the simplest, yet the amount of labelled data available will limit the predictive performance. The second relies on finding and exploiting the underlying characteristics of the data distribution. The third depends on an external agent to provide the required labels in a timely fashion. This survey pays special attention to methods that leverage unlabelled data in a semi-supervised setting. We also discuss the delayed labelling issue, which impacts both fully supervised and semi-supervised methods. We propose a unified problem setting, discuss the learning guarantees and existing methods, and explain the differences between related problem settings. Finally, we review the current benchmarking practices and propose adaptations to enhance them.
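
To make the setting concrete, the following is a minimal sketch of a delayed, partially labelled stream, and not a method taken from the survey. It simulates the first approach described above (supervised learning that ignores unlabelled instances) under a fixed label delay. Everything in it is an illustrative assumption: the synthetic one-dimensional data, the `NearestCentroid` learner, the `run_stream` function, and the parameters `delay` and `label_prob`.

```python
import random
from collections import deque


class NearestCentroid:
    """Tiny incremental classifier: one running mean per class (1-D features)."""

    def __init__(self):
        self.sums = {}
        self.counts = {}

    def predict(self, x):
        if not self.counts:
            return None  # no labelled instance has been seen yet
        return min(self.counts,
                   key=lambda c: abs(x - self.sums[c] / self.counts[c]))

    def learn(self, x, y):
        self.sums[y] = self.sums.get(y, 0.0) + x
        self.counts[y] = self.counts.get(y, 0) + 1


def run_stream(n=2000, delay=50, label_prob=0.1, seed=0):
    """Simulate a stream where only a fraction of labels arrive, `delay` steps late.

    Returns (accuracy over predicted instances, number of labels used).
    Accuracy is computed against the simulator's ground truth, which a
    real deployment would not have for the unlabelled instances.
    """
    rng = random.Random(seed)
    model = NearestCentroid()
    pending = deque()  # entries: (time the label becomes available, x, y)
    correct = evaluated = labels_used = 0
    for t in range(n):
        # Train on every label whose delay has elapsed.
        while pending and pending[0][0] <= t:
            _, x_old, y_old = pending.popleft()
            model.learn(x_old, y_old)
            labels_used += 1
        # Hypothetical synthetic data: class 0 centred at 0, class 1 at 2.
        y = rng.randint(0, 1)
        x = rng.gauss(2.0 * y, 1.0)
        # Predict as soon as the instance arrives, before any label is known.
        y_hat = model.predict(x)
        if y_hat is not None:
            evaluated += 1
            correct += (y_hat == y)
        # Only some instances are ever labelled; the rest are discarded here.
        if rng.random() < label_prob:
            pending.append((t + delay, x, y))
    return correct / max(evaluated, 1), labels_used


if __name__ == "__main__":
    accuracy, used = run_stream()
    print(f"accuracy={accuracy:.3f}, labels used={used}")
```

A semi-supervised variant would replace the point where unlabelled instances are discarded with some use of them, for example to refine the class centroids; that part is deliberately left out of this sketch.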




Updated: 2022-11-21