Classification from positive and unlabeled data based on likelihood invariance for measurement
Intelligent Data Analysis (IF 0.9), Pub Date: 2021-01-26, DOI: 10.3233/ida-194980
Takeshi Yoshida, Takashi Washio, Takahito Ohshiro, Masateru Taniguchi

Abstract

We propose novel approaches for classification from positive and unlabeled data (PUC) based on the maximum likelihood principle. They are particularly suited to measurement tasks in which the class prior of the target object in each measurement is unknown and differs significantly from the class prior used for training, while the likelihood function representing the observation process is invariant across the training and measurement stages. Our PUCs work effectively without estimating the class priors of the unlabeled objects. First, we present a PUC approach called Naive Likelihood PUC (NL-PUC), which applies the maximum likelihood principle in a nontrivial but rather straightforward manner. The extended version, called Enhanced Likelihood PUC (EL-PUC), employs an algorithm that iteratively improves the likelihood estimation of the positive class. This is advantageous when the availability of labeled positive data is limited. These characteristics are demonstrated both theoretically and experimentally. Moreover, the practicality of our PUCs is demonstrated in a real application to single-molecule measurement.
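
To make the role of likelihood invariance concrete, the following minimal sketch contrasts a maximum likelihood decision rule, which uses no class prior, with a posterior (MAP) rule, whose output changes when the class prior shifts between training and measurement. It is not the authors' NL-PUC or EL-PUC algorithm: the Gaussian likelihoods, their parameters, and the function names `lik_pos`, `lik_neg`, `ml_label`, and `map_label` are all assumptions made for illustration, and in a real positive-unlabeled setting the negative-class likelihood is not given directly but has to be estimated.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


# Hypothetical class-conditional likelihoods, assumed to be identical in the
# training and measurement stages (the invariance the abstract relies on).
def lik_pos(x):
    return gaussian_pdf(x, mean=2.0, std=1.0)


def lik_neg(x):
    return gaussian_pdf(x, mean=0.0, std=1.0)


def ml_label(x):
    """Maximum likelihood rule: compares likelihoods, uses no class prior."""
    return 1 if lik_pos(x) > lik_neg(x) else -1


def map_label(x, prior_pos):
    """Posterior (MAP) rule: weights each likelihood by an assumed class prior."""
    return 1 if prior_pos * lik_pos(x) > (1.0 - prior_pos) * lik_neg(x) else -1


x = 1.5
print("ML label:", ml_label(x))
print("MAP label, prior 0.5:", map_label(x, 0.5))
print("MAP label, prior 0.05:", map_label(x, 0.05))
```

In this toy setting the maximum likelihood label for `x = 1.5` does not depend on the class prior, while the MAP label flips when the assumed positive prior drops from 0.5 to 0.05. This illustrates why a likelihood-based rule can remain usable when the measurement-stage prior is unknown.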




Updated: 2021-02-03