Sequential Classification With Empirically Observed Statistics
IEEE Transactions on Information Theory (IF 2.2), Pub Date: 2021-02-13, DOI: 10.1109/tit.2021.3059272
Mahdi Haghifam, Vincent Y. F. Tan, Ashish Khisti

Motivated by real-world machine learning applications, we consider a statistical classification task in which test samples arrive sequentially. The generating distributions are unknown, and only a set of empirically sampled training sequences is available to the decision maker. The decision maker must classify a test sequence that is known to have been generated according to one of these distributions. In the binary case, the decision maker wishes to classify using as few test samples as possible, so at each step she either declares that hypothesis 1 is true, declares that hypothesis 2 is true, or requests an additional test sample. We propose a classifier and analyze its type-I and type-II error probabilities. We demonstrate the significant advantage of our sequential scheme over an existing non-sequential classifier proposed by Gutman. Finally, we extend our setup and results to the multi-class classification scenario and again demonstrate that the variable-length nature of the problem affords significant advantages: one can achieve the same set of error exponents as in Gutman's fixed-length setting, but without requiring a rejection option.
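The three-way decision at each step (declare hypothesis 1, declare hypothesis 2, or request another sample) can be illustrated with a minimal Python sketch. It is not the classifier proposed in the paper. It assumes a finite alphabet `0..alphabet_size-1`, uses a generalized Jensen-Shannon divergence between training and test empirical distributions as the statistic (an assumption, chosen because it is the statistic commonly associated with Gutman-style tests), and applies a hypothetical stopping rule with a user-chosen `threshold`. All function names are illustrative.

```python
import math
from collections import Counter


def empirical_dist(seq, alphabet_size):
    """Empirical distribution (type) of seq over symbols 0..alphabet_size-1."""
    counts = Counter(seq)
    return [counts.get(a, 0) / len(seq) for a in range(alphabet_size)]


def kl(p, q):
    """KL divergence D(p || q) in nats; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def gjs(p, q, alpha):
    """Generalized Jensen-Shannon divergence with weight alpha (assumed form)."""
    m = [(alpha * pi + qi) / (1 + alpha) for pi, qi in zip(p, q)]
    return alpha * kl(p, m) + kl(q, m)


def sequential_classify(train1, train2, test_stream, alphabet_size, threshold):
    """Return (decision, samples_used); decision is 1, 2, or None.

    Hypothetical rule: stop once the test samples look sufficiently unlike
    one training sequence, and declare the other hypothesis.
    """
    p1 = empirical_dist(train1, alphabet_size)
    p2 = empirical_dist(train2, alphabet_size)
    seen = []
    for sample in test_stream:
        seen.append(sample)
        n = len(seen)
        q = empirical_dist(seen, alphabet_size)
        g1 = gjs(p1, q, len(train1) / n)
        g2 = gjs(p2, q, len(train2) / n)
        if n * g2 > threshold and g1 < g2:
            return 1, n  # test samples are far from training sequence 2
        if n * g1 > threshold and g2 < g1:
            return 2, n  # test samples are far from training sequence 1
        # otherwise: request an additional test sample
    return None, len(seen)  # stream exhausted without a decision
```

A larger `threshold` in this sketch trades more test samples for fewer early, possibly wrong, decisions; how thresholds relate to the error exponents analyzed in the paper is not captured here.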

Updated: 2021-02-13