Classification of histogram-valued data with support histogram machines
Journal of Applied Statistics (IF 1.2), Pub Date: 2021-07-01, DOI: 10.1080/02664763.2021.1947996
Ilsuk Kang, Cheolwoo Park, Young Joo Yoon, Changyi Park, Soon-Sun Kwon, Hosik Choi

Large data volumes and advanced technologies have produced new types of complex data, such as histogram-valued data. The paper focuses on classification problems in which predictors are observed as, or aggregated into, histograms. Because conventional classification methods take vectors as input, a natural approach converts histograms into vector-valued data using summary values, such as the mean or median. However, this approach discards the distributional information available in histograms. To address this issue, we propose a margin-based classifier called the support histogram machine (SHM) for histogram-valued data. We adopt the support vector machine framework and use the Wasserstein-Kantorovich metric to measure distances between histograms. The proposed optimization problem is solved by a dual approach. We then test the proposed SHM on simulated and real examples and demonstrate that it outperforms summary-value-based methods.
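
The point about summary values losing distributional information can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes each histogram's mass sits at its bin centers, that both histograms share the same sorted bin centers, and it uses the order-1 Wasserstein distance for simplicity (the abstract does not state which order the authors use). The names `histogram_mean`, `wasserstein1`, `centers`, `peaked`, and `spread` are hypothetical.

```python
import numpy as np


def histogram_mean(centers, weights):
    """Mean of a histogram, treating each bin's mass as sitting at its center."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize to a probability vector
    return float(np.dot(np.asarray(centers, dtype=float), w))


def wasserstein1(centers, weights_a, weights_b):
    """Order-1 Wasserstein distance between two histograms on shared, sorted bin centers.

    In one dimension this equals the area between the two cumulative
    distribution functions.
    """
    x = np.asarray(centers, dtype=float)
    a = np.asarray(weights_a, dtype=float)
    b = np.asarray(weights_b, dtype=float)
    a = a / a.sum()
    b = b / b.sum()
    # CDF gap on each interval between consecutive bin centers
    cdf_gap = np.abs(np.cumsum(a) - np.cumsum(b))[:-1]
    return float(np.sum(cdf_gap * np.diff(x)))


# Two toy histograms with the same mean but very different shapes (assumed data).
centers = [0.0, 1.0, 2.0, 3.0, 4.0]
peaked = [0.0, 0.0, 1.0, 0.0, 0.0]   # all mass in the middle bin
spread = [0.5, 0.0, 0.0, 0.0, 0.5]   # mass split between the two extremes

mean_gap = abs(histogram_mean(centers, peaked) - histogram_mean(centers, spread))
w1 = wasserstein1(centers, peaked, spread)
print(mean_gap, w1)
```

Here the mean-based summary cannot tell the two histograms apart, while the Wasserstein distance is strictly positive. That is the kind of information a histogram-aware classifier can exploit and a summary-value-based one cannot.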




Updated: 2021-07-01