A fingerprint of a heterogeneous data set
Advances in Data Analysis and Classification (IF 1.4), Pub Date: 2021-07-03, DOI: 10.1007/s11634-021-00452-9
Matteo Spallanzani 1 , Gueorgui Mihaylov 2, 3 , Marco Prato 4 , Roberto Fontana 5

In this paper, we describe the fingerprint method, a technique to classify bags of mixed-type measurements. The method was designed to solve a real-world industrial problem: classifying industrial plants (individuals at a higher level of organisation) starting from the measurements collected from their production lines (individuals at a lower level of organisation). In this specific application, the categorical information attached to the numerical measurements induced simple mixture-like structures on the global multivariate distributions associated with different classes. The fingerprint method is designed to compare the mixture components of a given test bag with the corresponding mixture components associated with the different classes, identifying the most similar generating distribution. When compared to other classification algorithms applied to several synthetic data sets and the original industrial data set, the proposed classifier showed remarkable improvements in performance.
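The abstract does not give the similarity measure used by the fingerprint method, so the following is only a minimal sketch of the general idea (comparing the per-category components of a test bag with those of each class), not the authors' algorithm. It assumes each measurement is a `(category, numeric_vector)` pair, models each class component as a diagonal Gaussian, and scores a bag by its average log-likelihood; these modelling choices and the names `fit_components`, `score_bag`, `classify_bag`, and `make_bag` are all hypothetical.

```python
import numpy as np
from collections import defaultdict


def fit_components(bags):
    """Pool training bags by category and fit one diagonal Gaussian per category.

    Assumption: a diagonal Gaussian stands in for each mixture component.
    """
    pooled = defaultdict(list)
    for bag in bags:
        for category, x in bag:
            pooled[category].append(x)
    components = {}
    for category, xs in pooled.items():
        xs = np.asarray(xs, dtype=float)
        # A small constant keeps the variance strictly positive.
        components[category] = (xs.mean(axis=0), xs.var(axis=0) + 1e-6)
    return components


def log_density(x, mean, var):
    """Log-density of x under a diagonal Gaussian with the given mean and variance."""
    x = np.asarray(x, dtype=float)
    return float(-0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var))


def score_bag(bag, components):
    """Average log-likelihood of a bag, each measurement scored by its own category's component.

    Measurements whose category was never seen for this class are skipped
    (an illustrative choice, not something stated in the abstract).
    """
    total, count = 0.0, 0
    for category, x in bag:
        if category in components:
            total += log_density(x, *components[category])
            count += 1
    return total / count if count else -np.inf


def classify_bag(bag, class_components):
    """Return the class label whose components best explain the test bag."""
    return max(class_components, key=lambda label: score_bag(bag, class_components[label]))


def make_bag(rng, means, n=20):
    """Draw a synthetic bag: n two-dimensional measurements per category."""
    bag = []
    for category, mean in means.items():
        for x in rng.normal(mean, 1.0, size=(n, 2)):
            bag.append((category, x))
    return bag


# Synthetic demo: both classes share the same global distribution of numeric
# values; only the category-conditional components differ.
rng = np.random.default_rng(0)
means = {
    "A": {"line1": 0.0, "line2": 5.0},
    "B": {"line1": 5.0, "line2": 0.0},
}
class_components = {
    label: fit_components([make_bag(rng, m) for _ in range(5)])
    for label, m in means.items()
}
test_bag = make_bag(rng, means["B"])
predicted = classify_bag(test_bag, class_components)
print(predicted)
```

In this toy setup a classifier that ignores the categorical labels would see the same pooled distribution for both classes, which is why the comparison is done component by component.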




Updated: 2021-07-04