Visual feature fusion and its application to support unsupervised clustering tasks
Information Visualization (IF 1.8), Pub Date: 2019-12-17, DOI: 10.1177/1473871619891062
Gladys M Hilasaca, Fernando V Paulovich

The concept of involving users in the loop of analytic workflows refers to the ability to replace heuristics with user input in machine learning and data mining tasks. For supervised tasks, user engagement generally occurs via the manipulation of training data. But for unsupervised tasks, user involvement is limited to changes in the algorithm parametrization or the input data representation, also known as features. Typically, different types of features can be extracted from raw data, and the careful selection of the extraction strategy allows users to have more control over unsupervised tasks. Nevertheless, since there is no perfect feature extractor, the combination of multiple sets of features has been explored through a process called feature fusion. Feature fusion can be readily performed when the machine learning or data mining algorithms have a cost function, such as accuracy for classification tasks. However, when such a function does not exist, user support needs to be provided, otherwise the process is impractical. In this article, we present a novel feature fusion approach that employs data samples and visualization to allow users to not only effortlessly control the combination of different feature sets but also understand the attained results. The effectiveness of our approach is confirmed by a comprehensive set of qualitative and quantitative experiments, opening up different possibilities for user-guided analytical scenarios. The ability of our approach to provide real-time feedback for feature fusion is exploited in the context of unsupervised clustering techniques, where users can perform an exploratory process to discover the best combination of features that reflects their individual perceptions about similarity.

Updated: 2019-12-17