Exploring Multi-dimensional Data via Subset Embedding, Computer Graphics Forum


Exploring Multi-dimensional Data via Subset Embedding
Computer Graphics Forum (IF 2.5), Pub Date: 2021-06-29, DOI: 10.1111/cgf.14290
Peng Xie₁, Wenyuan Tao₁, Jie Li₁, Wentao Huang₁, Siming Chen₂


Multi-dimensional data exploration is a classic research topic in visualization. Most existing approaches are designed to identify record patterns in a dimensional space or subspace. In this paper, we propose a visual analytics approach to exploring subset patterns. The core of the approach is a subset embedding network (SEN) that represents a group of subsets as uniformly formatted embeddings. We implement the SEN as multiple subnets with separate loss functions. This design enables the SEN to handle arbitrary subsets and capture the similarity of subsets on single features, thus achieving accurate pattern exploration, which in most cases means searching for subsets that have similar values on a few features. Moreover, each subnet is a fully connected neural network with one hidden layer, and this simple structure yields high training efficiency. We integrate the SEN into a visualization system that supports a 3-step workflow. Specifically, analysts (1) partition the given dataset into subsets, (2) select portions of a projected latent space created using the SEN, and (3) determine whether patterns exist within the selected subsets. Overall, the system combines visualizations, interactions, automatic methods, and quantitative measures to balance exploration flexibility and operation efficiency, and to improve the interpretability and faithfulness of the identified patterns. Case studies and quantitative experiments on multiple open datasets demonstrate the general applicability and effectiveness of our approach.
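The abstract's description of the SEN (several subnets, each a fully connected network with one hidden layer, producing same-shaped embeddings for arbitrary subsets) can be illustrated with a rough sketch. This is not the authors' implementation. The following choices are assumptions made only for illustration: one subnet per feature, a normalized histogram as each subnet's input, the layer sizes, the tanh activation, and the hypothetical names `partition`, `feature_histogram`, `Subnet`, and `embed_subsets`. The weights are random and untrained, and the separate loss functions mentioned in the abstract are omitted because the abstract does not specify them.

```python
import numpy as np

def partition(records, keys):
    """Workflow step 1: group record rows into subsets by a key."""
    groups = {}
    for row, key in zip(records, keys):
        groups.setdefault(key, []).append(row)
    return {k: np.asarray(v, dtype=float) for k, v in groups.items()}

def feature_histogram(values, bins, lo, hi):
    """Fixed-length summary of one feature of a subset (assumed input encoding)."""
    counts, _ = np.histogram(values, bins=bins, range=(lo, hi))
    return counts / max(counts.sum(), 1)

class Subnet:
    """Fully connected network with one hidden layer (untrained, random weights)."""
    def __init__(self, n_in, n_hidden, n_out, rng):
        self.w1 = rng.normal(0.0, 0.1, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.w2 = rng.normal(0.0, 0.1, (n_hidden, n_out))
        self.b2 = np.zeros(n_out)

    def forward(self, x):
        hidden = np.tanh(x @ self.w1 + self.b1)
        return hidden @ self.w2 + self.b2

def embed_subsets(subsets, subnets, ranges, bins):
    """Concatenate the per-feature subnet outputs into one embedding per subset."""
    embeddings = {}
    for key, rows in subsets.items():
        parts = []
        for f, net in enumerate(subnets):
            lo, hi = ranges[f]
            parts.append(net.forward(feature_histogram(rows[:, f], bins, lo, hi)))
        embeddings[key] = np.concatenate(parts)
    return embeddings

# Toy data: 30 records with 3 features, split into subsets of different sizes.
rng = np.random.default_rng(0)
records = rng.normal(size=(30, 3))
keys = ["a"] * 5 + ["b"] * 10 + ["c"] * 15
subsets = partition(records, keys)

bins, hidden, out = 8, 16, 4
ranges = [(records[:, f].min(), records[:, f].max()) for f in range(records.shape[1])]
subnets = [Subnet(bins, hidden, out, rng) for _ in range(records.shape[1])]
embeddings = embed_subsets(subsets, subnets, ranges, bins)
```

In this sketch every subset maps to a vector of the same length regardless of how many records it contains, which is one way to read "uniformly formatted embeddings". Keeping each feature's part of the vector separate is what would let a distance computed on that slice reflect similarity on a single feature.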


Updated: 2021-06-29
