High-dimensional data analysis with subspace comparison using matrix visualization, Information Visualization



Information Visualization (IF 1.8), Pub Date: 2017-10-14, DOI: 10.1177/1473871617733996
Junpeng Wang, Xiaotong Liu, Han-Wei Shen


Due to the intricate relationships between the dimensions of high-dimensional data, subspace analysis is often conducted to decompose the dimensions and give prominence to certain subsets of dimensions, i.e. subspaces. Exploring and comparing subspaces is important for revealing their underlying features, as well as for portraying the characteristics of individual dimensions. To date, most existing high-dimensional data exploration and analysis approaches rely on dimensionality reduction algorithms (e.g. principal component analysis and multi-dimensional scaling) to project high-dimensional data, or their subspaces, to two-dimensional space and employ scatterplots for visualization. However, dimensionality reduction algorithms are sometimes difficult to fine-tune, and scatterplots are not effective for comparative visualization, which makes subspace comparison hard to perform. In this article, we aggregate high-dimensional data or their subspaces by computing pairwise distances between all data items, and we show those distances with matrix visualizations that represent the original high-dimensional data or subspaces. Our approach enables effective visual comparisons among subspaces, which allows users to further investigate the characteristics of individual dimensions by studying their behaviors in similar subspaces. Through subspace comparisons, we identify dominant, similar, and conforming dimensions in different subspace contexts of synthetic and real-world high-dimensional data sets. Additionally, we present a prototype that integrates a parallel coordinates plot and matrix visualization for high-dimensional data exploration and incremental dimensionality analysis, which also allows users to further validate the dimension characterization results derived from the subspace comparisons.
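To make the aggregation step concrete, here is a minimal Python sketch of computing a pairwise distance matrix per subspace and comparing two such matrices. The abstract does not name a distance metric, a normalization, or a comparison measure, so the Euclidean distance, the max-scaling, and the mean absolute difference used here are assumptions, and the function names (`distance_matrix`, `normalize`, `subspace_difference`) are hypothetical rather than taken from the article.

```python
import numpy as np

def distance_matrix(data, dims):
    """Pairwise Euclidean distances between all items, restricted to `dims`.

    Euclidean distance is an assumption; the abstract does not name a metric.
    """
    sub = data[:, dims]                       # keep only the subspace columns
    diff = sub[:, None, :] - sub[None, :, :]  # shape (n, n, len(dims))
    return np.sqrt((diff ** 2).sum(axis=-1))

def normalize(matrix):
    """Scale to [0, 1] so matrices from different subspaces are comparable."""
    peak = matrix.max()
    return matrix / peak if peak > 0 else matrix

def subspace_difference(data, dims_a, dims_b):
    """Mean absolute difference between two normalized distance matrices.

    This comparison measure is an illustrative assumption.
    """
    a = normalize(distance_matrix(data, dims_a))
    b = normalize(distance_matrix(data, dims_b))
    return float(np.abs(a - b).mean())

# Toy data: 4 items, 3 dimensions; dimension 2 is dimension 0 scaled by 10.
data = np.array([
    [0.0, 5.0, 0.0],
    [1.0, 1.0, 10.0],
    [2.0, 4.0, 20.0],
    [3.0, 0.0, 30.0],
])

same = subspace_difference(data, [0], [2])   # scaled copies of each other
other = subspace_difference(data, [0], [1])  # unrelated dimension
print(same, other)
```

In this toy data set, the single-dimension subspaces `[0]` and `[2]` yield the same normalized matrix, while `[0]` and `[1]` do not. Each normalized matrix could also be rendered as a heatmap (for example with matplotlib's `imshow`) to compare subspaces side by side visually.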


Updated: 2017-10-14
