Parallel coordinate order for high-dimensional data
Statistical Analysis and Data Mining (IF 2.1), Pub Date: 2021-08-20, DOI: 10.1002/sam.11543
Shaima Tilouche, Vahid Partovi Nia, Samuel Bassetto

Visualization of high-dimensional data is counter-intuitive using conventional graphs. Parallel coordinates are proposed as an alternative to explore multivariate data more effectively. However, it is difficult to extract relevant information through the parallel coordinates when the data are high-dimensional with thousands of overlapping lines. The order of the axes determines the perception of information on parallel coordinates. Thus, the information between attributes remains hidden if coordinates are improperly ordered. Here we propose a general framework to reorder the coordinates. This framework is general enough to cover a wide range of data visualization objectives. It is also flexible enough to contain many conventional ordering measures. Consequently, we present the coordinate ordering binary optimization problem and enhance it to achieve a computationally efficient greedy approach that suits high-dimensional data. Our approach is applied to wine data and genetic data. The purpose of dimension reordering of wine data is to highlight attributes' dependence. Genetic data are reordered to enhance cluster detection. The proposed framework shows that it is able to adapt the criteria for the visualization objective.
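
As a rough illustration of what a greedy axis ordering can look like, the sketch below places axes one at a time, always appending the remaining attribute that is most strongly correlated with the axis placed last. This is not the authors' formulation: the function name `greedy_axis_order`, the use of absolute Pearson correlation as the ordering measure, and the choice of starting axis are all assumptions made for the example, and the paper's framework is stated to support other measures.

```python
import numpy as np


def greedy_axis_order(data, start=0):
    """Greedy nearest-neighbour ordering of columns (hypothetical helper).

    Assumption: adjacent axes should be as strongly correlated as possible,
    measured by absolute Pearson correlation. Other pairwise measures could
    be substituted for `score`.
    """
    # Pairwise score matrix between attributes (columns of `data`).
    score = np.abs(np.corrcoef(data, rowvar=False))
    n_axes = score.shape[0]

    order = [start]
    remaining = set(range(n_axes)) - {start}
    while remaining:
        last = order[-1]
        # Pick the unplaced axis with the highest score against the last one;
        # sorting makes tie-breaking deterministic.
        nxt = max(sorted(remaining), key=lambda j: score[last, j])
        order.append(nxt)
        remaining.remove(nxt)
    return order


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.normal(size=200)
    b = rng.normal(size=200)
    # Columns 0 and 2 are near-duplicates, as are columns 1 and 3.
    data = np.column_stack([a, b, a + 0.01 * rng.normal(size=200),
                            b + 0.01 * rng.normal(size=200)])
    print(greedy_axis_order(data))
```

Each step scans only the remaining axes, so the loop costs on the order of the square of the number of attributes once the score matrix is available, which is why greedy schemes of this kind scale to many dimensions where an exhaustive search over all orderings does not.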

Updated: 2021-09-16