Conditional t-SNE: more informative t-SNE embeddings
Machine Learning (IF 7.5), Pub Date: 2020-12-06, DOI: 10.1007/s10994-020-05917-0
Bo Kang 1, Darío García García 2, Jefrey Lijffijt 1, Raúl Santos-Rodríguez 3, Tijl De Bie 1

Dimensionality reduction and manifold learning methods such as t-distributed stochastic neighbor embedding (t-SNE) are frequently used to map high-dimensional data into a two-dimensional space in order to visualize and explore that data. Beyond the specifics of t-SNE, any such approach has two substantial limitations: (1) not all information can be captured in a single two-dimensional embedding, and (2) to well-informed users, the salient structure of such an embedding is often already known, so no genuinely new insights can be obtained from it. Currently, it is not known how to extract the remaining information in a similarly effective manner. We introduce conditional t-SNE (ct-SNE), a generalization of t-SNE that discounts prior information in the form of labels. This makes it possible to obtain more informative and more relevant embeddings. To achieve this, we propose a conditioned version of the t-SNE objective, obtaining an elegant method with a single integrated objective. We show how to efficiently optimize the objective and study the effects of the extra parameter that ct-SNE has over t-SNE. Qualitative and quantitative empirical results on synthetic and real data show that ct-SNE is scalable, effective, and achieves its goal: it allows complementary structure to be captured in the embedding and provides new insights into real data.
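
The abstract does not state the conditioned objective, so the following is only a minimal sketch of one way to read "discounting prior information in the form of labels", not the authors' formulation. It assumes the usual t-SNE Student-t kernel in the embedding and reweights each pair by a hypothetical prior: `alpha` for pairs with the same label and `beta` for pairs with different labels. The function name and both parameters are illustrative assumptions. When `alpha > beta`, same-label pairs already receive a high similarity from the prior, so the embedding itself has less need to place them close together.

```python
import math


def label_weighted_similarities(points, labels, alpha=0.9, beta=0.1):
    """Student-t similarities between embedded points, reweighted by labels.

    Hypothetical illustration (not taken from the paper): alpha weights
    same-label pairs and beta weights different-label pairs. With
    alpha == beta this reduces to the usual normalized t-SNE
    low-dimensional similarities.
    """
    n = len(points)
    weights = [[0.0] * n for _ in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # a point is never its own neighbor
            sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            kernel = 1.0 / (1.0 + sq_dist)  # heavy-tailed Student-t kernel
            prior = alpha if labels[i] == labels[j] else beta
            weights[i][j] = prior * kernel
            total += weights[i][j]
    # Normalize over all ordered pairs so the similarities sum to 1.
    return [[w / total for w in row] for row in weights]


# Four points on a unit square; points 0 and 1 share a label, as do 2 and 3.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
labels = ["a", "a", "b", "b"]

q = label_weighted_similarities(points, labels)
plain = label_weighted_similarities(points, labels, alpha=0.5, beta=0.5)
```

In this toy setup, points 1 and 2 are equally far from point 0, yet `q[0][1]` exceeds `q[0][2]` because only the first pair shares a label; in `plain` the two values coincide, as in unconditioned t-SNE.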

Updated: 2020-12-06