PyramidTags: Context-, Time- and Word Order-Aware Tag Maps to Explore Large Document Collections.
IEEE Transactions on Visualization and Computer Graphics (IF 5.2), Pub Date: 2021-10-26, DOI: 10.1109/tvcg.2020.3010095
Johannes Knittel 1 , Steffen Koch 1 , Thomas Ertl 1

It is difficult to explore large text collections if little or no information is available on the contained documents. Hence, starting analytic tasks on such corpora is challenging for many stakeholders from various domains. As a remedy, recent visualization research suggests using visual spatializations of representative text documents or tags to explore text collections. With PyramidTags, we introduce a novel approach for summarizing large text collections visually. In contrast to previous work, PyramidTags aims in particular at creating an improved representation that incorporates both the temporal evolution and the semantic relationships of visualized tags within the summarized document collection. As a result, it equips analysts with a visual starting point for interactive exploration, letting them not only get an overview of the main terms and phrases of the corpus but also grasp important ideas and stories. Analysts can hover over and select multiple tags to explore relationships and retrieve the most relevant documents. In this work, we apply PyramidTags to hundreds of thousands of web-crawled news reports. Our benchmarks suggest that PyramidTags creates time- and context-aware layouts while preserving the inherent word order of important pairs.

Updated: 2020-07-17