Increasing Interpretation of Web Topic Detection via Prototype Learning From Sparse Poisson Deconvolution, IEEE Transactions on Cybernetics


Increasing Interpretation of Web Topic Detection via Prototype Learning From Sparse Poisson Deconvolution
IEEE Transactions on Cybernetics (IF 9.4), Pub Date: 2-2-2018, DOI: 10.1109/tcyb.2018.2795015
Junbiao Pang, Anjing Hu, Qingming Huang, Qi Tian, Baocai Yin

Organizing webpages into interesting topics is one of the key steps in understanding trends in multimodal Web data. Sparse, noisy, and less-constrained user-generated content results in inefficient feature representations. With such descriptors, a detected topic unavoidably still contains a certain number of falsely detected webpages, which makes the topic less coherent, less interpretable, and less useful. In this paper, we address this problem by interpreting a topic through its prototypes, and present a two-step approach to achieve this goal. First, following the detection-by-ranking approach, a sparse Poisson deconvolution is proposed to learn the intratopic similarities between webpages. Second, leveraging these intratopic similarities, the top-$k$ diverse yet representative prototype webpages are identified using a submodular function. Experimental results on two public datasets show not only improved accuracy for the Web topic detection task, but also increased interpretability of a topic through its prototypes.
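The prototype-selection step can be illustrated with a minimal sketch. The abstract does not state which submodular function the authors use, so the example below assumes a facility-location objective, F(S) = sum_i max_{j in S} sim[i, j], maximized greedily over a precomputed, nonnegative intratopic similarity matrix. The function name `greedy_prototypes`, the matrix `sim`, and its values are hypothetical and serve only as an illustration; they are not the paper's method or data.

```python
import numpy as np

def greedy_prototypes(sim, k):
    """Greedily pick up to k prototype indices under an ASSUMED
    facility-location objective F(S) = sum_i max_{j in S} sim[i, j].
    sim[i, j] is a nonnegative similarity between webpages i and j."""
    n = sim.shape[0]
    k = min(k, n)
    selected = []
    chosen = np.zeros(n, dtype=bool)
    best_cover = np.zeros(n)  # best similarity of each page to the selected set
    for _ in range(k):
        # Marginal gain of adding each candidate column j to the current set.
        gains = np.maximum(sim, best_cover[:, None]).sum(axis=0) - best_cover.sum()
        gains[chosen] = -np.inf  # never pick the same page twice
        j = int(np.argmax(gains))
        selected.append(j)
        chosen[j] = True
        best_cover = np.maximum(best_cover, sim[:, j])
    return selected

# Hypothetical similarities: pages 0-2 form one group, pages 3-4 another.
sim = np.array([
    [1.0, 0.9, 0.8, 0.1, 0.0],
    [0.9, 1.0, 0.85, 0.1, 0.1],
    [0.8, 0.85, 1.0, 0.0, 0.1],
    [0.1, 0.1, 0.0, 1.0, 0.9],
    [0.0, 0.1, 0.1, 0.9, 1.0],
])
picked = greedy_prototypes(sim, 2)
print(picked)
```

Because each pick is scored by how much it improves coverage of pages not yet well represented, the second prototype tends to come from a different group than the first, which is one way to obtain "diverse yet representative" selections. For monotone submodular objectives of this kind, the greedy procedure is known to achieve at least a (1 - 1/e) fraction of the optimal value.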


Updated: 2024-08-22
