Learning Sentence-to-Hashtags Semantic Mapping for Hashtag Recommendation on Microblogs, ACM Transactions on Knowledge Discovery from Data



ACM Transactions on Knowledge Discovery from Data (IF 4.0), Pub Date: 2021-09-04, DOI: 10.1145/3466876
Riccardo Cantini, Fabrizio Marozzo, Giovanni Bruno, Paolo Trunfio


The growing use of microblogging platforms is generating a huge number of posts that require effective methods for classification and search. On Twitter and other social media platforms, users rely on hashtags to facilitate the search, categorization, and spread of posts. Choosing appropriate hashtags for a post is not always easy, so posts are often published without hashtags or with poorly chosen ones. To address this issue, we propose a new model, called HASHET (HAshtag recommendation using Sentence-to-Hashtag Embedding Translation), which suggests a relevant set of hashtags for a given post. HASHET is based on two independent latent spaces, one for embedding the text of a post and one for embedding the hashtags it contains. A mapping process based on a multi-layer perceptron then learns a translation from the semantic features of the text to the latent representation of its hashtags. We evaluated two language representation models for sentence embedding and tested different search strategies for semantic expansion, finding that combining BERT (Bidirectional Encoder Representations from Transformers) with a global expansion strategy yields the best recommendation results. HASHET was evaluated on two real-world case studies, related to the 2016 United States presidential election and the COVID-19 pandemic. The results show that HASHET is effective in predicting one or more correct hashtags, with an average F-score of up to 0.82 and a recommendation hit rate of up to 0.92. Compared with the most relevant techniques in the literature (generative models, unsupervised models, and attention-based supervised models), our approach achieves up to 15% improvement in F-score on the hashtag recommendation task and 9% on the topic discovery task.
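As a rough illustration of the last stage described in the abstract, the sketch below assumes that a post's text has already been mapped into the hashtag latent space, and then recommends the nearest hashtags by cosine similarity. It also shows how an F-score and a hit indicator could be computed for a single post. The embeddings, the hashtag names, the function names, and the value of `k` are all made up for illustration. The abstract does not define the expansion strategies, so a plain nearest-neighbour lookup is only one plausible reading and not the authors' implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def recommend(predicted_vec, hashtag_vecs, k=2):
    """Return the k hashtags whose embeddings are closest to predicted_vec."""
    ranked = sorted(
        hashtag_vecs,
        key=lambda tag: cosine(predicted_vec, hashtag_vecs[tag]),
        reverse=True,
    )
    return ranked[:k]

def f_score(recommended, actual):
    """Harmonic mean of precision and recall for one post."""
    rec, act = set(recommended), set(actual)
    hits = len(rec & act)
    if hits == 0:
        return 0.0
    precision = hits / len(rec)
    recall = hits / len(act)
    return 2 * precision * recall / (precision + recall)

def is_hit(recommended, actual):
    """True if at least one recommended hashtag is correct."""
    return bool(set(recommended) & set(actual))

# Toy hashtag embeddings (hypothetical values, 3 dimensions for readability).
HASHTAG_VECS = {
    "#election2016": [0.9, 0.1, 0.0],
    "#vote": [0.8, 0.3, 0.1],
    "#covid19": [0.0, 0.2, 0.9],
    "#lockdown": [0.1, 0.1, 0.8],
}

# Assumed output of the text-to-hashtag-space mapping for one post.
predicted = [0.85, 0.2, 0.05]

recs = recommend(predicted, HASHTAG_VECS, k=2)
score = f_score(recs, ["#vote", "#trump"])
hit = is_hit(recs, ["#vote", "#trump"])
```

In a full pipeline, `predicted` would come from a trained mapping network applied to a sentence embedding of the post, and the per-post scores would be averaged over a test set.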


Updated: 2021-09-04
