Threatening Language Detection and Target Identification in Urdu Tweets
IEEE Access ( IF 3.4 ) Pub Date : 2021-09-14 , DOI: 10.1109/access.2021.3112500
Maaz Amjad 1, Noman Ashraf 1, Alisa Zhila, Grigori Sidorov 1, Arkaitz Zubiaga 2, Alexander Gelbukh 1
Top-k recommendation is a fundamental task in recommender systems that is generally learned by comparing positive and negative pairs. The contrastive loss (CL), the key component of contrastive learning, has recently attracted increasing attention, and we find that it is well suited to top-k recommendation. However, CL is problematic because it treats positive and negative samples as equally important. On the one hand, CL faces an imbalance between a single positive sample and many negative samples. On the other hand, sparser datasets contain so few positive items that their importance should be emphasized. Moreover, these sparse positive items are still not sufficiently exploited in recommendation. We therefore propose a new data augmentation method that uses multiple positive items (or samples) simultaneously with the CL loss function, yielding a multi-sample-based contrastive loss (MSCL) that addresses both problems by rebalancing the importance of positive and negative samples and by augmenting the data. Built on a graph convolutional network (GCN) method, experimental results demonstrate the state-of-the-art performance of MSCL. The proposed MSCL is simple and can be applied in many methods. Our code is available at https://github.com/haotangxjtu/MSCL.
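To make the one-positive-vs-many-negatives imbalance concrete, the sketch below contrasts a standard single-positive contrastive (InfoNCE-style) loss with a simple multi-sample variant that averages the loss over several positive items. This is an illustrative assumption, not the paper's exact MSCL formulation (which is defined in the paper and repository); the function names, the cosine-similarity scoring, and the temperature value are all hypothetical choices for the example.

```python
import numpy as np

def sim(u, v):
    # Cosine similarity between a user embedding and an item embedding.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def info_nce(user, pos, negs, tau=0.2):
    """Standard contrastive loss: one positive item against many negatives.
    Returns -log softmax probability of the positive among all candidates."""
    logits = np.array([sim(user, pos) / tau] + [sim(user, n) / tau for n in negs])
    return float(-logits[0] + np.log(np.exp(logits).sum()))

def multi_sample_cl(user, positives, negs, tau=0.2):
    """Illustrative multi-sample variant: average the contrastive loss over
    several positive items, so the few positives in a sparse dataset
    contribute more gradient signal instead of a single anchor pair."""
    return float(np.mean([info_nce(user, p, negs, tau) for p in positives]))

# Toy 2-D embeddings: the positive is nearly aligned with the user vector.
user = np.array([1.0, 0.0])
pos = np.array([1.0, 0.1])
negs = [np.array([-1.0, 0.0]), np.array([0.0, 1.0])]

loss_single = info_nce(user, pos, negs)
loss_multi = multi_sample_cl(user, [pos, np.array([0.9, 0.2])], negs)
```

A well-ranked positive yields a near-zero loss, while swapping in a dissimilar item as the "positive" inflates it sharply, which is what drives the ranking gradient in top-k recommendation.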

Updated: 2021-09-14