User Factor Adaptation for User Embedding via Multitask Learning
arXiv - CS - Computation and Language. Pub Date: 2021-02-22, DOI: arxiv-2102.11103
Xiaolei Huang, Michael J. Paul, Robin Burke, Franck Dernoncourt, Mark Dredze

Language varies across users and their fields of interest in social media data: words authored by a user across his/her interests may have different meanings (e.g., cool) or sentiments (e.g., fast). However, most existing methods for training user embeddings ignore the variations across user interests, such as product and movie categories (e.g., drama vs. action). In this study, we treat user interests as domains and empirically examine how user language can vary across the user factor in three English social media datasets. We then propose a user embedding model that accounts for the language variability of user interests via a multitask learning framework. The model learns user language and its variations without human supervision. While existing work has mainly evaluated user embeddings on extrinsic tasks, we propose an intrinsic evaluation via clustering and also evaluate user embeddings on an extrinsic task, text classification. Experiments on the three English-language social media datasets show that our proposed approach can generally outperform the baselines by adapting to the user factor.
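The intrinsic evaluation via clustering mentioned in the abstract could be sketched roughly as follows. The abstract does not name a clustering algorithm or a scoring metric, so the k-means routine, the purity score, and the function names `kmeans` and `purity` here are illustrative assumptions, not the authors' procedure. The idea is only to show how user embeddings might be clustered and then compared against known user-interest labels.

```python
import math
from collections import Counter


def kmeans(points, k, iters=20):
    """Tiny k-means (an assumed stand-in for whatever clustering the paper uses)."""
    # Deterministic farthest-first initialisation: start from the first point,
    # then repeatedly add the point farthest from all chosen centroids.
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(farthest))

    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        assign = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def purity(assign, labels):
    """Fraction of points whose label matches the majority label of their cluster."""
    matched = 0
    for c in set(assign):
        member_labels = [lab for a, lab in zip(assign, labels) if a == c]
        matched += Counter(member_labels).most_common(1)[0][1]
    return matched / len(labels)


# Hypothetical 2-D "user embeddings" with made-up interest labels.
embeddings = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
interests = ["drama", "drama", "drama", "action", "action", "action"]
clusters = kmeans(embeddings, k=2)
score = purity(clusters, interests)
```

A higher purity would suggest that users with the same interest label sit close together in the embedding space; other cluster-quality measures could be substituted without changing the overall shape of the evaluation.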

Updated: 2021-02-23