Orthographic features for emotion classification in Chinese in informal short texts
Language Resources and Evaluation (IF 2.7), Pub Date: 2020-11-23, DOI: 10.1007/s10579-020-09515-3
I-Hsuan Chen, Yunfei Long, Qin Lu, Chu-Ren Huang

Informal short texts on the web are rich in emotions, as they often reflect unfiltered, immediate reactions to breaking news events. This emotion density, however, stands in contrast to the poverty of linguistic context and features that such texts offer for emotion classification. This paper tackles that challenge by proposing orthographic features based on orthographic code mixing and code-switching for both non-ML and ML approaches. Our results show that, as expected, orthographic features routinely outperform grammatical features for emotion classification of short texts in all approaches. Orthographic features were also shown to make more significant contributions, especially in terms of precision and in formal texts, when state-of-the-art deep learning algorithms are applied. This result confirms the effectiveness of the orthographic change feature for the task of emotion classification. We argue that these results are applicable to all languages, because code-shifting is common in languages with non-Latin orthographies and non-letter symbols are used in all languages.
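The abstract does not spell out the exact feature set, so the following is only an illustrative sketch of what counting orthographic mixing in a short Chinese text might look like. The function names (`char_class`, `orthographic_features`), the five character classes, and the "switches" count are all assumptions made for this example, not the authors' method.

```python
import unicodedata
from collections import Counter

# Hypothetical character classes chosen for illustration only.
CLASSES = ("han", "latin", "digit", "symbol", "other")


def char_class(ch):
    """Assign a single character to a coarse orthographic class."""
    # CJK Unified Ideographs and Extension A (covers most everyday Chinese text).
    if "\u4e00" <= ch <= "\u9fff" or "\u3400" <= ch <= "\u4dbf":
        return "han"
    if ch.isascii() and ch.isalpha():
        return "latin"
    if ch.isdigit():
        return "digit"
    # Unicode general categories starting with P (punctuation) or S (symbol).
    if unicodedata.category(ch)[0] in ("P", "S"):
        return "symbol"
    return "other"


def orthographic_features(text):
    """Count characters per class and the class changes between neighbours."""
    classes = [char_class(ch) for ch in text if not ch.isspace()]
    counts = Counter(classes)
    features = {name: counts.get(name, 0) for name in CLASSES}
    # A "switch" is any pair of adjacent non-space characters in different classes.
    features["switches"] = sum(1 for a, b in zip(classes, classes[1:]) if a != b)
    return features


if __name__ == "__main__":
    print(orthographic_features("今天真的好开心lol!!!"))
```

Counts like these could be fed to a classifier alongside other features; whether they resemble the features evaluated in the paper is not something the abstract confirms.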




Updated: 2020-11-23