Sequence encoding incorporated CNN model for Email document sentiment classification
Applied Soft Computing (IF 5.472), Pub Date: 2021-01-13, DOI: 10.1016/j.asoc.2021.107104
Sisi Liu; Ickjai Lee

Document sentiment classification has been studied for decades. Sentiment classification of Email data, however, is a specialized field that has not yet been thoroughly studied. Compared with typical social media and review data, Email data is characterized by variable length, duplication caused by reply and forward messages, and implicit sentiment indicators. Because of these characteristics, existing techniques cannot fully capture the complex syntactic and relational structure among words and phrases in Email documents.

In this study, we introduce a dependency graph-based position encoding technique enhanced with weighted sentiment features, and incorporate it into the feature representation process. We combine the encoded sentiment sequence features with traditional word embedding features as input to a revised deep CNN model for Email sentiment classification. Experiments are conducted on three sets of real Email data with adequate label conversion processes. Empirical results indicate that our proposed SSE-CNN model achieves the highest accuracy among the compared state-of-the-art algorithms on all three experimental Email datasets: 88.6%, 74.3% and 82.1%, respectively. Furthermore, our evaluations of the preprocessing and sentiment sequence encoding steps show that Email preprocessing and sentiment sequence encoding with dependency graph-based position and SWN features improve Email document sentiment classification.
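
The abstract does not give the encoding formula or the network layout, so the following is only a minimal sketch of the general idea: per-token sequence features derived from a dependency tree are concatenated with word embeddings, and the result is passed through a convolution filter with max-over-time pooling. The depth-based weighting, the toy lexicon scores, the hand-written parse, and every function name here are assumptions made for illustration, not the SSE-CNN method itself.

```python
# Illustrative sketch only: the depth-based weighting, the toy lexicon and
# the hand-written parse are assumptions, not the paper's actual encoding.

def dependency_depths(heads):
    """Depth of each token in a dependency tree; heads[i] is the index of
    token i's head, or -1 for the root."""
    depths = []
    for i in range(len(heads)):
        depth, node = 0, i
        while heads[node] != -1:
            node = heads[node]
            depth += 1
        depths.append(depth)
    return depths


def encode_tokens(tokens, heads, embeddings, lexicon):
    """Append two hypothetical sequence features to each word embedding:
    the dependency depth and a depth-weighted sentiment score."""
    rows = []
    for token, depth in zip(tokens, dependency_depths(heads)):
        polarity = lexicon.get(token, 0.0)
        weighted = polarity / (1 + depth)  # assumed weighting scheme
        rows.append(embeddings[token] + [float(depth), weighted])
    return rows


def conv1d_max_pool(rows, kernel):
    """Apply one convolution filter over token windows, then take the
    maximum response (max-over-time pooling)."""
    width = len(kernel)
    scores = []
    for start in range(len(rows) - width + 1):
        window = rows[start:start + width]
        scores.append(sum(w * x
                          for k_row, x_row in zip(kernel, window)
                          for w, x in zip(k_row, x_row)))
    return max(scores)


tokens = ["thanks", "for", "the", "great", "update"]
heads = [-1, 0, 4, 4, 1]  # hand-written toy parse, not parser output
embeddings = {t: [0.1 * i, 0.0] for i, t in enumerate(tokens)}  # toy 2-d vectors
lexicon = {"thanks": 0.5, "great": 0.75}  # made-up polarity scores

features = encode_tokens(tokens, heads, embeddings, lexicon)
kernel = [[0.0, 0.0, 0.0, 1.0]] * 2  # filter reading only the weighted-sentiment column
pooled = conv1d_max_pool(features, kernel)
```

In a real pipeline the parse would come from a dependency parser, the polarity scores from a sentiment lexicon, and the filter weights would be learned by the CNN rather than fixed by hand.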



Updated: 2021-01-20