A novel domain and event adaptive tweet augmentation approach for enhancing the classification of crisis related tweets
Data & Knowledge Engineering (IF 2.5), Pub Date: 2021-07-21, DOI: 10.1016/j.datak.2021.101913
Dharini Ramachandran, Parvathi R.

One purpose of detecting crisis-related tweets is to single out those that provide information about help needed and help offered. Classifying such tweets is difficult because too few annotated tweets are available in those categories. To facilitate this classification, a domain- and event-adaptive augmentation approach is proposed. The main objective of the research is to enhance the classification of crisis-related tweets that have few training samples. The proposed algorithms integrate the inherent domain and event information when selecting words for augmentation. The augmentation draws on the CrisisLex lexicon, Word2Vec embeddings, and WordNet. Experiments are carried out to substantiate the benefits of augmentation. Results indicate that classifier performance increases when it is provided with the expanded dataset containing both the augmented and the original tweets. The novel tweet augmentation algorithm can be used to combat the overfitting and class imbalance that arise from having few training samples. An advantage of the proposed algorithms is that they retain the structure and inherent nature of the tweets during augmentation.
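The abstract does not give the algorithms themselves, so the following is only a minimal sketch of one way lexicon-guided word replacement could work, not the authors' method. Everything in it is an assumption made for illustration: the tiny `CRISIS_LEXICON` set stands in for CrisisLex, the `SYNONYMS` table stands in for WordNet lookups, the two-dimensional `EMBEDDINGS` stand in for Word2Vec vectors, and the `domain_score` and `augment` functions and the `threshold` value are hypothetical. The sketch replaces a word only when a synonym candidate is close to the crisis terms in embedding space, and it leaves hashtags, mentions, URLs, and lexicon terms untouched so that the tweet keeps its structure.

```python
import math

# Toy stand-in for the CrisisLex lexicon (assumption, not the real lexicon).
CRISIS_LEXICON = {"flood", "shelter", "rescue"}

# Toy stand-in for WordNet synonym lookups (assumption).
SYNONYMS = {
    "need": ["require", "want"],
    "help": ["aid", "favor"],
}

# Toy 2-D vectors standing in for Word2Vec embeddings (assumption).
EMBEDDINGS = {
    "flood": (0.9, 0.1),
    "shelter": (0.8, 0.3),
    "rescue": (0.9, 0.2),
    "aid": (0.85, 0.25),
    "favor": (0.1, 0.9),
    "require": (0.7, 0.4),
    "want": (0.2, 0.8),
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def domain_score(word):
    """Mean similarity between `word` and the lexicon terms (hypothetical score)."""
    if word not in EMBEDDINGS:
        return 0.0
    sims = [
        cosine(EMBEDDINGS[word], EMBEDDINGS[term])
        for term in CRISIS_LEXICON
        if term in EMBEDDINGS
    ]
    return sum(sims) / len(sims) if sims else 0.0


def augment(tweet, threshold=0.9):
    """Return a copy of `tweet` with domain-close synonyms swapped in.

    Hashtags, mentions, URLs, and lexicon terms are never replaced, and the
    token count and order are preserved.
    """
    out = []
    for token in tweet.split():
        key = token.lower()
        protected = token.startswith(("#", "@", "http")) or key in CRISIS_LEXICON
        candidates = [] if protected else SYNONYMS.get(key, [])
        best = max(candidates, key=domain_score, default=None)
        if best is not None and domain_score(best) >= threshold:
            out.append(best)
        else:
            out.append(token)
    return " ".join(out)


print(augment("We need help after the flood #Chennai"))
# prints: We require aid after the flood #Chennai
```

In a fuller setup the augmented tweet would be added to the training set alongside the original, which is how the abstract describes the expanded dataset; the data structures above would be replaced by the actual lexicon, embedding model, and synonym source.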




Updated: 2021-09-27