Egyptian Informatics Journal (IF 5.0), Pub Date: 2020-05-21, DOI: 10.1016/j.eij.2020.04.001. Authors: Shereen ElSayed, Mona Farouk
Although the number of people writing in Arabic on social media is increasing, research on Author Profiling (AP) for Arabic is still at an early stage. This paper investigates Gender Identification (GI), male or female, of authors posting Egyptian dialect tweets using Neural Network (NN) models. Several NN architectures are explored with extensive parameter selection: a simple Artificial Neural Network (ANN), a Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), Convolutional Bidirectional Long Short-Term Memory (C-Bi-LSTM), and a Convolutional Bidirectional Gated Recurrent Unit (C-Bi-GRU) network, which is tuned for the GI problem at hand. The best GI accuracy, obtained with the multichannel C-Bi-GRU model, is 91.37%. Notably, the presence of both the bidirectional layer and the convolutional layer in the NN models significantly improved GI accuracy.
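The abstract names the multichannel C-Bi-GRU architecture but gives no layer sizes, kernel widths, or merge strategy. As a rough illustration only, the following plain-Python sketch shows one way such a forward pass could be wired: each channel applies a 1-D convolution over the token vectors, feeds the result to a forward and a backward GRU, and the channel outputs are concatenated before a sigmoid unit. Everything here is an assumption made for illustration, not the authors' configuration: the names (`conv1d_relu`, `GRUCell`, `c_bi_gru_channel`, `predict`), the choice of kernel widths 2 and 3 as the "channels", the tiny dimensions, the omission of bias terms, and the random untrained weights and inputs.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def conv1d_relu(seq, kernels, width):
    """'Valid' 1-D convolution over time, followed by ReLU.
    seq: T vectors of size d; kernels: F filters, each width*d weights."""
    out = []
    for t in range(len(seq) - width + 1):
        window = [x for vec in seq[t:t + width] for x in vec]
        out.append([max(0.0, sum(w * x for w, x in zip(k, window)))
                    for k in kernels])
    return out

class GRUCell:
    """Bias-free GRU cell (hypothetical helper, for illustration only)."""
    def __init__(self, in_dim, hid, rng):
        self.hid = hid
        # Input (W) and recurrent (U) weights for the update (z),
        # reset (r) and candidate (n) computations.
        self.W = {g: rand_matrix(hid, in_dim, rng) for g in "zrn"}
        self.U = {g: rand_matrix(hid, hid, rng) for g in "zrn"}

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in
             zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in
             zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        rh = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b) for a, b in
             zip(matvec(self.W["n"], x), matvec(self.U["n"], rh))]
        # Blend the previous state with the candidate state.
        return [(1 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]

    def run(self, seq):
        h = [0.0] * self.hid
        for x in seq:
            h = self.step(x, h)
        return h  # final hidden state

def c_bi_gru_channel(seq, kernels, width, fwd, bwd):
    """Convolution, then a GRU read left-to-right and another right-to-left."""
    feats = conv1d_relu(seq, kernels, width)
    return fwd.run(feats) + bwd.run(feats[::-1])

def predict(seq, channels, out_w):
    """Concatenate all channel outputs and apply a single sigmoid unit."""
    merged = []
    for kernels, width, fwd, bwd in channels:
        merged += c_bi_gru_channel(seq, kernels, width, fwd, bwd)
    return sigmoid(sum(w * x for w, x in zip(out_w, merged)))

# Toy setup: random vectors stand in for the embeddings of a 6-token tweet.
rng = random.Random(0)
EMB, FILTERS, HID = 4, 3, 2
tweet = [[rng.uniform(-1, 1) for _ in range(EMB)] for _ in range(6)]
channels = []
for width in (2, 3):  # assumed: one channel per kernel width
    kernels = rand_matrix(FILTERS, width * EMB, rng)
    channels.append((kernels, width,
                     GRUCell(FILTERS, HID, rng), GRUCell(FILTERS, HID, rng)))
out_w = [rng.uniform(-0.5, 0.5) for _ in range(2 * HID * len(channels))]
prob = predict(tweet, channels, out_w)  # untrained score in (0, 1)
```

With trained weights, `prob` would be thresholded (for example at 0.5) to pick one of the two gender labels; here it only demonstrates the data flow. In practice a deep learning framework would replace these hand-written loops.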
Title (translated from the Chinese listing): Gender identification for Egyptian Arabic dialect in Twitter using deep learning models