Ina-BWR: Indonesian bigram word rule for multi-label student complaints
Egyptian Informatics Journal ( IF 5.2 ) Pub Date : 2019-03-26 , DOI: 10.1016/j.eij.2019.03.001
Tora Fahrudin , Joko Lianto Buliali , Chastine Fatichah

Handling multi-label student complaints is an interesting research topic. One technique used for this task is the Bag of Words (BoW) method. In this research, a bigram word rule and additional preprocessing are proposed to increase the accuracy of multi-label classification. To show the effectiveness of the proposed method, Telkom University student data and additional relevant data collected by hashtag are used as testing data. We develop the Indonesian Bigram Word Rule for Multi-label Student Complaints (Ina-BWR) to identify multi-label student problems based on a bigram word rule. Ina-BWR consists of three processes: preprocessing informal text, identifying the complaint, and identifying the object in the text. Additional preprocessing techniques are applied to formalize the text: parsing hashtags, correcting affixed words, correcting conjunctions, parsing personal-pronoun suffixes, and correcting typos. The Indonesian bigram word rule is adopted from opinion identification rules, with three additional corpora, (-)NN, (-)JJ and (-)VB, to identify student complaints. To identify complaints, four label corpora were created manually. The experimental results show that Ina-BWR increases accuracy for the Personal, Subject and Relation labels. The best accuracy across the four labels is obtained when Ina-BWR is combined with the BoW method.
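The abstract does not give the actual rules or corpora, so the following is only a rough sketch of how hashtag parsing and a bigram rule over POS-tagged words might fit together. Everything in it is an assumption made for illustration: the toy lexicons `NEGATIVE_WORDS` and `TOY_TAGS`, the dictionary-lookup tagger, the CamelCase hashtag splitter, and the two bigram patterns (noun followed by a negative adjective, negative verb followed by a noun). None of these are taken from the paper.

```python
import re

# Hypothetical toy lexicons standing in for the (-)NN, (-)JJ and (-)VB corpora.
# The paper's real corpora are not reproduced here.
NEGATIVE_WORDS = {
    "JJ": {"galak", "lambat"},
    "NN": {"masalah"},
    "VB": {"gagal"},
}

# Hypothetical lookup table used in place of a real Indonesian POS tagger.
TOY_TAGS = {
    "dosen": "NN", "wifi": "NN", "ujian": "NN", "masalah": "NN",
    "galak": "JJ", "lambat": "JJ", "gagal": "VB", "saya": "PRP",
}


def split_hashtag(token):
    """Split a CamelCase hashtag such as '#DosenGalak' into lowercase words."""
    if not token.startswith("#"):
        return [token.lower()]
    parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", token[1:])
    return [p.lower() for p in parts]


def tag(words):
    """Attach a POS tag to each word; unknown words get the tag 'X'."""
    return [(w, TOY_TAGS.get(w, "X")) for w in words]


def find_complaints(tagged):
    """Scan adjacent word pairs and return (object, complaint) tuples."""
    pairs = []
    for (w1, t1), (w2, t2) in zip(tagged, tagged[1:]):
        # Assumed pattern 1: noun followed by a negative adjective.
        if t1 == "NN" and t2 == "JJ" and w2 in NEGATIVE_WORDS["JJ"]:
            pairs.append((w1, w2))
        # Assumed pattern 2: negative verb followed by a noun.
        elif t1 == "VB" and w1 in NEGATIVE_WORDS["VB"] and t2 == "NN":
            pairs.append((w2, w1))
    return pairs


def analyze(text):
    """Preprocess raw text (hashtags only) and extract complaint pairs."""
    words = []
    for token in text.split():
        words.extend(split_hashtag(token))
    return find_complaints(tag(words))


if __name__ == "__main__":
    print(analyze("#DosenGalak wifi lambat"))
    print(analyze("saya gagal ujian"))
```

A fuller pipeline would also need the other preprocessing steps listed in the abstract (affix, conjunction, pronoun-suffix and typo correction) and a mapping from extracted objects to the four labels, neither of which is attempted here.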




Updated: 2019-03-26