Classification of application reviews into software maintenance tasks using data mining techniques
Software Quality Journal (IF 1.9), Pub Date: 2020-08-28, DOI: 10.1007/s11219-020-09529-8
Assem Al-Hawari , Hassan Najadat , Raed Shatnawi

Mobile application reviews are a rich source of information for software engineers: they convey user requirements and technical feedback that can help avoid major programming issues. Previous research has used traditional data mining techniques to classify user reviews into several software maintenance tasks. In this paper, we use associative classification (AC) algorithms and compare the performance of different classifiers at classifying reviews into software maintenance tasks. We also propose a new AC approach for review mining (ACRM). Review classification needs preprocessing steps that apply natural language processing and text analysis. We also study the influence of two feature selection techniques (information gain and chi-square) on the classifiers. Association rules give a better understanding of users' intent, since they discover hidden patterns in the words and features related to one of the maintenance tasks and present them as class association rules (CARs). To test the classifiers, we used two datasets that classify reviews into four different maintenance tasks. Results show that AC algorithms achieved the highest accuracy on both datasets, and that ACRM had the highest precision, recall, F-score, and accuracy. Feature selection significantly improves the classifiers' performance.
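
To make the idea of a class association rule concrete, here is a minimal Python sketch that mines rules of the form "word set → maintenance task" from a handful of labeled reviews, using support and confidence thresholds. It is not the ACRM algorithm from the paper, whose details the abstract does not give. The toy reviews, the labels `bug` and `feature`, the function names `mine_cars` and `classify`, and the threshold values are all assumptions made up for illustration.

```python
from collections import Counter
from itertools import combinations

# Toy, made-up reviews with hypothetical labels (not the paper's datasets).
REVIEWS = [
    ("app crashes when i open camera", "bug"),
    ("crashes on startup after update", "bug"),
    ("please add dark mode", "feature"),
    ("please add export option", "feature"),
    ("app crashes every time", "bug"),
    ("would love dark mode please", "feature"),
]


def mine_cars(reviews, min_support=0.3, min_confidence=0.8, max_len=2):
    """Return rules as (itemset, label, support, confidence) tuples."""
    n = len(reviews)
    itemset_counts = Counter()  # reviews containing the word set
    rule_counts = Counter()     # reviews containing the word set with a given label
    for text, label in reviews:
        words = sorted(set(text.split()))
        for k in range(1, max_len + 1):
            for itemset in combinations(words, k):
                itemset_counts[itemset] += 1
                rule_counts[(itemset, label)] += 1

    rules = []
    for (itemset, label), count in rule_counts.items():
        support = count / n
        confidence = count / itemset_counts[itemset]
        if support >= min_support and confidence >= min_confidence:
            rules.append((itemset, label, support, confidence))

    # Rank by confidence, then support, then shorter (more general) word sets.
    rules.sort(key=lambda r: (-r[3], -r[2], len(r[0]), r[0]))
    return rules


def classify(text, rules, default="unknown"):
    """Label a review with the first (highest-ranked) rule whose words all appear."""
    words = set(text.split())
    for itemset, label, _, _ in rules:
        if words.issuperset(itemset):
            return label
    return default


rules = mine_cars(REVIEWS)
for itemset, label, support, confidence in rules:
    print(f"{set(itemset)} -> {label} (support={support:.2f}, confidence={confidence:.2f})")
```

Each mined rule is directly readable (for example, a review containing "crashes" maps to the hypothetical `bug` class), which is the interpretability benefit the abstract attributes to CARs. A realistic pipeline would first apply the preprocessing and feature selection steps mentioned above rather than splitting raw text on whitespace.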

Updated: 2020-08-28