Automatically detecting feature requests from development emails by leveraging semantic sequence mining, Requirements Engineering

Requirements Engineering (IF 2.1), Pub Date: 2021-03-30, DOI: 10.1007/s00766-020-00344-y
Lin Shi, Celia Chen, Qing Wang, Barry Boehm

Mailing lists are widely used as an important channel for communication between developers and stakeholders. A mailing list consists of emails posted for various purposes, such as reporting problems, seeking usage help, managing projects, and discussing new features. Because of the large volume of new emails arriving every day, valuable emails that describe new features may be overlooked by developers, and identifying these feature requests by hand is labor-intensive and challenging. In this paper, we propose an automated solution that discovers feature requests in development emails by leveraging semantic sequence patterns. First, we tag the sentences in emails using 81 fuzzy rules proposed in our previous study. Then we represent the semantic sequence, together with the contextual information of an email, in a 2-gram model. After applying sequence pattern mining techniques, we generate 10 semantic sequence patterns from 317 tagged emails randomly sampled from the Ubuntu community. We also conduct an empirical evaluation of the patterns' capability to discover feature requests in large volumes of emails from Ubuntu and four other open source communities. The results show that our approach can effectively identify feature requests in these emails. Compared to existing baselines, our approach achieves better performance in terms of precision, recall, F1-score, AUC, and positive, with an average precision and recall for discovering feature requests of 76% and 86%, respectively.
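The pipeline described in the abstract (tag sentences, form 2-grams over the tag sequence, mine patterns, evaluate) can be illustrated with a small Python sketch. This is not the authors' implementation: the tag names, the boundary markers, the toy emails, and the support/confidence filter are all assumptions made for illustration, and the filter is only a simplified stand-in for the sequence pattern mining step. The 81 fuzzy rules are not reproduced; the sketch starts from emails that are already tagged.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple

Bigram = Tuple[str, str]

# Hypothetical boundary markers so that the first and last sentence tags
# also take part in a 2-gram (an assumption, not taken from the paper).
START, END = "<START>", "<END>"


def bigrams(tags: List[str]) -> List[Bigram]:
    """Return the 2-grams of one email's sentence-tag sequence."""
    padded = [START, *tags, END]
    return list(zip(padded, padded[1:]))


def mine_patterns(
    emails: Iterable[Tuple[List[str], bool]],
    min_support: float,
    min_confidence: float,
) -> Set[Bigram]:
    """Keep 2-grams that are frequent among feature-request emails
    (support) and occur mostly in feature-request emails (confidence)."""
    pos_count: Counter = Counter()
    all_count: Counter = Counter()
    n_pos = 0
    for tags, is_request in emails:
        grams = set(bigrams(tags))  # count each 2-gram once per email
        all_count.update(grams)
        if is_request:
            n_pos += 1
            pos_count.update(grams)
    if n_pos == 0:
        return set()
    return {
        gram
        for gram, count in pos_count.items()
        if count / n_pos >= min_support
        and count / all_count[gram] >= min_confidence
    }


def is_feature_request(tags: List[str], patterns: Set[Bigram]) -> bool:
    """Flag an email if any mined 2-gram pattern occurs in its tag sequence."""
    return any(gram in patterns for gram in bigrams(tags))


def precision_recall_f1(
    predicted: List[bool], actual: List[bool]
) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 for the positive class."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy training data: (sentence tags, is this email a feature request?).
# The tag names are illustrative placeholders only.
TRAIN = [
    (["EXPLANATION", "INTENT", "BENEFIT"], True),
    (["INTENT", "BENEFIT"], True),
    (["INTENT", "EXAMPLE"], True),
    (["EXPLANATION", "TRIVIA"], False),
    (["TRIVIA", "EXPLANATION"], False),
]

PATTERNS = mine_patterns(TRAIN, min_support=0.6, min_confidence=0.8)
```

With these toy inputs, `is_feature_request(["TRIVIA", "INTENT", "BENEFIT"], PATTERNS)` flags the email because the mined set contains the 2-gram `("INTENT", "BENEFIT")`. Counting each 2-gram once per email keeps a long, repetitive email from dominating the support estimate; whether the paper does the same is not stated in the abstract.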

Updated: 2021-03-30
