Using evolutionary algorithms to select text features for mining design rationale
AI EDAM (IF 1.7), Pub Date: 2020-01-30, DOI: 10.1017/s0890060420000037
Miriam Lester, Miguel Guerrero, Janet Burge

At its heart, design is a decision-making process. These decisions, and the reasons for making them, comprise the design rationale (DR) for the designed artifact. When available, DR provides a comprehensive record of the reasoning behind the decisions made during the design. Unfortunately, although this information is potentially quite valuable, it is usually not captured explicitly; instead, it is often buried in other design and development artifacts. In this paper, we study how to identify rationale in text documents, specifically software bug reports and design discussion transcripts. The method we examined is statistical text mining, in which a model is built that uses document features to classify sentences. Choosing the features most likely to be good predictors is important. We studied two evolutionary algorithms for optimizing feature selection: ant colony optimization and genetic algorithms. We found that for many types of rationale, models built with an optimized feature set outperformed those built using all the document features.
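
To make the idea of evolutionary feature selection concrete, here is a minimal sketch of a genetic algorithm that searches over feature subsets. It is not the authors' implementation: the toy data, the "predict rationale if any selected feature fires" rule, and the names `ga_select_features` and `accuracy` are all assumptions made for illustration, and the abstract does not say which classifier, fitness measure, or GA operators the paper used.

```python
import random

# Hypothetical toy data: each row marks which of 4 binary text features fire
# in a sentence; labels mark whether the sentence contains rationale (1) or not (0).
# Features 0 and 1 are informative by construction; 2 and 3 are noise.
ROWS = [[1, 0, 1, 0], [0, 1, 0, 0], [0, 0, 1, 1],
        [0, 0, 0, 1], [1, 1, 0, 1], [0, 0, 1, 0]]
LABELS = [1, 1, 0, 0, 1, 0]


def accuracy(mask):
    """Fitness: accuracy of a toy rule that predicts 'rationale' when any
    selected feature is present. A stand-in for a real trained classifier."""
    correct = 0
    for row, label in zip(ROWS, LABELS):
        predicted = int(any(bit and value for bit, value in zip(mask, row)))
        correct += predicted == label
    return correct / len(ROWS)


def ga_select_features(n_features, fitness, pop_size=20, generations=30,
                       mutation_rate=0.05, seed=0):
    """Evolve boolean feature masks; requires n_features >= 2."""
    rng = random.Random(seed)
    # Start from the all-features mask plus random subsets, so the result
    # can never score below the all-features baseline.
    pop = [[True] * n_features]
    pop += [[rng.random() < 0.5 for _ in range(n_features)]
            for _ in range(pop_size - 1)]
    best = max(pop, key=fitness)

    def tournament():
        # Pick two distinct individuals and keep the fitter one.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        next_pop = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(next_pop) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randrange(1, n_features)      # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [(not bit) if rng.random() < mutation_rate else bit
                     for bit in child]              # bit-flip mutation
            next_pop.append(child)
        pop = next_pop
        best = max(pop, key=fitness)
    return best


best = ga_select_features(4, accuracy)
print("selected mask:", best)
print("all features:", accuracy([True] * 4), "optimized:", accuracy(best))
```

Because the all-features mask is in the initial population and the best mask is always carried forward, the returned subset scores at least as well as the full feature set on the fitness function. In a real study the fitness would be measured on held-out data rather than on the data used for selection.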

Updated: 2020-01-30