Pride: Prioritizing Documentation Effort Based on a PageRank-Like Algorithm and Simple Filtering Rules, IEEE Transactions on Software Engineering


IEEE Transactions on Software Engineering (IF 6.5), Pub Date: 2022-04-29, DOI: 10.1109/tse.2022.3171469
Weifeng Pan, Hua Ming, Dae-Kyoo Kim, Zijiang Yang


Code documentation can be helpful in many software quality assurance tasks. However, due to resource constraints (e.g., time, human resources, and budget), programmers often cannot document their work completely or in a timely manner. In the literature, two approaches (one supervised and one unsupervised) have been proposed to prioritize documentation effort so that the most important classes are documented first. However, both have limitations. The supervised approach relies heavily on a labeled data set that is difficult to obtain, and it has a high computation cost. The unsupervised approach depends on a graph representation of the software structure that is inaccurate because it neglects many important couplings between classes. In this paper, we propose an improved approach, named Pride, to prioritize documentation effort. First, Pride uses a weighted directed class coupling network to precisely describe classes and their couplings. Second, we propose a PageRank-like algorithm to quantify the importance of classes in the whole class coupling network. Third, we use a set of software metrics to quantify source code complexity and further propose a simple, easy-to-operate filtering rule. Fourth, we sort all the classes in descending order of importance and use the filtering rule to filter out unimportant classes. Finally, a threshold $k$ is applied, and the top-$k$% ranked classes are identified as the important classes to be documented first. Empirical results on a set of nine software systems show that, according to the average ranking of the Friedman test, Pride is superior to the existing approaches on the whole data set.
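The pipeline outlined in the abstract (coupling network, PageRank-like scoring, filtering, top-$k$% cut) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the actual weighting scheme, metrics, or filtering rule, so the code assumes a standard weighted PageRank, a single hypothetical complexity metric with a hypothetical cutoff `min_complexity`, and a top-$k$% cut taken over the classes that survive filtering. All class names and numbers are made up for illustration.

```python
from math import ceil


def weighted_pagerank(edges, damping=0.85, iterations=100, tol=1e-10):
    """Standard weighted PageRank (an assumption, not the paper's exact formula).

    edges maps (source_class, target_class) -> positive coupling weight.
    """
    nodes = sorted({name for edge in edges for name in edge})
    out_weight = {n: 0.0 for n in nodes}
    for (src, _), weight in edges.items():
        out_weight[src] += weight

    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by classes without outgoing couplings is spread uniformly.
        dangling = sum(rank[n] for n in nodes if out_weight[n] == 0.0)
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {n: base for n in nodes}
        for (src, dst), weight in edges.items():
            new_rank[dst] += damping * rank[src] * weight / out_weight[src]
        delta = sum(abs(new_rank[n] - rank[n]) for n in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


def prioritize(edges, complexity, min_complexity, k_percent):
    """Sort by importance, drop low-complexity classes, keep the top k%."""
    rank = weighted_pagerank(edges)
    # Descending importance; ties are broken by class name for determinism.
    ordered = sorted(rank, key=lambda n: (-rank[n], n))
    # Hypothetical filtering rule: discard classes below a complexity cutoff.
    kept = [n for n in ordered if complexity.get(n, 0) >= min_complexity]
    # Assumption: k% is taken over the classes that remain after filtering.
    return kept[:ceil(len(kept) * k_percent / 100)]


# Made-up coupling weights (e.g. number of references between classes).
EXAMPLE_EDGES = {
    ("A", "Core"): 3,
    ("A", "Helper"): 1,
    ("B", "Core"): 2,
    ("B", "Helper"): 1,
    ("Core", "Helper"): 1,
}
# Made-up complexity values (e.g. lines of code per class).
EXAMPLE_COMPLEXITY = {"A": 120, "B": 40, "Core": 300, "Helper": 8}

if __name__ == "__main__":
    print(prioritize(EXAMPLE_EDGES, EXAMPLE_COMPLEXITY, min_complexity=10, k_percent=50))
```

In this toy network `Helper` receives the highest score because every other class points to it, yet it is removed by the complexity cutoff. That is one plausible reading of why a filtering step follows the ranking step: a trivially small class can be heavily used without being worth documenting first.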


Updated: 2022-04-29
