A graph-based clustering algorithm for software systems modularization
Information and Software Technology ( IF 3.9 ) Pub Date : 2020-10-27 , DOI: 10.1016/j.infsof.2020.106469
Babak Pourasghar , Habib Izadkhah , Ayaz Isazadeh , Shahriar Lotfi

Context:

Clustering algorithms are a modularization technique used to help engineers understand and refactor large software systems. They partition a system's source code into smaller, easier-to-manage modules (clusters). The resulting decomposition is called the software system structure (or software architecture). Because the modularization problem is NP-hard, evolutionary clustering approaches such as genetic algorithms have been used to solve it. These methods, however, make little use of the information available in the artifact dependency graph extracted from the source code.

Objective:

To overcome the limitations of the existing modularization techniques, this paper presents a new modularization technique named GMA (Graph-based Modularization Algorithm).

Methods:

This paper presents a new graph-based clustering algorithm for software modularization. The depth of relationships is used to compute the similarity between artifacts, and seven new criteria are proposed to evaluate the quality of a modularization. The proposed similarity measure enables the algorithm to use graph-theoretic information.
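
The abstract does not give the similarity formula, so the following is only a minimal sketch of what a depth-based similarity on an artifact dependency graph could look like. It assumes, hypothetically, that "depth" means the number of dependency hops between two artifacts and that similarity decays as 1 / (1 + depth). The function names, the decay formula, and the file names in the toy graph are illustrative assumptions, not GMA's actual definitions.

```python
from collections import deque


def bfs_depths(graph, source):
    """Return the hop count from source to every artifact reachable from it."""
    depths = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in depths:
                depths[neighbor] = depths[node] + 1
                queue.append(neighbor)
    return depths


def depth_similarity(graph, a, b):
    """Hypothetical similarity: 1 / (1 + depth), or 0.0 if b is unreachable from a."""
    depth = bfs_depths(graph, a).get(b)
    return 0.0 if depth is None else 1.0 / (1.0 + depth)


# Toy dependency graph with made-up file names: an edge a -> b means a depends on b.
deps = {
    "ui.c": ["layout.c", "net.c"],
    "layout.c": ["util.c"],
    "net.c": ["util.c"],
    "util.c": [],
}

direct = depth_similarity(deps, "ui.c", "layout.c")    # one hop
indirect = depth_similarity(deps, "ui.c", "util.c")    # two hops
unrelated = depth_similarity(deps, "util.c", "ui.c")   # no path in this direction
```

Under these assumptions a direct dependency scores higher than a transitive one, which is the kind of graph-derived signal that a purely fitness-driven evolutionary search does not use directly.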

Results:

To demonstrate the applicability of the proposed algorithm, ten Mozilla Firefox folders covering different domains and functions, along with four other applications, are selected. The experimental results demonstrate that the proposed algorithm produces modularizations closer to the human expert’s decomposition (i.e., the directory structure) than other existing algorithms.
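
The abstract does not say how closeness to the directory structure is measured, and the paper's seven criteria are not listed here. As a rough illustration only, the sketch below uses the Rand index, a standard pairwise-agreement measure that is substituted here as an assumption and is not claimed to be the paper's metric. The file paths and cluster labels are made up.

```python
from itertools import combinations


def rand_index(predicted, reference):
    """Fraction of artifact pairs on which the two partitions agree."""
    pairs = list(combinations(sorted(reference), 2))
    if not pairs:
        return 1.0
    agree = sum(
        (predicted[a] == predicted[b]) == (reference[a] == reference[b])
        for a, b in pairs
    )
    return agree / len(pairs)


# Reference decomposition: each (hypothetical) file labeled by its directory.
by_directory = {
    "dom/a.cpp": "dom",
    "dom/b.cpp": "dom",
    "net/c.cpp": "net",
    "net/d.cpp": "net",
}

# A hypothetical algorithm output that misplaces net/d.cpp into cluster 0.
by_algorithm = {
    "dom/a.cpp": 0,
    "dom/b.cpp": 0,
    "net/c.cpp": 1,
    "net/d.cpp": 0,
}

score = rand_index(by_algorithm, by_directory)
```

A score of 1.0 means the clustering groups every pair of files exactly as the directories do; the single misplaced file here lowers agreement to 3 of 6 pairs.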

Conclusion:

The proposed algorithm is expected to help a software designer in the software reverse engineering process to extract easy-to-manage and understandable modules from source code.




Updated: 2020-10-30