A multi-objective search based approach to identify reusable software components
Journal of Computer Languages (IF 1.7), Pub Date: 2019-04-16, DOI: 10.1016/j.cola.2019.01.006
Amit Rathee, Jitender Kumar Chhabra

Component-based software development (CBSD) is one of the most recent trends in the software development industry, and its success depends largely on the quality of the software components. Good-quality software components are internally strongly cohesive and independent of other components. Using such components speeds up software development and reduces future maintenance effort. Such components can be identified in different software repositories and reused whenever needed. Hence, proper identification of such reusable components is a promising area of research, and it is the target of this paper. This paper models the Reusable Software Component Identification (RSCI) problem as a search-based multi-objective problem in order to identify optimized reusable software components from the object-oriented (OO) source code of a software system. For the OO software paradigm, we consider a component to be an individual class or a group of connected classes that can be reused with minimal modification. To identify such components, three types of relationship are proposed in this paper, namely (1) Frequent Usage Pattern (FUP) based cohesion, (2) semantic relatedness based cohesion, and (3) co-change based coupling. The proposed approach optimizes by simultaneously maximizing both types of cohesion and minimizing the coupling of a given component. The FUP based cohesion maximization relies on a newly proposed cohesion metric called PatternCohesion, which is computed using FUP information extracted from a given software element (a class or interface in an object-oriented language). The FUP of a class comprises those member variables of all classes of the software that are directly or indirectly accessed by member functions defined in the class under consideration. The PatternCohesion metric helps to measure the functional relatedness among different pairs of software elements (classes).
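The FUP definition above (member variables reached directly or indirectly from a class's member functions) can be illustrated with a small reachability sketch. This is not the paper's implementation: the function name `frequent_usage_pattern`, the three input maps, and the toy classes are all hypothetical, and the abstract does not give the PatternCohesion formula, so none is attempted here.

```python
from collections import deque

def frequent_usage_pattern(cls, methods_of, reads, calls):
    """Collect member variables reachable from the methods defined in `cls`.

    All three inputs are assumed shapes, not the paper's data model:
      methods_of: class name  -> list of method names defined in that class
      reads:      method name -> set of member variables it accesses directly
      calls:      method name -> set of methods it invokes
    """
    seen = set()   # methods already visited (guards against call cycles)
    fup = set()    # accumulated member variables
    queue = deque(methods_of.get(cls, []))
    while queue:
        method = queue.popleft()
        if method in seen:
            continue
        seen.add(method)
        fup |= reads.get(method, set())                # direct accesses
        queue.extend(calls.get(method, set()) - seen)  # indirect, via callees
    return fup

# Toy system: Order.total() calls Item.price(), which reads two Item fields.
methods_of = {"Order": ["Order.total"], "Item": ["Item.price"]}
reads = {
    "Order.total": {"Order.items"},
    "Item.price": {"Item.unit_price", "Item.qty"},
}
calls = {"Order.total": {"Item.price"}}

fup_order = frequent_usage_pattern("Order", methods_of, reads, calls)
fup_item = frequent_usage_pattern("Item", methods_of, reads, calls)
```

In this toy system the FUP of `Order` includes the `Item` fields because they are reached indirectly through the call to `Item.price`.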
The semantic relatedness based cohesion maximization relies on TF-IDF based cosine similarity, computed over tokens extracted from three main parts of the source code of a given software element: the class or interface declaration statement, the data types of its member variables, and its member function signatures. The co-change based coupling is computed from the change history of the underlying software system using the well-known data mining metric called Support. Finally, reusable components are identified as different cohesive groups (each consisting of one or more connected classes) using the NSGA-III algorithm. The proposed approach is empirically evaluated on six open source software systems belonging to different domains. The need to consider all three types of relations is established by comparing their combined performance against that of the relations taken individually and two at a time. The results indicate higher quality, measured in terms of reusability characteristics, for the identified software components.
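The two pairwise measures named above can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the authors' code: the helper names (`tfidf_vectors`, `cosine`, `support`), the smoothed IDF variant, and the toy tokens and commits are all hypothetical. The NSGA-III search step is not shown.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """docs: element name -> list of tokens. Returns name -> {token: weight}.

    Uses one common smoothed IDF variant (an assumption; the abstract
    does not say which TF-IDF weighting the paper uses).
    """
    n = len(docs)
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))  # document frequency counts each token once
    vectors = {}
    for name, tokens in docs.items():
        tf = Counter(tokens)
        vectors[name] = {
            t: (c / len(tokens)) * (math.log((1 + n) / (1 + df[t])) + 1)
            for t, c in tf.items()
        }
    return vectors

def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(
        sum(w * w for w in v.values())
    )
    return dot / norm if norm else 0.0

def support(commits, a, b):
    """Fraction of commits in which classes a and b changed together."""
    if not commits:
        return 0.0
    both = sum(1 for changed in commits if a in changed and b in changed)
    return both / len(commits)

# Hypothetical tokens taken from declarations, field types and signatures.
docs = {
    "Cart": ["cart", "item", "list", "add", "item"],
    "Basket": ["basket", "item", "list", "add", "item"],
    "Logger": ["logger", "file", "write"],
}
vecs = tfidf_vectors(docs)
sim_cart_basket = cosine(vecs["Cart"], vecs["Basket"])
sim_cart_logger = cosine(vecs["Cart"], vecs["Logger"])

# Hypothetical change history: each set lists the classes touched by a commit.
commits = [{"Cart", "Basket"}, {"Cart"}, {"Cart", "Basket", "Logger"}, {"Logger"}]
sup_cart_basket = support(commits, "Cart", "Basket")
sup_cart_logger = support(commits, "Cart", "Logger")
```

In this toy data, `Cart` and `Basket` share most tokens and change together in half of the commits, so they score higher on both measures than `Cart` and `Logger`. A search procedure would then trade the two cohesion signals off against the coupling signal when grouping classes.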

Updated: 2019-04-16
