Custom-tailored clone detection for IEC 61131-3 programming languages
Introduction
During the evolution of software systems, code cloning is a common practice (Mondal et al., 2020) for reusing software artifacts. To cope with an increasing market for custom-tailored software systems, developers often follow a clone-and-own approach where existing variants are copied and altered to create new variants (Fischer et al., 2014). It is an unsustainable approach that reduces the overall software quality due to bug propagation, increases the maintenance effort, and hinders further reuse (Deissenboeck et al., 2010). In the field of clone detection, research focuses on high-level programming languages such as Java or C (Mondal et al., 2020, Ain et al., 2019, Roy and Cordy, 2007, Bellon et al., 2007). In the domain of automated production systems (aPS), code cloning is a common practice due to frequently changing products, customer requirements, and altered regulatory guidelines (Durdik et al., 2012, Legat et al., 2013).
The state-of-the-art languages for programming programmable logic controller software are defined in the IEC 61131-3 standard (International Electrotechnical Commission, 2009). The standard comprises five programming languages: the two textual languages Structured Text (ST) and Instruction List (IL), and the three graphical languages Sequential Function Chart (SFC), Ladder Diagram (LD), and Function Block Diagram (FBD). It allows the nesting of languages, such as using ST in FBD implementations. Control program developers can select the language that is best suited for a particular task, significantly increasing their productivity. Programs implemented according to IEC 61131-3 are divided into program organization units (POUs), the smallest software units in a program. Such systems are often reused by copying the whole system and then modifying it to create new and independent system variants (referred to as clone-and-own). Furthermore, developers often reuse single POUs within a system (referred to as classical code cloning). For example, the POU that controls a sorting conveyor can occur several times in a production system (Vogel-Heuser and Ocker, 2018, Bougouffa et al., 2019).
To restore the sustainable development of cloned system variants, they need to be re-engineered into a structured reuse approach, such as a software product line (SPL) (Northrop, 2002, Fischer et al., 2018). Therefore, a detailed analysis of system variants concerning code clones within a variant (intra clone detection) and commonalities and differences between cloned variants (inter clone detection) is essential. It serves as a first step to re-engineer system variants into an SPL (Breivold et al., 2008, Krueger, 2001) and to refactor code clones into reusable and configurable software artifacts such as library components (Vogel-Heuser et al., 2018).
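The distinction between intra and inter clone detection can be illustrated with a minimal Python sketch. It is not the model-based approach of this paper: it assumes each variant is simply a mapping from POU names to lists of statement strings, and it uses the standard library's `difflib.SequenceMatcher` as a stand-in similarity measure. The names `similarity`, `intra_clones`, `inter_matches`, the 0.8 threshold, and the toy variants are all hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations, product

# Hypothetical toy representation: POU name -> list of statements.
Variant = dict[str, list[str]]


def similarity(a: list[str], b: list[str]) -> float:
    """Stand-in similarity in [0, 1] based on matching statement sequences."""
    return SequenceMatcher(None, a, b).ratio()


def intra_clones(variant: Variant, threshold: float = 0.8):
    """Compare every pair of POUs inside ONE variant (intra clone detection)."""
    hits = []
    for p, q in combinations(sorted(variant), 2):
        score = similarity(variant[p], variant[q])
        if score >= threshold:
            hits.append((p, q, score))
    return hits


def inter_matches(left: Variant, right: Variant, threshold: float = 0.8):
    """Compare every POU of one variant with every POU of ANOTHER variant."""
    hits = []
    for p, q in product(sorted(left), sorted(right)):
        score = similarity(left[p], right[q])
        if score >= threshold:
            hits.append((p, q, score))
    return hits


# Assumed example data: variant_b was cloned from variant_a and altered.
variant_a: Variant = {
    "ConveyorA": ["IF start THEN", "motor := TRUE;", "END_IF;"],
    "ConveyorB": ["IF start THEN", "motor := TRUE;", "END_IF;"],
    "Crane": ["pos := pos + 1;"],
}
variant_b: Variant = {
    "ConveyorA": ["IF start THEN", "motor := TRUE;", "lamp := TRUE;", "END_IF;"],
    "Crane": ["pos := pos + 1;"],
}

if __name__ == "__main__":
    print("intra:", intra_clones(variant_a))
    print("inter:", inter_matches(variant_a, variant_b))
```

In this toy setting, the intra comparison reports the two identical conveyor POUs as a clone pair, while the inter comparison shows which POUs of the first variant reappear, unchanged or slightly modified, in the second.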
We propose a fully customizable comparison approach for IEC 61131-3 in order to support the detection of clones within a variant (intra variant clone detection) and between variants (inter variant clone detection). This supports developers in tracing clones within and between variants, which helps them create reusable components within systems and migrate system variants into an SPL, respectively. Specifically, the contributions of this paper are as follows:
- A model-based, fine-grained, and fully customizable approach for the detection of code clones within variants (intra clone detection) and analysis of commonalities and differences between cloned variants (inter clone detection) of IEC 61131-3 programs composed of arbitrarily nested sub languages.
- Publicly available prototype implementation called Variability Analysis Toolkit (VAT), evaluation data and results.
- A mutation framework for the evaluation of clone detection tools for IEC 61131-3 systems.
- Detailed evaluation and analysis of the approach by applying it to a large clone data set created using the mutation framework, as well as to the PPU and xPPU case study systems.
The remainder of this paper is structured as follows: Section 2 provides relevant background on the IEC 61131-3 standard with the utilized programming languages and describes code clones and variability analysis. Section 3 presents our approach for detecting clones within and between variants. In Section 4, we explain the implementation of our approach as a tool called VAT. In Section 5, we evaluate our approach by performing qualitative and quantitative analyses. Finally, we discuss related work in Section 6 and conclude in Section 7.
Background
This section provides background on IEC 61131-3 control software, types of code clones, and variability analysis.
Clone detection approach
This section presents our approach for the detection of code clones in IEC 61131-3 control software. We first explain the general comparison approach and then each step in more detail in the following sections. Fig. 7 illustrates the process for the detection of code clones.
In the first step, the control software is parsed ①. The parsing process transforms a PLCOpenXML file into a model based on a set of meta-models. We created these meta-models as an abstraction of the IEC 61131-3 standard to
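As a rough illustration of the parsing step, the following Python sketch reads a PLCopen XML document and lists its POUs together with the language of each body. It is not VAT's parser and does not use the paper's meta-models. The `Pou` record, the `list_pous` function, and the heavily simplified sample document are assumptions made for this example, and real PLCopen XML exports contain considerably more structure.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pou:
    """Hypothetical, minimal stand-in for a POU model element."""
    name: Optional[str]
    pou_type: Optional[str]
    language: Optional[str]


def local_name(tag: str) -> str:
    """Strip an XML namespace prefix of the form '{uri}name'."""
    return tag.rsplit("}", 1)[-1]


def list_pous(xml_text: str) -> list[Pou]:
    """Collect all <pou> elements and the language element inside <body>."""
    root = ET.fromstring(xml_text)
    pous = []
    for elem in root.iter():
        if local_name(elem.tag) != "pou":
            continue
        body = next((c for c in elem if local_name(c.tag) == "body"), None)
        lang_elem = next(iter(body), None) if body is not None else None
        language = local_name(lang_elem.tag) if lang_elem is not None else None
        pous.append(Pou(elem.get("name"), elem.get("pouType"), language))
    return pous


# Simplified sample document (an assumption, not taken from the paper).
SAMPLE = """<project xmlns="http://www.plcopen.org/xml/tc6_0201">
  <types><pous>
    <pou name="Conveyor" pouType="functionBlock">
      <body><ST><xhtml xmlns="http://www.w3.org/1999/xhtml">motor := start;</xhtml></ST></body>
    </pou>
    <pou name="Main" pouType="program">
      <body><FBD/></body>
    </pou>
  </pous></types>
</project>"""

if __name__ == "__main__":
    for pou in list_pous(SAMPLE):
        print(pou)
```

Matching on local element names keeps the sketch independent of the exact schema version used by a particular export.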
Implementation
In order to evaluate our approach, we implemented it in a publicly available tool we call the Variability Analysis Toolkit (VAT).
Evaluation
We evaluated different aspects of our clone detection approach. The correctness of the results, measured in precision and recall, is crucial for detecting code clones within software variants and analyzing commonalities and differences between software variants. Otherwise, incorrectly matched elements inevitably compromise subsequent steps such as refactoring code clones into library components or consolidating a set of variants into an SPL. Thus, analyzing the results concerning their correctness
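For readers unfamiliar with the two measures, precision and recall of a set of reported clone pairs can be computed against a set of known clone pairs as in the following Python sketch. The function name `precision_recall`, the representation of a clone pair as an unordered `frozenset` of two element names, and the convention of returning 1.0 for an empty denominator are assumptions of this example, not details taken from the evaluation.

```python
def precision_recall(reported: set, expected: set) -> tuple[float, float]:
    """Return (precision, recall) of reported clone pairs.

    precision = |reported ∩ expected| / |reported|
    recall    = |reported ∩ expected| / |expected|
    An empty denominator yields 1.0 (an assumed convention).
    """
    true_positives = len(reported & expected)
    precision = true_positives / len(reported) if reported else 1.0
    recall = true_positives / len(expected) if expected else 1.0
    return precision, recall


def pair(a: str, b: str) -> frozenset:
    """Unordered clone pair, so (a, b) and (b, a) compare equal."""
    return frozenset((a, b))


# Hypothetical ground truth and tool output.
expected = {pair("A", "B"), pair("C", "D"), pair("E", "F"), pair("G", "H")}
reported = {pair("B", "A"), pair("A", "C")}

if __name__ == "__main__":
    print(precision_recall(reported, expected))
```

Here one of the two reported pairs is correct and one of the four known pairs is found, so a tool with this output would be fairly precise on what it reports but would miss most clones.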
Related work
Clone-and-own is a common and popular reuse strategy in the software development domain. Over the past decades, the interest in code clones has also been reflected in a wealth of research. In general, clone detection aims to reduce the maintenance effort of large software systems by tracing clones or transferring a software system into an SPL (Juergens et al., 2009). Both activities require a detailed analysis of the respective software systems. Most of the research focused on detecting code clones in
Conclusion and future work
With an increasing interest in variant variety for industrial products, variability has become a key factor of many software systems. In the domain of aPS and their control, software often remains in use for decades. To reduce such a system’s maintenance effort, the detection of clones and analysis of variability is crucial. On the one hand, code clones can be refactored into reusable artifacts such as library components. And on the other hand, the variability analysis can support
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This work has been supported by the DFG (German Research Foundation) (SCHA 1635/12-1) and (VO 937/31-1).
References (60)
- et al. Key maturity indicators for module libraries for PLC-based control software in the domain of automated production systems. IFAC-PapersOnLine (2018)
- et al. Maintainability and evolvability of control software in machine and plant manufacturing—An industrial survey. Control Eng. Pract. (2018)
- et al. A systematic review on code clone detection. IEEE Access (2019)
- et al. Models are code too: Near-miss clone detection for Simulink models
- et al. Near-miss model clone detection for Simulink models
- et al. A comprehensive study of software product line frameworks. Int. J. Comput. Appl. (2016)
- et al. Difference and union of models
- TwinCAT3 by Beckhoff Automation GmbH & Co. KG (2020)
- et al. Comparison and evaluation of clone detection tools. IEEE Trans. Softw. Eng. (2007)
- Beuche, D., 2004. Variants and Variability Management with pure::variants. In: 3rd Software Product Line Conference...
- Visualization of variability analysis of control software from industrial automation systems
- Clone detection in automotive model-based development
- Towards sustainability guidelines for long-living software systems
- Similarity analysis of control software using graph mining
- A qualitative study of variability management of control software for industrial automation systems
- Enhancing clone-and-own with systematic reuse for developing software variants
- The ECCO tool: Extraction and composition for clone-and-own
- Reengineering workflow for planned reuse of IEC 61131-3 legacy software
- Simian-similarity analyser
- Analysis of Industrial Control System Software to Detect Semantic Clones
- Automatische Synthese von Familienmodellen durch Analyse von block-basierten Funktionsmodellen
- 61131-3: Programmable controllers – part 3: Programming languages
- Programmable logic controllers – part 3: Programming languages
Kamil Rosiak is a research assistant at the Institute of Software Engineering and Automotive Informatics.
Before, he worked in the field of electronic intelligence and received his master’s degree in 2019. His research interests are on reverse-engineering of legacy software systems and the analysis of programming languages.
Alexander Schlie graduated in computer science at the TU Braunschweig, Germany and received his M.Sc. in 2016.
He works as a research assistant at the Institute of Software Engineering and Automotive Informatics.
His research interests are on reverse-engineering variability information from legacy systems to allow for their restructuring and migration towards software product lines.
Lukas Linsbauer is currently a postdoctoral researcher at the Institute of Software Engineering and Automotive Informatics at the Technical University of Braunschweig in Germany.
His research interests include software product lines, traceability, and version control systems. He received his Ph.D. in Computer Science (Software Engineering) in 2016 from the Johannes Kepler University (JKU) in Linz (Austria) where he also spent time as a postdoctoral researcher at the Institute for Software Systems Engineering (ISSE) and the Christian Doppler Laboratory (CDL) for Monitoring and Evolution of Very-Large-Scale Software Systems (MEVSS).
Birgit Vogel-Heuser is a Professor and Director of the Institute of Automation and Information Systems at Technical University of Munich.
Her main research interests are systems and software engineering, and modeling of distributed and reliable embedded systems for automation and automated production systems.
Ina Schaefer is chair of the Institute of Software Engineering and Automotive Informatics at the Technische Universität Braunschweig.
She received her Ph.D. degree from the TU Kaiserslautern and worked as a postdoc at the Chalmers University of Technology in Gothenburg, Sweden.
Her research interests are verification and testing methods for variant-rich and evolving software systems.