Synchronizing billion-scale automata
Introduction
Designing and developing a large-scale, correct, and complex system is not an easy task. Several validation techniques have been proposed to build some confidence in the developed systems, but testing stands out as one of the most practical ones [23]. To automate the testing process, there has been much interest in testing with Finite State Machines (FSMs), e.g., see [8], [13], [33], [14], [25], [29]. To employ FSMs for testing, one needs to bring the system under test (SUT) to a particular state. This is easy when the SUT has a trusted reset input. However, such a reset input is not always available.
A synchronizing sequence (also known as a reset sequence or a reset word) for an FSM is a sequence of inputs that, when applied to the FSM, brings the machine to one particular state regardless of its initial state. A synchronizing sequence is therefore a compound reset input for the machine. The shorter the synchronizing sequence, the quicker the synchronization process. Hence, shorter reset sequences are desirable in terms of synchronization time and energy spent. However, the problem of finding a shortest synchronizing sequence is NP-hard [11]. It is conjectured that for a synchronizing automaton with n states, the length of the shortest synchronizing sequence is at most $(n-1)^2$, which is known as the Černý Conjecture in the literature [6], [7]. Posed half a century ago, the conjecture is still open, but it has recently been verified for all binary automata with at most 12 states and all ternary automata with at most 8 states by using high-performance computing [21]. Furthermore, it has been shown that the probability that the conjecture does not hold for a random synchronizing binary automaton is exponentially small in terms of the number of states [5].
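To make the definition concrete, the following minimal sketch checks whether a given word synchronizes an automaton. The table layout `delta[s][a]` and the helper names `apply_word`, `is_synchronizing_word`, and `cerny_automaton` are assumptions introduced only for this illustration. The example uses the classical Černý automaton, whose shortest synchronizing word is known to have length exactly $(n-1)^2$.

```python
from typing import List, Sequence, Set

# Hypothetical layout: delta[s][a] is the state reached from state s on letter a.
Automaton = List[List[int]]


def apply_word(delta: Automaton, states: Set[int], word: Sequence[int]) -> Set[int]:
    """Return the image of a set of states under a word, letter by letter."""
    current = set(states)
    for letter in word:
        current = {delta[s][letter] for s in current}
    return current


def is_synchronizing_word(delta: Automaton, word: Sequence[int]) -> bool:
    """True if the word sends every state of the automaton to the same state."""
    all_states = set(range(len(delta)))
    return len(apply_word(delta, all_states, word)) == 1


def cerny_automaton(n: int) -> Automaton:
    """Letter 0 rotates the states; letter 1 fixes every state except n-1 -> 0."""
    return [[(s + 1) % n, 0 if s == n - 1 else s] for s in range(n)]


n = 4
# The word (b a^(n-1))^(n-2) b, with a = 0 and b = 1, has length (n-1)^2.
word = ([1] + [0] * (n - 1)) * (n - 2) + [1]
print(len(word), is_synchronizing_word(cerny_automaton(n), word))  # prints: 9 True
```

Dropping the last letter of the word leaves two active states, so the check then returns `False`.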
The motivation to study synchronizing sequences comes not only from the testing domain but also from different fields including automata theory, robotics, bio-computing, set theory, propositional calculus, model-based testing, and many more, e.g., [17], [32], [2], [4], [27], [26]. For a survey of the applications of synchronizing words, together with theoretical results related to synchronizing automata, we refer the reader to [34]. In this work, we focus on large-scale automata and FSMs since typically these automata/FSM models are not manually designed in practice. Instead, a high-level formalism, such as SDL [16], StateCharts [12], UML [24], SystemVerilog [15], etc., is used for the design task. Analysis tools extract the underlying automata or FSMs by flattening the hierarchy, concurrency, data, and the binary encoding (e.g., in the case of hardware description languages). This flattening generally results in an enormous size for the underlying automaton/FSM. The well-known state space explosion problem in model checking [9] is just one famous example of the scalability problems faced due to such flattening. Recently, researchers have focused on finding synchronizing sequences for large-scale automata, such as partial automata, by using high-performance computing hardware such as GPUs [31].
Due to the hardness of finding a shortest sequence, there exist heuristics in the literature, known as synchronizing heuristics, to compute short synchronizing words. Among such heuristics are Greedy [11], Cycle [30], SynchroP [28], SynchroPL [28], and FastSynchro [22]. In terms of complexity, these heuristics are ordered as follows: Greedy/Cycle with time complexity $O(n^3 + pn^2)$, FastSynchro with time complexity $O(pn^4)$, and finally SynchroP/SynchroPL with time complexity $O(n^5 + pn^2)$, where n is the number of states and p is the size of the alphabet.
The fastest synchronizing heuristics, Greedy and Cycle, are also the earliest heuristics that appeared in the literature. Therefore, Greedy and Cycle are usually considered as a baseline to evaluate the quality and the performance of novel heuristics. Newer heuristics do generate shorter synchronizing words, but by performing a more complex analysis, which implies a substantial runtime increase. The speed of Greedy and Cycle is unmatched to date. Yet, it has recently been shown that they can be implemented in a much faster way via various optimizations [20]. More optimizations have been proposed for the slower but better heuristics [1], and their parallelization has also been studied in the literature [18].
All the aforementioned heuristics work in two phases; in the first phase, an auxiliary data structure is generated to summarize the shortest sequences that merge state pairs. In the second phase, by using this data structure, the sequence is constructed by concatenating some of these pairwise shortest sequences. Hence, the memory consumption of the heuristics is at least quadratic in terms of n. This complexity makes all the heuristics impractical even for automata with hundreds of thousands of states.
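The two-phase structure can be sketched for the classical Greedy heuristic as follows. This is a minimal sequential illustration under stated assumptions, not the implementation evaluated in the paper: the table layout `delta[s][a]` and the names `pair_key`, `build_pair_table`, and `greedy` are hypothetical. The pair table built in the first phase has one entry per unordered state pair, which is where the quadratic memory cost comes from.

```python
from collections import deque
from itertools import combinations


def pair_key(s, t):
    """Canonical (sorted) key for an unordered pair of states."""
    return (s, t) if s <= t else (t, s)


def build_pair_table(delta):
    """Phase 1: backward BFS from the singletons over all unordered pairs.

    Returns dist (length of a shortest merging word per pair) and
    first (first letter of such a word). Both hold O(n^2) entries.
    """
    n, p = len(delta), len(delta[0])
    # inv[a][t] lists the states s with delta[s][a] == t.
    inv = [[[] for _ in range(n)] for _ in range(p)]
    for s in range(n):
        for a in range(p):
            inv[a][delta[s][a]].append(s)
    dist = {(s, s): 0 for s in range(n)}
    first = {}
    queue = deque(dist)
    while queue:
        u, v = queue.popleft()
        for a in range(p):
            for s in inv[a][u]:
                for t in inv[a][v]:
                    key = pair_key(s, t)
                    if key not in dist:
                        dist[key] = dist[(u, v)] + 1
                        first[key] = a
                        queue.append(key)
    return dist, first


def greedy(delta):
    """Phase 2: repeatedly merge the active pair with the shortest merging word."""
    n = len(delta)
    dist, first = build_pair_table(delta)
    if len(dist) < n * (n + 1) // 2:
        return None  # some pair cannot be merged: not synchronizing
    active, word = set(range(n)), []
    while len(active) > 1:
        s, t = min(combinations(sorted(active), 2), key=lambda pr: dist[pr])
        while s != t:
            a = first[(s, t)]
            word.append(a)
            active = {delta[q][a] for q in active}
            s, t = pair_key(delta[s][a], delta[t][a])
    return word


# Usage on a 4-state Cerny automaton (letter 0 rotates, letter 1 maps 3 -> 0).
cerny = [[(s + 1) % 4, 0 if s == 3 else s] for s in range(4)]
print(greedy(cerny))
```

For n states, `dist` stores n(n+1)/2 entries, so the table alone becomes the bottleneck long before the running time does.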
In this work, we focus on synchronizing large automata with Greedy. We modify the two-phase structure by removing the first phase and offset the resulting extra overhead with high-performance parallel algorithms designed to utilize the power of multi-core CPUs or Graphics Processing Units (GPUs). For an effective and efficient parallelization, we observed the changes in the synchronization behavior and tried to utilize the CPU/GPU cores to their full potential. In our experiments, we obtained around and speedups with 16 CPU threads, on automata having and states, respectively. By utilizing a GPU after the pairwise synchronizing paths get longer, the speedups, compared to the sequential execution, are increased to and . Overall, via a hybrid solution using both CPU and GPU, we could synchronize a random automaton with states in s, states in 69.1 s, and billion states in 148.2 s where a single core execution of the same algorithm takes and seconds for and states, respectively.
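The idea of dropping the first phase can be illustrated with a small sequential sketch. It is only an assumption-laden illustration, not the parallel CPU/GPU algorithm of this work: no pair table is built, and at each iteration words are tried in order of increasing length until one reduces the number of active states. The table layout `delta[s][a]` and the names `shortest_shrinking_word` and `greedy_without_pair_table` are hypothetical.

```python
def shortest_shrinking_word(delta, active):
    """Breadth-first search over words, by increasing length, for a shortest
    word whose image of `active` has fewer states than `active` itself.

    Returns (word, image) or None if no such word exists.
    """
    p = len(delta[0])
    start = frozenset(active)
    frontier = [((), start)]
    seen = {start}  # images already reached by a word of equal or shorter length
    while frontier:
        next_frontier = []
        for word, image in frontier:
            for a in range(p):
                new_image = frozenset(delta[s][a] for s in image)
                if len(new_image) < len(start):
                    return list(word) + [a], set(new_image)
                if new_image not in seen:
                    seen.add(new_image)
                    next_frontier.append((word + (a,), new_image))
        frontier = next_frontier
    return None


def greedy_without_pair_table(delta):
    """Concatenate shortest shrinking words until one active state remains."""
    active = set(range(len(delta)))
    word = []
    while len(active) > 1:
        step = shortest_shrinking_word(delta, active)
        if step is None:
            return None  # the automaton is not synchronizing
        letters, active = step
        word.extend(letters)
    return word


# Usage on a 4-state Cerny automaton (letter 0 rotates, letter 1 maps 3 -> 0).
cerny = [[(s + 1) % 4, 0 if s == 3 else s] for s in range(4)]
print(greedy_without_pair_table(cerny))
```

The sketch trades memory for repeated search work: nothing quadratic in n is stored up front, but every iteration pays for its own search, which is the kind of overhead a parallel implementation would need to absorb.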
The rest of the manuscript is organized as follows: Section 2 presents the background and notation. The proposed parallel Greedy heuristic is presented in Section 3 and further optimizations are described in Section 4. Section 5 presents the experimental results and Section 6 concludes the paper.
Background and notation
A (complete and deterministic) automaton is defined by a triple $A = (S, \Sigma, \delta)$ where $S$ is a finite set of n states, $\Sigma$ is a finite alphabet consisting of p input letters (or simply letters), and $\delta : S \times \Sigma \to S$ is a transition function.
An element of the set $\Sigma^*$ is called a word. For a word $w \in \Sigma^*$, we use $|w|$ to denote the length of w, and $\varepsilon$ is the empty word of length 0. We extend the transition function $\delta$ to a set of states and to a word in the usual way. We have $\delta(s, \varepsilon) = s$, and for a word $w \in \Sigma^*$ and
Parallel Greedy for a billion-scale automaton
Because of its quadratic memory complexity, Greedy, like the other heuristics, cannot be executed on automata having more than states; such an execution requires at least 40 GB memory. We restructure Greedy to generate synchronizing sequences for large-scale automata. Like Greedy, the proposed approach always aims to find a shortest merging word to reduce the number of active states at each iteration. That is done in a brute-force manner, i.e., by trying all the words until one that
Further optimizations for large-scale automata synchronization
The aforementioned strategies improve the performance by leveraging the parallelism offered by multi-core and many-core architectures. Further improvement is possible with various extra optimizations. Without a careful implementation, the algorithms may suffer from false sharing, bad cache utilization, redundant computation, etc. Here we describe how a performance improvement can be obtained by applying intelligent optimization techniques.
Experimental results
We used two different architectures for the experiments. The preliminary CPU experiments, which visualize the impact of sorting and memoization, were performed on a machine running 64-bit CentOS 6.5, equipped with 64 GB RAM and a dual-socket Intel Xeon E7-4870 v2 clocked at 2.30 GHz, where each socket has 15 cores (30 in total). The main experiments were performed on a machine running 64-bit Ubuntu 16.04, equipped with 1 TB RAM and a dual-socket Intel Xeon 6152 clocked at 2.10 GHz, where each socket
Conclusion and future work
Finding synchronizing sequences for large-scale automata is important due to their applications in practice, especially in software testing. Since the shortest synchronizing sequence problem is NP-hard, various heuristics have been proposed; however, their quadratic memory complexity prevents them from scaling to large automata. We propose an iterative algorithm which mimics the Greedy heuristic in the literature, i.e., finds a shortest sequence that decreases the set cardinality, and eventually finds
CRediT authorship contribution statement
Mustafa Kemal Taş: Software, Investigation, Writing - original draft, Visualization. Kamer Kaya: Software, Writing - original draft, Writing - review & editing, Supervision. Hüsnü Yenigün: Conceptualization, Writing - original draft, Writing - review & editing, Supervision, Funding acquisition.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
This work is supported by TÜBİTAK grant #114E569. We also gratefully acknowledge the support of NVIDIA Corporation with the donation of the GPU used for this research.
References (35)
- et al., Synchronizing monotonic automata, Theor. Comput. Sci. (2004)
- et al., Algebraic synchronization criterion and computing reset words, Inf. Sci. (2016)
- et al., Reduced checking sequences using unreliable reset, Inf. Process. Lett. (2015)
- et al., Multicore and manycore parallelization of cheap synchronizing sequence heuristics, J. Parallel Distrib. Comput. (2020)
- et al., Synchronizing heuristics: Speeding up the fastest, Expert Syst. Appl. (2018)
- et al., Effective synchronizing algorithms, Expert Syst. Appl. (2012)
- et al., Construction of checking sequences based on characterization sets, Comput. Commun. (1995)
- Synchronizing finite automata with short reset words, Appl. Math. Comput. (2009)
- O.F. Altun, K. Atam, S. Karahoda, K. Kaya, Synchronizing heuristics: Speeding up the slowest, in: Testing Software and...
- et al., Synchronizing automata with a letter of deficiency 2
- Programmable and autonomous computing machine made of biomolecules, Nature
- Poznámka k homogénnym experimentom s konečnými automatmi, Matematicko-fyzikálny časopis
- On directable automata, Kybernetika
- Testing software design modeled by finite-state machines, IEEE Trans. Software Eng.
- Slowly synchronizing automata with fixed alphabet size, Inf. Comput.
- Reset sequences for monotonic automata, SIAM J. Comput.