Article

Using a Double-Core Structure to Reduce the LUT Count in FPGA-Based Mealy FSMs

by Alexander Barkalov 1,2,*, Larysa Titarenko 1,3 and Kazimierz Krzywicki 4,*

1 Institute of Metrology, Electronics and Computer Science, University of Zielona Gora, ul. Licealna 9, 65-417 Zielona Gora, Poland
2 Department of Computer Science and Information Technology, Vasyl Stus’ Donetsk National University (in Vinnytsia), 600-Richya Str. 21, 21021 Vinnytsia, Ukraine
3 Department of Infocommunication Engineering, Faculty of Infocommunications, Kharkiv National University of Radio Electronics, Nauky Avenue 14, 61166 Kharkiv, Ukraine
4 Department of Technology, The Jacob of Paradies University, ul. Teatralna 25, 66-400 Gorzow Wielkopolski, Poland
* Authors to whom correspondence should be addressed.
Electronics 2022, 11(19), 3089; https://doi.org/10.3390/electronics11193089
Submission received: 25 August 2022 / Revised: 24 September 2022 / Accepted: 26 September 2022 / Published: 27 September 2022
(This article belongs to the Special Issue Embedded Systems: Fundamentals, Design and Practical Applications)

Abstract

A method is proposed for reducing the number of look-up table (LUT) elements in logic circuits of Mealy finite state machines (FSMs). FSMs with twofold state assignment are discussed. The reduction is achieved by using two cores of LUTs to generate partial Boolean functions. One core is based on maximum binary state codes; the other core uses extended state codes. This approach reduces the number of LUTs in the block that transforms state codes. The proposed approach leads to LUT-based Mealy FSM circuits with three levels of logic blocks. Each partial function of either core is represented by a single-LUT circuit. A formal method is proposed for redistributing states between the cores. An example of synthesis explains the specifics of the proposed method, and an example of state redistribution is given. Experiments conducted with standard benchmarks show that the double-core approach produces LUT-based FSM circuits with better area and timing characteristics than circuits produced by the other investigated methods (Auto and One-hot of Vivado, JEDI, and twofold state assignment). Both the LUT counts and the maximum operating frequencies are improved. The gain in LUT counts varies from 5.74% to 36.92%, and the gain in frequency varies from 5.42% to 12.4%. These improvements come with a very small increase in power consumption (less than 1%). The advantages of the proposed approach grow as the numbers of FSM inputs and states increase.

1. Introduction

Embedded systems have penetrated all spheres of human activity [1,2,3]. Sequential devices are an integral part of almost every embedded system [4,5]. Very often, the behaviour of a sequential device is represented using the model of a Mealy finite state machine (FSM) [6,7]. In the FSM design process, designers often strive to balance the values of the three main characteristics of the resulting circuit [8,9]. These characteristics are the occupied chip area, the maximum operating frequency, and the power consumption. The values of these characteristics are closely related [10]. As a rule, the occupied chip area has the greatest influence on the values of the other characteristics [11]. The occupied chip area can be reduced using methods of structural decomposition [11]. One of these methods is twofold state assignment (TSA), which leads to three-level FSM circuits [12]. The TSA is aimed at Mealy FSMs implemented with field-programmable gate arrays (FPGAs) [13,14,15,16,17].
We chose FPGAs as the basis for the implementation of FSM circuits because they are widely used for designing various digital systems [18]. We discuss FSM circuits based on configurable logic blocks (CLBs) consisting of look-up table (LUT) elements and programmable flip-flops. At present, the largest manufacturer of FPGA chips is AMD Xilinx [19]. For this reason, we focus this paper on FPGAs of AMD Xilinx. We propose a method of reducing the numbers of LUTs (LUT counts) in the FPGA-based circuits of Mealy FSMs.
The main disadvantage of twofold FSMs is the need to convert all maximum binary state codes (MBCs) into so-called extended state codes (ESCs) [12]. For this purpose, an additional block is used to transform the maximum binary state codes into the extended state codes. This block consumes some of the FPGA chip’s internal resources (LUTs and programmable interconnections). In this paper, we propose a method that reduces the overhead connected with the transformation of state codes.
The main contribution of this paper is a novel design method aimed at reducing the LUT counts in the circuits of FPGA-based Mealy FSMs with twofold state assignment. We propose to represent an FSM circuit as a double-core structure. The first core uses maximum binary state codes for generating partial Boolean functions (PBFs). The PBFs of the second core are based on the extended state codes. The proposed approach leads to a LUT-based Mealy FSM where only a part of the maximum binary state codes is transformed into extended state codes. Our current research shows that this approach leads to FSM circuits having fewer LUTs compared to FSM circuits based on the twofold state assignment. The experimental results show that FSMs based on our method have practically the same maximum operating frequencies as equivalent FSMs with TSA.
The rest of the article is organized as follows. The second section gives the background of LUT-based Mealy FSM design. The third section discusses the related work. The main idea of the proposed method is shown in the fourth section. The fifth section includes an example of FSM synthesis using our approach. An algorithm of state redistribution is shown in the sixth section. The seventh section is devoted to the results of experiments. The article also includes a short conclusion.

2. Background of LUT-Based Mealy FSMs

A Mealy FSM is characterized by sets of states $A$, inputs $X$, outputs $Y$, state variables $T$, and input memory functions (IMFs) $D$ [6]. These sets are the following: $A = \{a_1, \ldots, a_M\}$, $X = \{x_1, \ldots, x_L\}$, $Y = \{y_1, \ldots, y_N\}$, $T = \{T_1, \ldots, T_R\}$, and $D = \{D_1, \ldots, D_R\}$. So, a Mealy FSM has $M$ states, $L$ inputs, $N$ outputs, $R$ state variables and $R$ input memory functions. The values of the first three parameters do not depend on the FSM circuit designer. The value of $R$ can be chosen by the designer. The minimum value of $R$ is determined as
$$R_{MB} = \lceil \log_2 M \rceil. \tag{1}$$
Formula (1) determines the so-called maximum binary state assignment. The maximum value of $R$ corresponds to the so-called one-hot state assignment: $R_{OH} = M$ [20].
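As a quick illustration of Formula (1) and of the one-hot bound, the following minimal Python sketch computes both register widths. The function names are hypothetical and chosen only for this example; the ceiling of the logarithm is computed with integer arithmetic to avoid floating-point rounding.

```python
def min_state_bits(num_states: int) -> int:
    """Formula (1): R_MB = ceil(log2(M)) for M >= 1, using integers only."""
    if num_states < 1:
        raise ValueError("an FSM needs at least one state")
    # For M >= 1, ceil(log2(M)) equals the bit length of M - 1.
    return (num_states - 1).bit_length()


def one_hot_state_bits(num_states: int) -> int:
    """One-hot assignment: R_OH = M, one state variable per state."""
    return num_states


if __name__ == "__main__":
    # An FSM with M = 9 states needs 4 bits in binary and 9 bits in one-hot.
    print(min_state_bits(9), one_hot_state_bits(9))
```

Any value of $R$ between these two bounds is a legal choice for the designer.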
The state variables $T_r \in T$ are used for creating state codes $K(a_m)$. An input memory function $D_r \in D$ sets the binary value of the $r$-th bit of the code $K(a_m)$. To keep state codes, a special register RG is used. The RG consists of $R$ flip-flops controlled by two pulses, $Start$ and $Clock$ [21]. The pulse $Start$ loads the code $K(a_1)$ of the initial state $a_1 \in A$ into RG. The synchronization pulse $Clock$ allows loading a state code into RG. This code is determined by the values of the IMFs. We discuss the case when the RG consists of flip-flops with informational inputs of D type. This is the most popular type of flip-flop used in FPGA-based design [18].
In this article, we discuss the case when the internal resources of an FPGA chip are used for implementing FSM circuits. These resources include LUTs, flip-flops, programmable interconnections, a synchronization tree, and programmable input-outputs [22,23]. The LUTs and flip-flops are combined into CLBs.
A LUT is a block having $S_L$ inputs and a single output [20,24]. A LUT may implement an arbitrary Boolean function with no more than $S_L$ arguments. The value of $S_L$ is rather small [22]. If the number of arguments of a Boolean function exceeds $S_L$, then it is necessary to combine several LUTs. It is quite possible that a function is represented by a multi-CLB circuit. In this case, it is necessary to diminish the number of LUTs and their levels in the corresponding circuit [25,26]. In this article, we use the symbol LUTer to show that the corresponding logic block includes LUTs, flip-flops and interconnections.
An FSM logic circuit is represented by the following systems of Boolean functions (SBFs) [9]:
$$D = D(T, X); \tag{2}$$
$$Y = Y(T, X). \tag{3}$$
The SBF (2) represents the function of transitions; the SBF (3) represents the function of outputs [6]. The SBFs (2) and (3) determine the structural diagram of the P Mealy FSM (Figure 1) [6].
In P FSMs, the block LUTerT is the block of IMFs. This block implements the SBF (2) and loads the next state code into RG. The flip-flops of RG are distributed among the CLBs that contain the LUTs of LUTerT. The flip-flops of RG are controlled by the pulses $Start$ and $Clock$. The block LUTerY is the block of output logic implementing the SBF (3).
Obviously, Functions (2) and (3) depend on state variables $T_r \in T$ and FSM inputs $x_l \in X$. Let a function $f_j \in D \cup Y$ depend on $R_j \le R$ state variables and $L_j \le L$ inputs. If the condition
$$R_j + L_j \le S_L \tag{4}$$
holds, then the corresponding logic circuit consists of a single LUT. If condition (4) holds for each function $f_j$ ($j \in \{1, \ldots, R + N\}$), then the FSM circuit includes exactly $R + N$ LUTs. Such a circuit is single-level. This is the best possible solution, providing minimum values of the required chip area, power consumption and cycle time (in other words, the maximum value of operating frequency).
However, FSMs can have up to 10 state variables and 30 inputs [6]. At the same time, modern LUTs have $S_L = 6$ inputs. So, it is quite possible that condition (4) will be violated for at least one function $f_j \in D \cup Y$. In this case, it is necessary to use various strategies to optimize the characteristics of an FSM circuit. Our current paper deals with the area reduction problem. Let us analyze some approaches used to solve this problem.
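Condition (4) is easy to check mechanically. The sketch below is only an illustration: the function name and argument names are hypothetical, and the argument counts passed in the example are the extreme values quoted above rather than data from a particular benchmark.

```python
def fits_single_lut(num_state_vars: int, num_inputs: int, lut_inputs: int = 6) -> bool:
    """Condition (4): a function of R_j state variables and L_j inputs
    fits into one LUT if R_j + L_j <= S_L."""
    return num_state_vars + num_inputs <= lut_inputs


if __name__ == "__main__":
    # A function of 4 state variables and 2 inputs fits into a 6-input LUT.
    print(fits_single_lut(4, 2))
    # A function of 10 state variables and 30 inputs does not.
    print(fits_single_lut(10, 30))
```

When the check fails for some function, that function needs more than one LUT, which is where the decomposition methods discussed next come in.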

3. Related Work

Methods for solving this problem can be found in a large number of scientific papers and monographs [10,21,25,27,28,29,30,31,32,33,34]. In the case of LUT-based devices, the occupied chip area is estimated by the required number of LUTs (LUT count) [10]. To diminish the LUT count, three groups of methods are used: (1) functional decomposition (FD); (2) optimal state assignment; (3) structural decomposition (SD). Methods from different groups can be applied simultaneously [30].
In the case of decomposition, Functions (2) and (3) are represented by systems of partial functions [29,35]. Each partial Boolean function has no more than $S_L$ arguments. For this reason, each PBF is represented by a single-LUT circuit. Both FD and SD lead to multi-level FSM circuits. However, these circuits differ in the nature of their interconnections [11]. In the case of FD, the resulting circuit has an irregular interconnect structure in which the same variables $x_l \in X$ and $T_r \in T$ appear at different logical levels of the circuit. In the case of SD, an FSM circuit includes from two to four large logic blocks [30]. These blocks have unique systems of inputs and outputs. For this reason, the SD-based FSM circuits have regular systems of interconnections. As shown in [11], SD-based circuits have better characteristics compared to equivalent FD-based circuits. In this article, we discuss a way to improve one SD-based method.
In the case of LUT-based FSMs, a state assignment is optimal if it excludes the maximum possible number of literals from the sum-of-products of Functions (2) and (3) [36]. For a single-level implementation of an FSM circuit to be possible, enough literals must be excluded for condition (4) to be satisfied for each function $f_j \in D \cup Y$. However, this result is possible only for sufficiently simple FSMs [34]. Therefore, in most cases, state encoding methods play an auxiliary role. If condition (4) is not satisfied after the state assignment, then it is necessary to use other optimization methods.
Very often, the methods of SD are based on finding a partition of the state set $A$ into classes of compatible states. One such method is twofold state assignment (TSA) [12,37]. The method is based on constructing a partition $\pi_A = \{A^1, \ldots, A^I\}$ of the set $A$. Each class $A^i \in \pi_A$ determines sets $X^i$, $Y^i$, $D^i$. The set $X^i \subseteq X$ includes $L_i$ FSM inputs causing transitions from states $a_m \in A^i$. The set $Y^i \subseteq Y$ consists of FSM outputs produced during the transitions from states $a_m \in A^i$. The set $D^i \subseteq D$ includes input memory functions determining MBCs of transition states.
There are $M_i$ states in each class $A^i \in \pi_A$. Inside each class, these states are encoded by partial maximum binary codes $C(a_m)$ having $R_i$ bits:
$$R_i = \lceil \log_2 (M_i + 1) \rceil. \tag{5}$$
To encode states $a_m \in A^i$, the variables $v_r \in V^i$ are used. The sets $V^1, \ldots, V^I$ form a set $V$ having $R_A$ elements:
$$R_A = R_1 + \ldots + R_I. \tag{6}$$
A state $a_m \in A$ is compatible with states $a_s \in A^i$ if including this state into $A^i$ does not violate the following condition:
$$R_i + L_i \le S_L \quad (i \in \{1, \ldots, I\}). \tag{7}$$
To optimize the FSM logic circuit, it is necessary to minimize the value of $I$. This approach leads to the so-called $P_T$ Mealy FSM (Figure 2).
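Formula (5) and condition (7) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names are hypothetical, and the per-state input sets in the example are invented for demonstration (they are not taken from Table 1). The sketch only checks whether a class fits into single-LUT partial functions; it does not reproduce the partitioning procedure of [12].

```python
def partial_code_bits(class_size: int) -> int:
    """Formula (5): R_i = ceil(log2(M_i + 1)).
    For M_i >= 1 this equals the bit length of M_i."""
    if class_size < 1:
        raise ValueError("a class must contain at least one state")
    return class_size.bit_length()


def class_fits(inputs_by_state: dict, lut_inputs: int) -> bool:
    """Condition (7): R_i + L_i <= S_L, where L_i is the number of
    distinct FSM inputs used by the states of the class."""
    r_i = partial_code_bits(len(inputs_by_state))
    l_i = len(set().union(*inputs_by_state.values()))
    return r_i + l_i <= lut_inputs


if __name__ == "__main__":
    # Hypothetical class: two states that together use three inputs.
    candidate = {"a2": {"x2", "x5"}, "a5": {"x5", "x6"}}
    print(class_fits(candidate, lut_inputs=5))   # R_i = 2, L_i = 3

    # Adding a state that brings a fourth input violates condition (7).
    candidate["a9"] = {"x7"}
    print(class_fits(candidate, lut_inputs=5))   # R_i = 2, L_i = 4
```

A state is compatible with a class exactly when `class_fits` still holds after the state is added.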
In $P_T$ Mealy FSMs, each state $a_m \in A$ has two codes. These codes are: (1) the maximum binary state code $K(a_m)$ and (2) the partial state code $C(a_m)$ determining a particular state as an element of a particular class. A block LUTer$i$ corresponds to the class $A^i \in \pi_A$. This block generates the following systems of PBFs:
$$D^i = D^i(V^i, X^i); \tag{8}$$
$$Y^i = Y^i(V^i, X^i). \tag{9}$$
The block LUTerTY creates the resulting values of functions $f_j \in D \cup Y$. Each element of LUTerTY implements one function of the following SBFs:
$$D_r = \bigvee_{i=1}^{I} D_r^i \quad (r \in \{1, \ldots, R\}); \tag{10}$$
$$y_n = \bigvee_{i=1}^{I} y_n^i \quad (n \in \{1, \ldots, N\}). \tag{11}$$
The block LUTerTY contains the flip-flops of RG. The pulses $Start$ and $Clock$ enter this block to control the operation of RG.
As follows from (8) and (9), the partial functions depend on state variables $v_r \in V^i$. These state variables are produced by the transformation of the state variables $T_r \in T$. To transform the codes $K(a_m)$, the block LUTerV generates the following SBF:
$$V = V(T). \tag{12}$$
As follows from [37], the circuits of $P_T$ FSMs require fewer LUTs than the circuits of equivalent P Mealy FSMs. If the condition
$$I \le S_L \tag{13}$$
holds, then the circuits of $P_T$ FSMs have exactly three levels of LUTs. As a rule [37], they have higher maximum operating frequencies than the circuits of equivalent P Mealy FSMs.
We call an FSM core a block generating partial functions that depend on state variables. In $P_T$ FSMs, there is the CoreV consisting of blocks LUTer1–LUTer$I$. All other functions are generated by a function assembly block (FAB). In $P_T$ FSMs, the FAB consists of blocks LUTerTY and LUTerV. Using this terminology, we can represent the structural diagram of the $P_T$ FSM in its generalized form (Figure 3).
As follows from Figure 3, all PBFs depend on both inputs $x_l \in X$ and state variables $v_r \in V$. So, the transformation of $K(a_m)$ into $C(a_m)$ is executed for all states $a_m \in A$. However, if condition (4) is satisfied for some state $a_m \in A$, then there is no need for the code transformation noted above. If we reduce the number of states whose codes are transformed, then it is possible to reduce both the number of classes ($I$) and the value of the parameter $R_A$. This is the approach proposed in our current paper.

4. Main Idea of the Proposed Method

The transitions from a state $a_m \in A$ depend on FSM inputs from a set $X(a_m) \subseteq X$. This set includes $L(a_m) \le L$ elements. Let the following condition hold:
$$L(a_m) + R_{MB} \le S_L. \tag{14}$$
If condition (14) holds, then each PBF generated during the transitions from $a_m \in A$ is represented by a single-LUT circuit. So, there is no need for partial codes for such states $a_m \in A$. The partial codes $C(a_m)$ should be generated only for states for which condition (14) is violated. This conclusion is the basis for the method proposed in this article.
We propose to divide the set $A$ into sets $A_{MB}$ and $A_{PC}$. If condition (14) holds for a state $a_m \in A$, then this state is included into the set $A_{MB}$. Otherwise, this state is included into the set $A_{PC}$. The states $a_m \in A_{MB}$ form a core denoted as CoreT, whereas the states $a_m \in A_{PC}$ form a core denoted as CoreV. The transformation of state codes is executed only for the states $a_m \in A_{PC}$.
The CoreT determines the sets $X_T \subseteq X$, $Y_T \cup Y^0 \subseteq Y$, and $D^0 \subseteq D$. The first set includes FSM inputs determining the transitions from the states $a_m \in A_{MB}$. The second set consists of FSM outputs produced during the transitions from these states. The outputs from the set $Y_T$ are produced only during transitions from the states of the CoreT. The outputs from the set $Y^0$ are shared between both cores. The third set includes IMFs generated during the transitions from the states $a_m \in A_{MB}$. The following SBFs determine the CoreT:
$$D^0 = D^0(T, X_T); \tag{15}$$
$$Y^0 = Y^0(T, X_T); \tag{16}$$
$$Y_T = Y_T(T, X_T). \tag{17}$$
The CoreV determines the sets $X_V \subseteq X$ and $Y_V \subseteq Y$. The first set includes FSM inputs determining the transitions from the states $a_m \in A_{PC}$. The second set consists of FSM outputs produced during the transitions from these states. The following SBFs determine the CoreV:
$$D_V^k = D_V^k(V^k, X_V^k); \tag{18}$$
$$Y_V^k = Y_V^k(V^k, X_V^k). \tag{19}$$
The CoreV is based on the partition $\pi_V = \{A^1, \ldots, A^K\}$ of the set $A_{PC}$. This partition is constructed in the same way as the partition $\pi_A$. Each class of the partition $\pi_V$ determines the sets $X_V^k$, $Y_V^k$, $V^k$ and $D_V^k$. These sets are similar to the corresponding sets of partial functions considered for the partition $\pi_A$. The circuit of CoreV is determined by SBFs similar to SBFs (8) and (9). These SBFs are the following:
$$D = D(T, X_V); \tag{20}$$
$$Y_V = Y_V(T, X_V). \tag{21}$$
To generate the outputs $y_n \in Y_V$ and state variables, it is necessary to use the FAB. We propose to combine the blocks FAB, CoreV, and CoreT. The proposed connection of blocks leads to a double-core FSM $P_{2C}$. Its generalized structural diagram is shown in Figure 4.
In the Mealy FSM $P_{2C}$, the block CoreT implements SBFs (15)–(17). The block CoreV generates functions from SBFs (18) and (19). The FAB includes two blocks, LUTerTY and LUTerV. The block LUTerTY transforms functions (15) and (16), (18) and (19) into the resulting values of functions (20) and (21). The block LUTerV implements SBF (12).
There are $K$ classes in the partition $\pi_V$. The following condition holds:
$$K \le I. \tag{22}$$
Replacing the subscript $i$ by the subscript $k$ turns Formula (5) into a formula determining the number of state variables in the codes $C(a_m)$ for states $a_m \in A^k$. Having these values allows obtaining the total number of variables $v_r \in V$:
$$R_V = R_1 + R_2 + \ldots + R_K. \tag{23}$$
Obviously, the following condition holds:
$$R_V \le R_A. \tag{24}$$
Due to the validity of condition (22), the following is true: (1) the circuit of CoreV for FSM $P_{2C}$ should include fewer LUTs than this circuit for the equivalent FSM $P_T$ and (2) the circuit of FSM $P_{2C}$ should include no more levels of logic than the circuit of the equivalent FSM $P_T$. Both $P_T$ and $P_{2C}$ FSMs incorporate the block LUTerV executing the transformation of state codes. Obviously, the fewer LUTs the circuit of this block includes, the less power it consumes. As follows from the validity of condition (24), the circuit of LUTerV for FSM $P_{2C}$ should include fewer LUTs than this circuit for the equivalent FSM $P_T$. Therefore, the block LUTerV of the $P_{2C}$ FSM has lower static power consumption than this block of the equivalent FSM $P_T$. Since some PBFs are generated by the block CoreT, in some cycles of FSM operation the LUTs of the block LUTerV do not change their states. So, in these cycles, the block LUTerV has dynamic power consumption close to zero. This analysis suggests that the block LUTerV of the $P_{2C}$ FSM consumes less power than that block of an equivalent FSM $P_T$.
So, we assume that the circuits of Mealy FSMs $P_{2C}$ will have fewer LUTs and almost the same or even better performance compared to circuits of equivalent FSMs $P_T$. We can also argue that $P_{2C}$ FSMs require less energy for the code transformation than equivalent FSMs $P_T$. However, only experimental studies can show the real energy budgets of equivalent $P_T$ and $P_{2C}$ FSMs.
Using the above information, we propose a method for the synthesis of LUT-based $P_{2C}$ Mealy FSMs. As the initial form of FSM representation, we use state transition graphs (STGs) [9]. Next, we transform this STG into an equivalent state transition table (STT) [9]. To implement an FSM circuit, we use LUTs having $S_L$ inputs. The proposed method includes the following steps:
  • Transforming the initial STG into the STT of the P Mealy FSM.
  • Preliminary construction of the sets $A_{MB}$ and $A_{PC}$.
  • Preliminary construction of the partition $\pi_V$ of the set $A_{PC}$.
  • Redistribution of states between the sets $A_{MB}$, $A_{PC}$ and $\pi_V$.
  • Encoding of FSM states by maximum binary codes $K(a_m)$.
  • Creating the table of the block CoreT and SBFs (15)–(17).
  • Encoding states $a_m \in A^k$ by partial state codes $C(a_m)$.
  • Creating tables of the blocks of CoreV and SBFs (18) and (19).
  • Creating the table of LUTerTY and SBFs (20) and (21).
  • Creating the table of LUTerV and SBF (12).
  • Implementing the $P_{2C}$ Mealy FSM circuit using internal resources of a chip.
We use the symbol $P_{2C}(S)$ to show that the model of the $P_{2C}$ FSM is used to implement the logic circuit of some FSM $S$. In the next section, we discuss an example of synthesis of a $P_{2C}$ Mealy FSM, where we explain how each step is executed.

5. Example of Synthesis

We discuss the synthesis of the FSM $P_{2C}(S_1)$ using LUTs with $S_L = 5$. The FSM $S_1$ is represented by the STG shown in Figure 5.
Each node of an STG corresponds to an FSM state. Each arc of an STG corresponds to an interstate transition [9]. There are $H$ arcs in an STG. The $h$-th arc is marked by a pair <input signal $X_h$, collection of outputs $Y_h$>. An input signal $X_h$ is a conjunction of FSM inputs $x_l \in X$ determining the $h$-th interstate transition. A collection of outputs $Y_h \subseteq Y$ includes the FSM outputs $y_n \in Y$ generated during the $h$-th interstate transition.
So, the FSM $S_1$ is characterised by the following sets: $A = \{a_1, \ldots, a_9\}$, $X = \{x_1, \ldots, x_7\}$ and $Y = \{y_1, \ldots, y_8\}$. This gives the following values: $M = 9$, $L = 7$, and $N = 8$. As follows from Figure 5, $H = 21$.
Step 1. This step is executed in the trivial way [6]. Each arc of the STG corresponds to a single line of the corresponding STT. So, this table has the columns $a_m$, $a_s$, $X_h$, $Y_h$, $h$. The state $a_m$ corresponds to the vertex from which the $h$-th arc comes out (this is the current state); the state $a_s$ corresponds to the vertex into which this arc enters (this is the state of transition). The column $X_h$ includes the input signal written above the $h$-th arc. The column $Y_h$ includes the collection of outputs written above the $h$-th arc. Using this approach transforms the STG (Figure 5) into the equivalent STT (Table 1).
Step 2. To divide the set $A$ into sets $A_{MB}$ and $A_{PC}$, it is necessary to find the values of $L(a_m)$ for states $a_m \in A$. The following values can be found from Table 1: $L(a_4) = 0$; $L(a_m) = 1$ for states $a_1, a_3, a_6, a_7$; $L(a_m) = 2$ for states $a_2, a_5, a_8, a_9$. Here $S_L = 5$ and, by (1), $R_{MB} = 4$. As follows from (14), there are the sets $A_{MB} = \{a_1, a_3, a_4, a_6, a_7\}$ and $A_{PC} = \{a_2, a_5, a_8, a_9\}$. As we show in the next section, some elements of the set $A_{MB}$ can be transferred to the set $A_{PC}$. Thus, these sets do not yet have a final form. Now, we can find the sets $X_T$ and $X_V$. The set $X_T$ includes inputs determining transitions from states $a_m \in A_{MB}$; the set $X_V$ includes inputs determining transitions from states $a_m \in A_{PC}$. In the discussed case, there are the following sets: $X_T = \{x_1, x_2, x_3\}$ and $X_V = \{x_2, x_3, x_5, x_6, x_7\}$.
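The preliminary split of Step 2 can be reproduced with a short Python sketch. The values of $L(a_m)$ are the ones quoted above from Table 1; the dictionary layout and the function name `split_states` are assumptions made for this illustration only.

```python
# Number of inputs L(a_m) used by the transitions from each state (from Table 1).
INPUTS_PER_STATE = {
    "a1": 1, "a2": 2, "a3": 1, "a4": 0, "a5": 2,
    "a6": 1, "a7": 1, "a8": 2, "a9": 2,
}


def split_states(inputs_per_state: dict, lut_inputs: int):
    """Apply condition (14): L(a_m) + R_MB <= S_L.
    States satisfying it go to A_MB, all others go to A_PC."""
    # Formula (1): R_MB = ceil(log2(M)), computed with integers.
    r_mb = (len(inputs_per_state) - 1).bit_length()
    a_mb = [s for s, l in inputs_per_state.items() if l + r_mb <= lut_inputs]
    a_pc = [s for s in inputs_per_state if s not in a_mb]
    return a_mb, a_pc


if __name__ == "__main__":
    a_mb, a_pc = split_states(INPUTS_PER_STATE, lut_inputs=5)
    print("A_MB =", a_mb)
    print("A_PC =", a_pc)
```

With $S_L = 5$ and $R_{MB} = 4$, only states with $L(a_m) \le 1$ satisfy condition (14), which yields the same preliminary sets as in the text.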
Step 3. Using the approach of [12] gives the partition $\pi_V = \{A^1, A^2\}$ of the set $A_{PC}$. The classes of this partition are the following: $A^1 = \{a_2, a_5\}$ and $A^2 = \{a_8, a_9\}$. This gives the following values of $M_k$: $M_1 = M_2 = 2$. Using (5) gives $R_1 = R_2 = 2$ and $R_V = 4$. Since the set $A_{PC}$ can be changed, the partition $\pi_V$ is also preliminary.
Step 4. We discuss this step in Section 6. Now, we only show the outcome of this step. It is the following: $A_{MB} = \{a_1, a_3, a_4\}$ and $A_{PC} = \{a_2, a_5, a_6, a_7, a_8, a_9\}$. Now, the classes of $\pi_V = \{A^1, A^2\}$ are the following: $A^1 = \{a_2, a_5, a_7\}$ and $A^2 = \{a_6, a_8, a_9\}$. This gives the following values of $M_k$: $M_1 = M_2 = 3$. Using (5) gives $R_1 = R_2 = 2$ and $R_V = 4$. So, the total number of state variables $v_r \in V$ is the same before and after refining the sets $A_{MB}$ and $A_{PC}$, and there is the set $V = \{v_1, \ldots, v_4\}$. However, now there are fewer states in the set $A_{MB}$. This means that the number of LUTs in the circuit of CoreT should be smaller than the number corresponding to the set $A_{MB}$ obtained in Step 2.
Step 5. There are $M = 9$ states. Using (1) gives $R_{MB} = 4$. So, there are the following sets: $T = \{T_1, \ldots, T_4\}$ and $D = \{D_1, \ldots, D_4\}$. To minimize the sum-of-products (SOPs) of functions (12), it is necessary to place the states from the same class into the minimum possible number of generalized cubes of the $R_{MB}$-dimensional Boolean space [9]. Let us encode the states in the way shown in Figure 6.
As follows from Figure 6, the states $a_m \in A_{MB}$ are placed into the cube 00xx. This allows optimizing the SOPs of functions (15)–(17). The states $a_m \in A^1$ are placed in the cube x100, and the states $a_m \in A^2$ are placed in the cube 1x00. This gives the opportunity to optimize the SOPs of functions (12).
Step 6. The table of CoreT is constructed using the lines 1–2 and 6–8 of Table 1. Three more columns are added in this table: $K(a_m)$, $K(a_s)$ and $D_h^0$. The first and second additional columns include the codes of the current and next states, respectively. The column $D_h^0$ includes the IMFs equal to 1 to load the code $K(a_s)$ into the RG. We changed the names of the columns $X_h$ and $Y_h$ compared to Table 1. Now we use the notation $X_h^0$ and $Y_h^0$. The CoreT is represented by Table 2.
Using Table 2 gives the following SBFs:
$$\begin{aligned}
D_1^0 &= F_3^0 = \bar{T}_1 \bar{T}_2 T_4 x_1; \\
D_2^0 &= F_1^0 = \bar{T}_1 \bar{T}_2 \bar{T}_3 \bar{T}_4 x_1; \\
D_4^0 &= F_2^0 = \bar{T}_1 \bar{T}_2 \bar{T}_3 \bar{T}_4 \bar{x}_1; \\
y_1^0 &= \bar{T}_1 \bar{T}_2 \bar{T}_3 \bar{T}_4 x_1 \vee \bar{T}_1 \bar{T}_2 T_3 \bar{T}_4; \\
y_4^0 &= [F_2^0 \vee F_4^0] \vee F_5^0 = \bar{T}_1 \bar{T}_2 \bar{T}_3 \bar{x}_1 \vee \bar{T}_1 \bar{T}_2 T_3; \\
y_5^0 &= [F_1^0 \vee F_3^0] = \bar{T}_1 \bar{T}_2 \bar{T}_3 x_1; \\
y_8^0 &= \bar{T}_1 \bar{T}_2 T_4 x_1.
\end{aligned} \tag{25}$$
This system is used to create the circuit of CoreT. Let us point out that the function $y_4$ is generated only by a LUT of CoreT. This gives $Y_T = \{y_4\}$. Furthermore, the following sets can be derived from Table 2: $X^0 = \{x_1\}$, $Y^0 = \{y_1, y_5, y_8\}$ and $D^0 = \{D_1, D_2, D_4\}$.
Step 7. To encode the states $a_m \in A^1$, the variables $v_1, v_2 \in V$ are used. To encode the states $a_m \in A^2$, the variables $v_3, v_4 \in V$ are used. We use the code 00xx to show that a particular state does not belong to the class $A^1$. The code xx00 shows that a particular state does not belong to the class $A^2$. The outcome of the state assignment is shown in Figure 7.
The following partial codes can be found from the Karnaugh map (Figure 7): $C(a_2) = C(a_6) = 01$, $C(a_5) = C(a_8) = 10$, and $C(a_7) = C(a_9) = 11$. These codes are used in the LUTs of CoreV.
Step 8. There are two blocks of LUTs in the CoreV. The block LUTer1 implements SBFs for the class $A^1$; the block LUTer2 implements SBFs for the class $A^2$. The table of LUTer1 is constructed using the lines 3–5, 9–11 and 14–15 of Table 1. This is Table 3. The table of LUTer2 is constructed using the lines 12–13 and 16–21 of Table 1. This is Table 4.
Both tables use partial state codes $C(a_m)$ for current states and the MBCs $K(a_s)$ for states of transition. The following sets can be found from Table 3: $X^1 = \{x_2, x_5, x_6\}$, $Y^1 = \{y_1, y_2, y_5, y_6, y_8\}$ and $D^1 = D$. The following sets can be found from Table 4: $X^2 = \{x_3, x_5, x_7\}$, $Y^2 = \{y_1, y_3, y_5, y_7, y_8\}$ and $D^2 = D$.
The SBFs (18) and (19) are constructed in the same way as SBFs (15)–(17). For example, the following SOPs can be obtained for functions $D_1^1$ (Table 3) and $D_1^2$ (Table 4):
$$\begin{aligned}
D_1^1 &= F_2^1 \vee F_8^1 = \bar{v}_1 v_2 \bar{x}_2 x_5 \vee v_1 v_2 \bar{x}_2; \\
D_1^2 &= F_5^2 \vee F_8^2 = v_3 \bar{v}_4 \bar{x}_3 \bar{x}_7 \vee v_3 v_4 \bar{x}_5 \bar{x}_7.
\end{aligned} \tag{26}$$
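To make the single-LUT nature of a partial function concrete, the following Python sketch evaluates the function $D_1^1$ from (26) and enumerates its truth table, which is what a 4-input LUT would store. The function names are hypothetical; only the formula and the partial codes $C(a_2) = 01$, $C(a_5) = 10$, $C(a_7) = 11$ come from the text.

```python
from itertools import product


def d1_luter1(v1: int, v2: int, x2: int, x5: int) -> int:
    """D_1^1 = (not v1)(v2)(not x2)(x5)  OR  (v1)(v2)(not x2), see (26)."""
    term_f2 = (not v1) and v2 and (not x2) and x5   # term F_2^1, state a2 (code 01)
    term_f8 = v1 and v2 and (not x2)                # term F_8^1, state a7 (code 11)
    return int(bool(term_f2 or term_f8))


def lut_contents(func, num_inputs: int) -> list:
    """Truth table of a function, listed in the order of product((0, 1), ...)."""
    return [func(*bits) for bits in product((0, 1), repeat=num_inputs)]


if __name__ == "__main__":
    table = lut_contents(d1_luter1, 4)
    print(table)
    # Current state a2 (v1 v2 = 01) with x2 = 0 and x5 = 1 sets D_1.
    print(d1_luter1(0, 1, 0, 1))
    # Current state a5 (v1 v2 = 10) never sets D_1 through this LUT.
    print(d1_luter1(1, 0, 0, 1))
```

The function has four arguments, so it fits into one LUT with $S_L = 5$ inputs.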
Step 9. There are the following columns in the table of LUTerTY: $f_j$ (a function generated by LUTerTY), CoreT, CoreV. If a function $f_j \in D \cup Y$ is generated by a LUT of CoreT, then there is 1 in the intersection of the line with this function and the column of the corresponding core. Otherwise, this intersection is marked by 0. There are $K$ sub-columns in the column CoreV. If a function $f_j \in D \cup Y$ is generated by LUTer$k$ of CoreV, then there is 1 in the intersection of the line with this function and the sub-column $k$. In the discussed case, the block LUTerTY is represented by Table 5.
To fill the column CoreT, the data from Table 2 are used. To fill the sub-column 1, we use Table 3. Table 4 is the base for filling the sub-column 2. The connection between Tables 2–4 and Table 5 should be transparent.
Using Table 5, we can construct the following SBFs:
$$\begin{aligned}
D_1 &= D_1^0 \vee D_1^1 \vee D_1^2; & D_2 &= D_2^0 \vee D_2^1 \vee D_2^2; \\
D_3 &= D_3^1 \vee D_3^2; & D_4 &= D_4^0 \vee D_4^1 \vee D_4^2; \\
y_1 &= y_1^0 \vee y_1^1 \vee y_1^2; & y_2 &= y_2^1; \\
y_3 &= y_3^2; & y_4 &= y_4^0; \\
y_5 &= y_5^0 \vee y_5^1 \vee y_5^2; & y_6 &= y_6^1; \\
y_7 &= y_7^2; & y_8 &= y_8^0 \vee y_8^1 \vee y_8^2.
\end{aligned} \tag{27}$$
Each function $f_j \in D \cup Y$ is represented by a disjunction of its partial components. The principle of constructing each function of (27) is clear from the comparison of these functions with the contents of Table 5.
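The assembly performed by LUTerTY can be sketched as a simple OR over the partial components listed in (27). In the Python sketch below, the mapping `SOURCES` is read off system (27); the labels `core_t`, `luter1`, `luter2` and the function name `assemble` are hypothetical names introduced only for this illustration.

```python
ALL = ("core_t", "luter1", "luter2")

# Which blocks produce a partial component of each function, as in SBF (27).
SOURCES = {
    "D1": ALL, "D2": ALL, "D3": ("luter1", "luter2"), "D4": ALL,
    "y1": ALL, "y2": ("luter1",), "y3": ("luter2",), "y4": ("core_t",),
    "y5": ALL, "y6": ("luter1",), "y7": ("luter2",), "y8": ALL,
}


def assemble(partials: dict) -> dict:
    """OR together the partial components of every function.
    `partials` maps (function, block) to 0 or 1; missing entries count as 0."""
    return {
        func: int(any(partials.get((func, block), 0) for block in blocks))
        for func, blocks in SOURCES.items()
    }


if __name__ == "__main__":
    # Largest number of partial components feeding one LUTerTY element.
    print(max(len(blocks) for blocks in SOURCES.values()))
    # Example cycle: only LUTer2 asserts its components of D3 and y7.
    result = assemble({("D3", "luter2"): 1, ("y7", "luter2"): 1})
    print(result["D3"], result["y7"], result["D1"])
```

No function in (27) has more than three partial components, so every element of LUTerTY fits into a single LUT with $S_L = 5$ inputs.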
Step 10. To create the table of L U T e r V , we should use the full codes K ( a m ) and partial state codes C ( a m ) . So, there are the following columns in this table: a m , K ( a m ) , C ( a m ) , V m . Inside this table, we use only states a m A P C . In the discussed case, there are six lines in the table of L U T e r V (Table 6).
To fill the column K ( a m ) , we use the state codes from Figure 6. The column C ( a m ) is filled using the partial state codes from Figure 7.
To optimize the SBF (12), we represent its functions by the Karnaugh map (Figure 8). In this map, we treat the codes of states a m A M B as the “don’t care” input assignment.
Using the Karnaugh map (Figure 8) gives the following SBF:
v 1 = T 1 ¯ T 4 T 1 ¯ T 3 ; v 2 = T 1 ¯ T 4 ¯ ; v 3 = T 1 T 4 T 1 T 3 ; v 4 = T 1 T 4 ¯ .
In the worst case, each function v r V is represented by a SOP having R M B literals. So, the maximum number of literals is calculated as the product of R V by R M B . In the discussed case, this number is equal to 16. If we analyze the SBF (28), we find that it includes 10 literals. So, using our approach allows reducing the number of literals by a factor of 1.6. Each literal corresponds to an interconnection between outputs of RG and inputs of LUTs creating the circuit of L U T e r V . It is known that minimizing the number of interconnections allows reducing the value of power consumption [26,38].
Step 11. To implement the circuit of a $P_{2C}$ Mealy FSM, a CAD tool is required, for example, Vivado by Xilinx [39]. This package solves all problems connected with the technology mapping step [40,41]. In Section 7, we use Vivado to compare the proposed method with some known FSM design methods.

6. Algorithm of State Redistribution

If a class $A_k \in \pi_V$ includes $M_k$ states, then $R_k$ state variables are necessary to encode the states $a_m \in A_k$ by the partial state codes $C(a_m)$. The value of $R_k$ is determined by (5). We denote by $MNP_k$ the maximum possible number of states in a class $A_k \in \pi_V$. This value is determined as
$$
MNP_k = 2^{R_k} - 1. \tag{29}
$$
Our research shows that a class $A_k \in \pi_V$ may include fewer states than $MNP_k$. For example, we have the following classes for FSM $S_1$: $A_1 = \{a_2, a_5\}$ and $A_2 = \{a_8, a_9\}$. Using (5) gives $R_1 = R_2 = 2$. Using (29) gives $MNP_1 = MNP_2 = 3$. So, both classes might be supplemented by states from the set $A_{MB} = \{a_1, a_3, a_4, a_6, a_7\}$. One state can be added to each of the classes $A_k \in \pi_V$.
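The arithmetic behind this example can be written out explicitly. The symbol $\Delta_k$ for the number of free places in a class is introduced here only for illustration and is not part of the original notation; the remark that one code combination per class is presumably reserved for states outside the class is our reading of the subtraction of one in (29).

```latex
\begin{aligned}
MNP_1 &= MNP_2 = 2^{2} - 1 = 3, \\
\Delta_k &= MNP_k - M_k = 3 - 2 = 1, \qquad k \in \{1, 2\}.
\end{aligned}
```

With $R_k = 2$ there are four code combinations per class; three of them are available for states, which leaves exactly one free place in each of the classes $A_1$ and $A_2$.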
So, it may be necessary to redistribute states between the sets $A_{MB}$ and $A_{PC}$. The new elements of $A_{PC}$ should be added to some classes $A_k \in \pi_V$. It is expedient to transfer states in such a way as to reduce the number of states in the set $A_{MB}$ as much as possible.
We propose to use an estimate $I(a_m)$, which we call the influence of the state $a_m \in A_{MB}$ on the sets $X_T$ and $X_V$. In the discussed case, these sets are the following: $X_T = \{x_1, x_2, x_3\}$ and $X_V = \{x_2, x_3, x_5, x_6, x_7\}$.
The best candidate for transfer to the class $A_k \in \pi_V$ is the state $a_m \in A_{MB}$ that minimizes the number of inputs in the set $X_T$ and minimally increases this number in the set $X_k$. The influence of a state $a_m \in A_{MB}$ on the set $X_T$ is determined as
$$
I_T(a_m) = |X(a_m) \setminus X_T|. \tag{30}
$$
The influence of a state $a_m \in A_{MB}$ on the set $X_k$ is determined as
$$
I_V(a_m) = |X(a_m) \setminus X_k|. \tag{31}
$$
So, the overall influence of the state $a_m \in A_{MB}$ is defined as
$$
I(a_m) = I_T(a_m) - I_V(a_m). \tag{32}
$$
Obviously, the states with the greatest influence should be transferred. This is the basis of our proposed redistribution algorithm (Figure 9).
During the redistribution, a queue $\gamma_k$ is formed from the states $a_m \in A_{MB}$. This queue is based on the following rule: the states are placed in order of decreasing $I(a_m)$. If the influence is the same for states $a_m, a_s \in A_{MB}$ ($I(a_m) = I(a_s)$), then the state with the lower subscript precedes the state with the higher subscript in the queue. A state can be included into a class $A_k \in \pi_V$ if its inclusion does not violate the condition (4). In our algorithm, we use the abbreviation CBI (can be included). For each class $A_k \in \pi_V$, the queue $\gamma_k$ includes $J_k$ elements. This preliminary information is sufficient to proceed to the description of the proposed algorithm.
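The ordering rule can be illustrated with a short sketch. It is not the authors' implementation: the helper names are hypothetical, the sets $X(a_m)$ are read from Table 1, and one reading of the influence is assumed, namely that the set $X_T$ is taken over the states that would remain in $A_{MB}$ after the candidate is removed, and that $I_V$ is subtracted from $I_T$. Under this assumption the sketch yields the queues reported for FSM $S_1$.

```python
# Inputs X(a_m) determining the transitions from each state (read from Table 1).
X = {
    "a1": {"x1"}, "a2": {"x2", "x5"}, "a3": {"x1"}, "a4": set(),
    "a5": {"x5", "x6"}, "a6": {"x3"}, "a7": {"x2"},
    "a8": {"x3", "x7"}, "a9": {"x5", "x7"},
}


def inputs_of(states):
    """Union of the sets X(a_m) over the given states."""
    return set().union(*(X[s] for s in states))


def influence(state, a_mb, x_k):
    """Overall influence I(a_m) of moving `state` from A_MB into a class with inputs x_k."""
    # ASSUMPTION: X_T is computed over the states remaining in A_MB.
    x_t_rest = inputs_of(a_mb - {state})
    i_t = len(X[state] - x_t_rest)  # inputs that would disappear from X_T
    i_v = len(X[state] - x_k)       # inputs that would be added to X_k
    return i_t - i_v


def make_queue(a_mb, x_k):
    """States ordered by decreasing influence; ties go to the lower subscript."""
    return sorted(a_mb, key=lambda s: (-influence(s, a_mb, x_k), int(s[1:])))


# Step k = 1: class A_1 = {a2, a5}.
gamma_1 = make_queue({"a1", "a3", "a4", "a6", "a7"}, inputs_of({"a2", "a5"}))
# Step k = 2: class A_2 = {a8, a9}, after a7 has left A_MB.
gamma_2 = make_queue({"a1", "a3", "a4", "a6"}, inputs_of({"a8", "a9"}))
print(gamma_1)
print(gamma_2)
```

Sorting on the pair (negated influence, subscript) encodes both parts of the rule in a single key, so no separate tie-breaking pass is needed.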
We start the redistribution by testing the set $A_{MB}$ (Block 1). If this set is empty (output 1), then the redistribution cannot be executed. If there are some states in the set $A_{MB}$ (output 0), then the redistribution process begins. The analysis starts with the class $A_1 \in \pi_V$ (Block 2). If the analyzed class includes the maximum number of states (output 1 from Block 3), then it is necessary to proceed to the analysis of the next class (go to Block 15). The algorithm is terminated when all classes are analyzed (output 1 of Block 16). Otherwise, the next class is analyzed (go from Block 16 to Block 3).
If an additional state can be included in the class $A_k \in \pi_V$ (output 0 from Block 3), then a queue $\gamma_k$ having $J_k$ elements is created (Block 4). Next, the sequential analysis of the states from the queue $\gamma_k$ is performed. The analysis starts from the first element of the queue (Block 5).
The j-th element is taken from the queue (Block 6). If it cannot be included into the class $A_k \in \pi_V$ (output 0 from Block 7), then the next element of the queue should be analyzed (go to Block 13). If all elements are analyzed (output 1 of Block 14), then it is necessary to analyze the class $A_{k+1} \in \pi_V$ (go to Block 15). Otherwise (output 0 of Block 14), the next element of the queue is analyzed (go to Block 6).
If the j-th element can be included into the class $A_k \in \pi_V$ (output 1 from Block 7), then the following actions are executed (Block 8): (1) the state $a_j \in A_{MB}$ is included into the class $A_k \in \pi_V$; (2) the state $a_j$ is excluded from the set $A_{MB}$. If now (after excluding the state $a_j$) the set $A_{MB}$ becomes empty (output 1 of Block 9), the redistribution process is terminated (go to End). Otherwise (output 0 of Block 9), the next element of the queue should be analyzed (go to Block 10). If all elements are already analyzed (output 1 of Block 11), then it is necessary to analyze the class $A_{k+1} \in \pi_V$ (go to Block 15). Otherwise (output 0 of Block 11), the next element of the queue should be analyzed. This can be done if the class $A_k \in \pi_V$ does not contain the maximum possible number of elements. This is checked in Block 12. If the class is full (output 1 of Block 12), then it is necessary to analyze the class $A_{k+1} \in \pi_V$ (go to Block 15). Otherwise (output 0 of Block 12), the next element of the queue is analyzed (go to Block 6).
There are two conditions for terminating the redistribution process. First, there are no elements in the set $A_{MB}$ (outputs 1 from Blocks 1 and 9). Second, all classes $A_k \in \pi_V$ have been tested and, where possible, supplemented by states $a_m \in A_{MB}$ (output 1 from Block 16).
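The control flow described above can be condensed into a compact sketch. It is a simplified illustration rather than the flowchart of Figure 9 itself: the function names are hypothetical, the influence uses the same assumed reading as before ($X_T$ taken over the states remaining in $A_{MB}$), and the CBI test of condition (4) is replaced by a caller-supplied predicate `cbi` that defaults to "always allowed", because condition (4) is defined outside this section.

```python
# Inputs X(a_m) determining the transitions from each state (read from Table 1).
X = {
    "a1": {"x1"}, "a2": {"x2", "x5"}, "a3": {"x1"}, "a4": set(),
    "a5": {"x5", "x6"}, "a6": {"x3"}, "a7": {"x2"},
    "a8": {"x3", "x7"}, "a9": {"x5", "x7"},
}


def inputs_of(states):
    """Union of the sets X(a_m) over the given states."""
    return set().union(*(X[s] for s in states))


def influence(state, a_mb, x_k):
    """I(a_m) = I_T - I_V; ASSUMPTION: X_T is taken over the remaining A_MB states."""
    i_t = len(X[state] - inputs_of(a_mb - {state}))
    i_v = len(X[state] - x_k)
    return i_t - i_v


def redistribute(a_mb, classes, r, cbi=lambda state, cls: True):
    """Move states from A_MB into the classes A_k while places remain.

    r[k] is the number of variables R_k of class k; `cbi` is a placeholder
    for the "can be included" test of condition (4).
    """
    a_mb = set(a_mb)
    classes = [set(c) for c in classes]
    for k, cls in enumerate(classes):
        if not a_mb:                      # nothing left to transfer
            break
        capacity = 2 ** r[k] - 1          # MNP_k from (29)
        if len(cls) >= capacity:          # the class is already full
            continue
        x_k = inputs_of(cls)
        # The queue is formed once per class from the current set A_MB.
        gamma = sorted(a_mb, key=lambda s: (-influence(s, a_mb, x_k), int(s[1:])))
        for state in gamma:
            if not a_mb or len(cls) >= capacity:
                break
            if cbi(state, cls):
                cls.add(state)
                a_mb.remove(state)
    return a_mb, classes


rest, (a_1, a_2) = redistribute(
    {"a1", "a3", "a4", "a6", "a7"}, [{"a2", "a5"}, {"a8", "a9"}], r=[2, 2]
)
print(sorted(rest), sorted(a_1), sorted(a_2))
```

For FSM $S_1$ this sketch moves $a_7$ into $A_1$ and $a_6$ into $A_2$ and leaves $a_1$, $a_3$ and $a_4$ in $A_{MB}$. A real CBI predicate would additionally have to reject states whose inclusion violates condition (4).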
So, the k-th step of the redistribution process starts from creating the current sets $A_{MB}$ and $X^0$. Next, it is necessary to find the values of $I(a_m)$ for the states $a_m \in A_{MB}$ and create the current queue $\gamma_k$. So, there are $K$ columns corresponding to the classes $A_k \in \pi_V$ in the table of redistribution. Each column is divided into the following sub-columns: $A_{MB}$, $I(a_m)$, $\gamma_k$, $j = 1$, $j = 2$, …, $j = J_k$. In this table, the line $a_m$ includes the states $a_m \in A_{MB}$ transferred into the particular class $A_k \in \pi_V$. The lines for these states are marked by ⊕. If a state cannot be included into the class $A_k \in \pi_V$, the corresponding line includes the sign "−". The last line of the table contains the classes $A_k \in \pi_V$. Table 7 shows the redistribution process for FSM $S_1$.
Let us go back to the previous section. After executing step 2, we have the following sets: $A_{MB} = \{a_1, a_3, a_4, a_6, a_7\}$, $A_{PC} = \{a_2, a_5, a_8, a_9\}$, $X_T = \{x_1, x_2, x_3\}$ and $X_V = \{x_2, x_3, x_5, x_6, x_7\}$. After executing step 3, we have the partition $\pi_V = \{A_1, A_2\}$ with the following classes: $A_1 = \{a_2, a_5\}$ and $A_2 = \{a_8, a_9\}$. These classes are characterized by the sets $X_1 = \{x_2, x_5, x_6\}$ and $X_2 = \{x_3, x_5, x_7\}$.
So, for $k = 1$, the column $A_{MB}$ contains the states $a_1, a_3, a_4, a_6, a_7$. For the state $a_1 \in A_{MB}$, we can find the set $X(a_1) = \{x_1\}$. Let us find the value of $I(a_1)$. Using (30) gives the following: $I_T(a_1) = |X(a_1) \setminus X_T| = |\{x_1\} \setminus \{x_1, x_2, x_3\}| = 0$. Using (31) gives $I_V(a_1) = |X(a_1) \setminus X_1| = |\{x_1\} \setminus \{x_2, x_5, x_6\}| = 1$. Using (32) gives $I(a_1) = I_T(a_1) - I_V(a_1) = 0 - 1 = -1$. This value is written at the intersection of the line $a_1$ and the sub-column $I(a_m)$ for $k = 1$. In the same way, the values of $I(a_m)$ for all other states $a_m \in A_{MB}$ are calculated.
Using the values of $I(a_m)$, we get the queue $\gamma_1 = \langle a_7, a_4, a_6, a_1, a_3 \rangle$. The position of each state in this queue is written at the intersection of the line $a_m$ and the sub-column $\gamma_1$. So, we should check the possibility of redistribution starting from the state $a_7$. If we place the state $a_7$ into the class $A_1$, then the values of $L_1$ and $R_1$ do not change. So, the state is included into $A_1$ and excluded from $A_{MB}$. Now, $M_1 = MNP_1 = 3$. So, during the step $j = 2$, no state can be added to the class $A_1$.
Now, there are the following modified sets: $A_1 = \{a_2, a_5, a_7\}$, $A_{MB} = \{a_1, a_3, a_4, a_6\}$ and $X_T = \{x_1, x_3\}$. Using the modified sets $A_{MB}$ and $X_T$, we can start the next step of redistribution ($k = 2$).
The values of $I(a_m)$ are shown in the corresponding sub-column of the column $k = 2$. Using them gives the queue $\gamma_2 = \langle a_6, a_4, a_1, a_3 \rangle$. If we place the state $a_6$ into the class $A_2$, then the values of $L_2$ and $R_2$ do not change. So, the state $a_6$ is included into $A_2$ and excluded from $A_{MB}$. Now, $M_2 = MNP_2 = 3$. So, during the step $j = 2$, no state can be added to the class $A_2$. So, the class $A_2$ is ready.
Now, there are the following modified sets: $A_1 = \{a_2, a_5, a_7\}$, $A_2 = \{a_6, a_8, a_9\}$, $A_{MB} = \{a_1, a_3, a_4\}$ and $X_T = \{x_1\}$. Obviously, these sets are the same as those used as the outcome of Step 4 in our example.

7. Experimental Results

In this section, the results of experiments conducted with the benchmarks [42] are shown. The library [42] consists of 48 benchmarks. The benchmark FSMs are represented by their STTs. To represent the STTs, the format KISS2 is used. These benchmarks have a wide range of basic characteristics (numbers of states, inputs, and outputs). Different researchers use these benchmarks to compare various characteristics of FSM circuits [28,29,32]. The characteristics of benchmarks are shown in Table 8.
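To make the basic characteristics concrete, the sketch below reads a small STT in KISS2 format and derives the numbers of inputs, outputs, states and transitions, together with the number of bits of a maximum binary state code, $R_{MB} = \lceil \log_2 M \rceil$. The embedded machine is a made-up two-state example, not one of the benchmarks of [42]. The function names and the keys N, M and H are chosen here for illustration, and the parser is a minimal sketch that is unrelated to the tool K2F.

```python
import math

# A made-up two-state machine in KISS2 format (NOT a benchmark from the library).
KISS2_TEXT = """\
.i 2
.o 1
.p 4
.s 2
.r st0
0- st0 st0 0
1- st0 st1 0
-0 st1 st1 1
-1 st1 st0 0
.e
"""


def parse_kiss2(text):
    """Split a KISS2 description into header directives and transition rows."""
    header, rows = {}, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("."):
            parts = line.split()
            header[parts[0]] = parts[1] if len(parts) > 1 else None
            continue
        # Each row: input cube, current state, next state, output pattern.
        inputs, current, nxt, outputs = line.split()
        rows.append((inputs, current, nxt, outputs))
    return header, rows


def characteristics(text):
    """Basic FSM characteristics derived from the STT."""
    header, rows = parse_kiss2(text)
    states = {row[1] for row in rows} | {row[2] for row in rows}
    m = len(states)
    return {
        "L": int(header[".i"]),                     # number of inputs
        "N": int(header[".o"]),                     # number of outputs
        "M": m,                                     # number of states
        "H": len(rows),                             # number of transitions
        "R_MB": max(1, math.ceil(math.log2(m))),    # bits of a maximum binary code
    }


info = characteristics(KISS2_TEXT)
print(info)
```

The sum L + R_MB obtained this way is the quantity that is compared with the number of LUT inputs later in this section.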
Our current research is connected with Mealy FSMs that are parts of digital systems. It is known that Mealy FSMs are not stable [6]: fluctuations at the inputs lead to fluctuations at the outputs. This can lead to errors in the operation of the digital system as a whole. To avoid these errors, the FSM inputs should be stabilized. The stabilization presumes using an additional input register (AIR) [30]. When input values stabilize, they are loaded into the AIR. Now, fluctuations at the inputs (which are the outputs of some system's blocks) do not lead to fluctuations at the FSM outputs. However, the AIR consumes some resources of a chip: (1) it requires $L$ additional LUTs and flip-flops and (2) it is synchronized (therefore, the AIR uses some resources of the synchronization tree). So, this register consumes additional LUTs, flip-flops, power and time (it adds some delay to the whole synchronization cycle time). Including the AIR in the evaluated circuits allows taking into account the overhead connected with the stabilization of FSM operation.
The experiments are conducted using a personal computer with the following characteristics: CPU: Intel Core i5-11300H, Memory: 16 GB RAM LPDDR4X. To get the FSM circuits, we use the Virtex-7 VC709 Evaluation Platform (xc7vx690tffg1761-2) [43] by AMD Xilinx. The LUTs of this platform have $S_L = 6$ inputs. The CAD tool Vivado v2019.1 (64-bit) [39] executes the technology mapping. The results of experiments are taken from reports produced by Vivado. To connect the library with Vivado, we use VHDL-based FSM models. These models are obtained by a transformation of the files in KISS2 format into VHDL codes. The transformation is executed by the CAD tool K2F [30].
We have determined three main characteristics of $P_{2C}$ Mealy FSMs. They are: the occupied chip area (the LUT count), performance (both the values of cycle time and maximum operating frequency), and power consumption. We compared the obtained values with the corresponding values for four different FSMs. Three of them are P Mealy FSMs based on: (1) Auto of Vivado (it uses MBCs); (2) One-hot of Vivado; (3) JEDI (it uses MBCs, too). Moreover, for the comparison, we use $P_T$-based FSMs [12] whose circuits we try to improve.
As shown in [30], all main characteristics of LUT-based FSM circuits depend on the relation between the value of $L + R_{MB}$, on the one hand, and the value of $S_L$, on the other hand:
$$
n \cdot S_L < (L + R_{MB}) \le (n + 1) \cdot S_L. \tag{33}
$$
Analysis of Table 8 allows dividing the benchmarks into five sets. The benchmarks belong to the set of trivial FSMs (set 0) if $n = 0$ (it gives $R_{MB} + L \le 6$). The benchmarks belong to the set of simple FSMs (set 1) if $n = 1$ (it gives $R_{MB} + L \le 12$). The benchmarks belong to the set of average FSMs (set 2) if $n = 2$ (it gives $R_{MB} + L \le 18$). The benchmarks belong to the set of big FSMs (set 3) if $n = 3$ (it gives $R_{MB} + L \le 24$). The benchmarks belong to the set of very big FSMs (set 4) if $n = 4$ (it gives the relation $R_{MB} + L > 24$). As research [37] shows, the larger the set number, the bigger the gain from using methods of twofold state assignment.
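The division into sets follows directly from (33). The sketch below is a minimal illustration with a hypothetical function name; it treats every benchmark with $R_{MB} + L > 24$ as belonging to set 4, as the text does, and the sample pairs in the usage lines are invented values rather than entries of Table 8.

```python
def benchmark_set(num_inputs, r_mb, s_l=6):
    """Return the set number (0..4) for an FSM with L inputs and R_MB state bits.

    The value n satisfies n * S_L < L + R_MB <= (n + 1) * S_L, see (33);
    all values n >= 4 are mapped to set 4 (very big FSMs).
    """
    total = num_inputs + r_mb
    if total < 1:
        raise ValueError("L + R_MB must be positive")
    n = (total - 1) // s_l
    return min(n, 4)


# Invented (L, R_MB) pairs, used only to exercise the boundaries.
for l_value, r_value in [(3, 3), (4, 3), (7, 5), (8, 5), (19, 5), (19, 6), (30, 10)]:
    print(l_value, r_value, "-> set", benchmark_set(l_value, r_value))
```

Integer division on `total - 1` places the boundary values 6, 12, 18 and 24 in the lower set, which matches the non-strict right-hand inequality of (33).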
The results of experiments are shown in Table 9, Table 10 and Table 11. These tables are organized in the same manner. The table columns are marked by the names of the investigated methods. The last column includes the number of the benchmark set to which the particular benchmark belongs. The table rows are marked by the names of benchmarks. The row "Total" contains the sums of the values from the columns. The row "Percentage" includes the percentage of the summarized characteristics of FSM circuits produced by other methods relative to $P_{2C}$-based FSMs. We start the analysis of experiments from Table 9. This table contains the values of LUT counts for each benchmark used in the experiments.
As follows from Table 9, the circuits of $P_{2C}$-based FSMs use the minimum number of LUTs compared to other investigated methods. There is the following gain: (1) 36.92% compared to Auto-based FSMs; (2) 56.23% compared to One-hot–based FSMs; (3) 16.11% compared to JEDI-based FSMs; (4) 5.74% compared to $P_T$-based FSMs. In our opinion, this gain is associated with a decrease in the number of variables used in partial state codes (compared to equivalent $P_T$-based FSMs). The second source of a decrease in the LUT counts can be a decrease in the number of partition classes. If the relation $(K + 1) < I$ takes place, then there is a decrease in the required number of LUT inputs for elements of LUTerTY. If the condition (13) is violated but the condition $(K + 1) < S_L$ holds, then the circuit of LUTerTY is multi-level for a $P_T$-based FSM as opposed to the single-level block circuit of an equivalent $P_{2C}$-based FSM.
Careful analysis of the table reveals the following feature of the proposed method: the LUT counts are the same for equivalent $P_T$- and $P_{2C}$-based FSMs from Set 0. This can be explained as follows. For this set, the condition (14) holds. This means that no function $f_j \in D \cup Y$ needs to be decomposed. A single LUT is enough to implement a logic circuit for any function $f_j \in D \cup Y$. In this case, there is the same single class in both partitions, $\pi_A$ and $\pi_V$. Due to this, the block FAB is absent. This means that both $P_T$ and $P_{2C}$ FSMs turn into P FSMs. So, the circuits of $P_T$ and $P_{2C}$ FSMs are the same. Obviously, these circuits have the same values of LUT counts. The same should also hold for the other characteristics of these two models.
Furthermore, from Table 9 we see that the values of LUT counts are the same for some equivalent $P_T$ and $P_{2C}$ FSMs that do not belong to set 0. This phenomenon occurs for the following benchmarks: dk16, ex1, planet, planet1, s1488, s1494, s1a, s420, s510, s810, s832, sand and styr. Analysis of Table 8 reveals the nature of this phenomenon: there are more than $S_L = 6$ bits in the state codes for these FSMs. This means that the following condition holds:
$$
R_{MB} > S_L. \tag{34}
$$
In this case, the condition (14) is violated. This leads to the empty set $A_{MB}$. In turn, this makes the following relations correct: $A_{PC} = A$ and $\pi_A = \pi_V$. So, if the condition (34) holds, then $P_{2C}$ FSMs turn into $P_T$ FSMs. Obviously, the LUT counts are the same for such equivalent $P_{2C}$ and $P_T$ FSMs.
As follows from Table 10, the circuits of $P_{2C}$-based FSMs are the fastest compared to the circuits produced by other investigated methods. There is the following gain: (1) 14.60% compared to Auto-based FSMs; (2) 14.89% compared to One-hot–based FSMs; (3) 8.88% compared to JEDI-based FSMs; (4) 5.46% compared to $P_T$-based FSMs. We think that this gain is due to the fact that in some cases the circuits of $P_{2C}$-based FSMs have fewer levels of LUTs than the circuits of $P_T$-based FSMs. We discussed the reasons for this phenomenon in the analysis of Table 9. It is interesting to note that the average gain in the cycle time almost coincides with the average gain in the LUT counts (for $P_T$- and $P_{2C}$-based FSMs).
As follows from Table 10, for Set 0, the cycle times are the same for equivalent benchmarks using the models of single-core and dual-core FSMs. The explanation is the same as for the equality of LUT counts. Moreover, from Table 10 we can find that the temporal characteristics are the same for the following benchmarks: dk16, ex1, planet, planet1, s1488, s1494, s1a, s420, s510, s810, s832, sand and styr. The reasons for this phenomenon have also been analyzed in the previous paragraphs.
Using values of cycle times, we can trivially compute the values of maximum operating frequencies. These values are shown in Table 11.
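The conversion is the usual reciprocal relation between period and frequency. In the sketch below, the units (cycle time in nanoseconds, frequency in megahertz) are an assumption made for illustration, the function name is hypothetical, and the sample values are invented rather than taken from Table 10.

```python
def max_frequency_mhz(cycle_time_ns):
    """Maximum operating frequency in MHz for a cycle time given in ns.

    ASSUMPTION: f [MHz] = 1000 / T [ns], i.e. f = 1 / T with the stated units.
    """
    if cycle_time_ns <= 0:
        raise ValueError("cycle time must be positive")
    return 1000.0 / cycle_time_ns


# Invented cycle times, used only to show the conversion.
for t_ns in (4.0, 5.0, 8.0):
    print(t_ns, "ns ->", max_frequency_mhz(t_ns), "MHz")
```

Because the relation is reciprocal, the relative gain computed from frequencies generally differs from the relative gain computed from cycle times.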
As follows from Table 11, the circuits of $P_{2C}$-based FSMs have the highest values of maximum operating frequencies compared to the circuits based on other investigated methods. There is the following gain: (1) 12.26% compared to Auto-based FSMs; (2) 12.40% compared to One-hot–based FSMs; (3) 7.09% compared to JEDI-based FSMs; (4) 5.42% compared to equivalent $P_T$-based FSMs. Obviously, the gain in frequency is related to the gain in cycle time. We discussed all the reasons for this phenomenon above.
The value of power consumption is one of the most important characteristics of FSM circuits [44]. Very often, the gain in area-temporal characteristics is accompanied with an increase in the power consumption [27]. Using Vivado reports allows constructing Table 12 with values of consumed power.
The main goal of the proposed method is to obtain FSM circuits with fewer LUTs than the circuits of equivalent $P_T$-based FSMs. Of course, this improvement can lead to an increase in power consumption. As follows from Table 12, this increase is extremely small. Compared to $P_T$-based FSMs, the circuits of equivalent $P_{2C}$-based FSMs consume less than one percent more power (0.76%). If we compare $P_{2C}$-based FSMs with other investigated methods, then there is the following gain: (1) 16.38% compared to Auto-based FSMs; (2) 24.02% compared to One-hot–based FSMs; (3) 1.90% compared to JEDI-based FSMs.
We associate this loss with the following. In $P_T$-based FSMs, the state variables $T_r \in T$ are connected only with the block LUTerV. However, in $P_{2C}$-based FSMs, these variables are connected with the LUTs of both LUTerV and CoreT. This increase in the number of connections leads to an increase in the value of parasitic capacitance in an FSM circuit [26]. Due to this, $P_{2C}$-based FSMs consume more power than equivalent $P_T$-based FSMs. Obviously, this phenomenon does not occur for FSMs from Set 0. Moreover, for the benchmarks dk16, ex1, planet, planet1, s1488, s1494, s1a, s420, s510, s810, s832, sand and styr, both $P_T$- and $P_{2C}$-based FSMs consume equal amounts of power.
So, the proposed method allows obtaining circuits with either better or the same values of area-temporal characteristics compared to equivalent $P_T$-based FSMs. Our main purpose is to get FSM circuits with fewer LUTs than equivalent $P_T$-based FSMs. As follows from the conducted experiments, this goal has been achieved. Furthermore, the proposed method has an additional positive effect: it allows getting faster FSM circuits than the circuits of equivalent $P_T$-based FSMs. Our method loses slightly in terms of the amount of power consumed. However, this loss does not exceed 1% on average. We think that our approach can be used instead of $P_T$ FSMs if area-temporal characteristics determine the optimality of the resulting FSM circuits.

8. Conclusions

Modern FPGAs are very powerful design tools [45]. Nowadays, a single FPGA chip may implement a very complicated digital system. The main drawback of FPGAs is a very small number of LUT inputs [19,46]. This complicates the problem of optimizing the FSM circuits representing sequential blocks of digital systems. Very often, the process of technology mapping for such FSMs is connected with applying various functional decomposition methods. In this case, the resulting LUT-based FSM circuits are multi-level.
The technology mapping can be based on applying various methods of structural decomposition [30]. The research results shown in [11] prove that, very often, the SD leads to FSM circuits with significantly better characteristics compared to their counterparts based on the FD. Our research [12] shows that single-core circuits with the twofold state assignment have better characteristics compared to their FD-based counterparts. However, this approach is connected with using a special transformer creating the extended state codes. This transformer consumes some resources of the FPGA chip used.
In our current article, we propose to use two cores generating systems of partial Boolean functions. This leads to $P_{2C}$ Mealy FSMs where different systems of state variables are used in different cores. Our approach allows reducing LUT counts and improving temporal characteristics in comparison with $P_T$-based FSMs. Note that this gain is associated with a very slight increase in the power consumption (up to 1% on average).
In our future research, we will try to use this approach to optimize Mealy FSM circuits based on various structural decomposition methods. We will also check the possibility of using the double-core approach for optimizing the circuits of LUT-based Moore FSMs. We hope these methods can be used for implementing sequential devices of modern embedded systems.

Author Contributions

Conceptualization, A.B., L.T. and K.K.; methodology, A.B., L.T. and K.K.; formal analysis, A.B., L.T. and K.K.; writing—original draft preparation, A.B., L.T. and K.K.; supervision, A.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are available in the article.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AIR: additional input register
CAD: computer aided design
CLB: configurable logic block
ESC: extended state codes
FAB: function assembly block
FD: functional decomposition
FPGA: field-programmable gate array
FSM: finite state machine
IMF: input memory function
LUT: look-up table
MBC: maximum binary state codes
PBF: partial Boolean functions
SBF: system of Boolean functions
SD: structural decomposition
SOP: sum of products
STG: state transitions graph
STT: state transition table
TSA: twofold state assignment

References

  1. Alur, R. Principles of Cyber-Physical Systems; MIT Press: Cambridge, MA, USA, 2015.
  2. Lee, E.A.; Seshia, S.A. Introduction to Embedded Systems: A Cyber-Physical Systems Approach; MIT Press: Cambridge, MA, USA, 2016.
  3. Marwedel, P. Embedded System Design: Embedded Systems Foundations of Cyber-Physical Systems, and the Internet of Things, 3rd ed.; Springer International Publishing: Cham, Switzerland, 2018.
  4. Gajski, D.D.; Abdi, S.; Gerstlauer, A.; Schirner, G. Embedded System Design: Modeling, Synthesis and Verification; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2009.
  5. Gazi, O.; Arli, A. State Machines Using VHDL: FPGA Implementation of Serial Communication and Display Protocols; Springer: Berlin, Germany, 2021; p. 326.
  6. Baranov, S. Logic Synthesis of Control Automata; Kluwer Academic Publishers: Dordrecht, The Netherlands, 1994.
  7. Czerwinski, R.; Kania, D. Finite State Machine Logic Synthesis for Complex Programmable Logic Devices; Vol. 231 of Lecture Notes in Electrical Engineering; Springer: Berlin/Heidelberg, Germany, 2013.
  8. Baranov, S. Finite State Machines and Algorithmic State Machines: Fast and Simple Design of Complex Finite State Machines; Amazon: Seattle, WA, USA, 2018; p. 185.
  9. Micheli, G.D. Synthesis and Optimization of Digital Circuits; McGraw-Hill: Cambridge, MA, USA, 1994.
  10. Islam, M.M.; Hossain, M.S.; Shahjalal, M.D.; Hasan, M.K.; Jang, Y.M. Area-time efficient hardware implementation of modular multiplication for elliptic curve cryptography. IEEE Access 2020, 8, 73898–73906.
  11. Barkalov, A.; Titarenko, L.; Krzywicki, K. Structural Decomposition in FSM Design: Roots, Evolution, Current State—A Review. Electronics 2021, 10, 1174.
  12. Barkalov, O.; Titarenko, L.; Mielcarek, K. Hardware reduction for LUT-based Mealy FSMs. Int. J. Appl. Math. Comput. Sci. 2018, 28, 595–607.
  13. Wisniewski, R. Design of Petri Net-Based Cyber-Physical Systems Oriented on the Implementation in Field Programmable Gate Arrays. Energies 2021, 14, 7054.
  14. Klimowicz, A.; Salauyou, V. State Merging and Splitting Strategies for Finite State Machines Implemented in FPGA. Appl. Sci. 2022, 12, 8134.
  15. Conti, V.; Rundo, L.; Billeci, G.D.; Militello, C.; Vitabile, S. Energy Efficiency Evaluation of Dynamic Partial Reconfiguration in Field Programmable Gate Arrays: An Experimental Case Study. Energies 2018, 11, 739.
  16. Barkalov, A.; Titarenko, L.; Krzywicki, K. Improving Hardware in LUT-Based Mealy FSMs. Appl. Sci. 2022, 12, 8065.
  17. Milik, A.; Kubica, M.; Kania, D. Reconfigurable Logic Controller—Direct FPGA Synthesis Approach. Appl. Sci. 2021, 11, 8515.
  18. Kubica, M.; Kania, D.; Kulisz, J. A technology mapping of fsms based on a graph of excitations and outputs. IEEE Access 2019, 7, 16123–16131.
  19. AMD Xilinx FPGAs. Available online: https://www.xilinx.com/products/silicon-devices/fpga.html (accessed on 25 May 2022).
  20. Kubica, M.; Opara, A.; Kania, D. Logic Synthesis Strategy Oriented to Low Power Optimization. Appl. Sci. 2021, 11, 8797.
  21. Skliarova, D.; Sklyarov, V.A.; Sudnitson, A. Design of FPGA-Based Circuits Using Hierarchical Finite State Machines; TUT Press: Vancouver, BC, Canada, 2012.
  22. Trimberger, S.M. Field-Programmable Gate Array Technology; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2012.
  23. Skliarova, I. A Survey of Network-Based Hardware Accelerators. Electronics 2022, 11, 1029.
  24. Mishchenko, A.; Brayton, R.; Jiang, J.H.; Jang, S. Scalable don’t-care-based logic optimization and resynthesis. ACM Trans. Reconfig. Technol. Syst. (TRETS) 2011, 4, 1–23.
  25. El-Maleh, A.H. A Probabilistic Tabu Search State Assignment Algorithm for Area and Power Optimization of Sequential Circuits. Arab. J. Sci. Eng. 2020, 45, 6273–6285.
  26. Feng, W.; Greene, J.; Mishchenko, A. Improving FPGA performance with a S44 LUT structure. In Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, Monterey, CA, USA, 25–27 February 2018; pp. 61–66.
  27. Jozwiak, L.; Slusarczyk, A.; Chojnacki, A. Fast and compact sequential circuits for the FPGA-based reconfigurable systems. J. Syst. Archit. 2003, 49, 227–246.
  28. Senhadji-Navarro, R.; Garcia-Vargas, I. Mapping Arbitrary Logic Functions onto Carry Chains in FPGAs. Electronics 2022, 11, 27.
  29. Kubica, M.; Opara, A.; Kania, D. Technology Mapping for LUT-Based FPGA; Springer: Berlin, Germany, 2021; p. 208.
  30. Barkalov, A.; Titarenko, L.; Mielcarek, K.; Chmielewski, S. Logic Synthesis for FPGA-Based Control Units—Structural Decomposition in Logic Design; Vol. 636 of Lecture Notes in Electrical Engineering; Springer: Berlin/Heidelberg, Germany, 2020.
  31. Salauyou, V.; Ostapczuk, M. State Assignment of Finite-State Machines by Using the Values of Output Variables. In Theory and Applications of Dependable Computer Systems. DepCoS-RELCOMEX 2020. Advances in Intelligent Systems and Computing; Zamojski, W., Mazurkiewicz, J., Sugier, J., Walkowiak, T., Kacprzyk, J., Eds.; Springer: Cham, Switzerland, 2020; Volume 1173, pp. 543–553.
  32. Park, J.; Yoo, H. Area-efficient fault tolerance encoding for Finite State Machines. Electronics 2020, 9, 1110.
  33. Baranov, S. From Algorithm to Digital System: HSL and RTL tool Sinthagate in Digital System Design; Amazon: Seattle, WA, USA, 2020; p. 76.
  34. Chapman, K. Multiplexer Design Techniques for Data-Path Performance with Minimized Routing Resources. Xilinx All Programmable. V. 1.2. 31 October 2014, pp. 1–32. Available online: https://docs.xilinx.com/v/u/en-US/xapp522-mux-design-techniques (accessed on 25 September 2022).
  35. Scholl, C. Functional Decomposition with Application to FPGA Synthesis; Kluwer Academic Publishers: Boston, MA, USA, 2001.
  36. Khatri, S.P.; Gulati, K. Advanced Techniques in Logic Synthesis, Optimizations and Applications; Springer: New York, NY, USA, 2011.
  37. Barkalov, A.; Titarenko, L.; Mielcarek, K. Improving characteristics of LUT-based Mealy FSMs. Int. J. Appl. Math. Comput. Sci. 2020, 30, 745–759.
  38. Tiwari, A.; Tomko, K.A. Saving power by mapping finite-state machines into embedded memory blocks in FPGAs. In Proceedings of the Design, Automation and Test in Europe Conference and Exhibition, Paris, France, 16–20 February 2004; IEEE: Piscataway, NJ, USA, 2004; Volume 2, pp. 916–921.
  39. Vivado Design Suite User Guide: Synthesis; UG901 (v2019.1). Available online: https://www.xilinx.com/support/documentation/sw_manuals/xilinx2019_1/ug901-vivado-synthesis.pdf (accessed on 29 August 2022).
  40. Kubica, M.; Kania, D. Technology mapping oriented to adaptive logic modules. Bull. Pol. Acad. Sci. 2019, 67, 947–956.
  41. Mishchenko, A.; Chatterjee, S.; Brayton, R. Improvements to technology mapping for LUT-based FPGAs. IEEE Trans. CAD 2006, 27, 240–253.
  42. McElvain, K. LGSynth93 Benchmark; Mentor Graphics: Wilsonville, OR, USA, 1993.
  43. VC709 Evaluation Board for the Virtex-7 FPGA User Guide; UG887 (v1.6); Xilinx, Inc.: San Jose, CA, USA, 2019.
  44. El-Maleh, A.H. A probabilistic pairwise swap search state assignment algorithm for sequential circuit optimization. Integration 2017, 56, 32–43.
  45. Ruiz-Rosero, J.; Ramirez-Gonzalez, G.; Khanna, R. Field-Programmable Gate Array Applications—A Scientometric Review. Computation 2019, 7, 63.
  46. Altera. Cyclone IV Device Handbook. Available online: http://www.altera.com/literature/hb/cyclone-iv/cyclone4-handbook.pdf (accessed on 29 August 2022).
Figure 1. Structural diagram of P Mealy FSM.
Figure 2. Structural diagram of $P_T$ Mealy FSM.
Figure 3. Generalized diagram of $P_T$ Mealy FSM.
Figure 4. Generalized diagram of $P_{2C}$ Mealy FSM.
Figure 5. State transition graph of Mealy FSM $S_1$.
Figure 6. Outcome of state assignment for Mealy FSM $S_1$.
Figure 7. Partial state codes for Mealy FSM $S_1$.
Figure 8. Karnaugh map for SBF $V(T)$.
Figure 9. Block-diagram of state redistribution algorithm.
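Table 1. State transition table of Mealy FSM $S_1$.

| $a_m$ | $a_s$ | $X_h$ | $Y_h$ | $h$ |
|---|---|---|---|---|
| $a_1$ | $a_2$ | $x_1$ | $y_1 y_5$ | 1 |
| | $a_3$ | $\bar{x}_1$ | $y_4$ | 2 |
| $a_2$ | $a_5$ | $x_2$ | $y_1 y_2$ | 3 |
| | $a_6$ | $\bar{x}_2 x_5$ | $y_6$ | 4 |
| | $a_3$ | $\bar{x}_2 \bar{x}_5$ | $y_5 y_8$ | 5 |
| $a_3$ | $a_6$ | $x_1$ | $y_5 y_8$ | 6 |
| | $a_1$ | $\bar{x}_1$ | $y_4$ | 7 |
| $a_4$ | $a_1$ | 1 | $y_1 y_4$ | 8 |
| $a_5$ | $a_5$ | $x_5$ | $y_2$ | 9 |
| | $a_2$ | $\bar{x}_5 x_6$ | $y_1 y_6$ | 10 |
| | $a_7$ | $\bar{x}_5 \bar{x}_6$ | $y_8$ | 11 |
| $a_6$ | $a_4$ | $x_3$ | $y_1 y_3$ | 12 |
| | $a_5$ | $\bar{x}_3$ | $y_7$ | 13 |
| $a_7$ | $a_4$ | $x_2$ | $y_2 y_6$ | 14 |
| | $a_8$ | $\bar{x}_2$ | $y_5$ | 15 |
| $a_8$ | $a_7$ | $x_3$ | $y_3$ | 16 |
| | $a_4$ | $\bar{x}_3 x_7$ | $y_1 y_7$ | 17 |
| | $a_9$ | $\bar{x}_3 \bar{x}_7$ | $y_5 y_8$ | 18 |
| $a_9$ | $a_4$ | $x_5$ | $y_3 y_7$ | 19 |
| | $a_1$ | $\bar{x}_5 x_7$ | $y_1$ | 20 |
| | $a_8$ | $\bar{x}_5 \bar{x}_7$ | $y_7 y_8$ | 21 |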
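Table 2. Table of CoreT for Mealy FSM $S_1$.

| $a_m$ | $K(a_m)$ | $a_s$ | $K(a_s)$ | $X_h^0$ | $Y_h^0$ | $D_h^0$ | $h$ |
|---|---|---|---|---|---|---|---|
| $a_1$ | 0000 | $a_2$ | 0100 | $x_1$ | $y_1 y_5$ | $D_2$ | 1 |
| | | $a_3$ | 0001 | $\bar{x}_1$ | $y_4$ | $D_4$ | 2 |
| $a_3$ | 0001 | $a_6$ | 1000 | $x_1$ | $y_5 y_8$ | $D_1$ | 3 |
| | | $a_1$ | 0000 | $\bar{x}_1$ | $y_4$ | | 4 |
| $a_4$ | 0010 | $a_1$ | 0000 | 1 | $y_1 y_4$ | | 5 |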
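Table 3. Table of LUTer1.

| $a_m$ | $C(a_m)$ | $a_s$ | $K(a_s)$ | $X_h^1$ | $Y_h^1$ | $D_h^1$ | $h$ |
|---|---|---|---|---|---|---|---|
| $a_2$ | 01 | $a_5$ | 0101 | $x_2$ | $y_1 y_2$ | $D_2 D_4$ | 1 |
| | | $a_6$ | 1000 | $\bar{x}_2 x_5$ | $y_6$ | $D_1$ | 2 |
| | | $a_3$ | 0001 | $\bar{x}_2 \bar{x}_5$ | $y_5 y_8$ | $D_4$ | 3 |
| $a_5$ | 10 | $a_5$ | 0101 | $x_5$ | $y_2$ | $D_2 D_4$ | 4 |
| | | $a_2$ | 0100 | $\bar{x}_5 x_6$ | $y_1 y_6$ | $D_2$ | 5 |
| | | $a_7$ | 0110 | $\bar{x}_5 \bar{x}_6$ | $y_8$ | $D_2 D_3$ | 6 |
| $a_7$ | 11 | $a_4$ | 0010 | $x_2$ | $y_2 y_6$ | $D_3$ | 7 |
| | | $a_8$ | 1001 | $\bar{x}_2$ | $y_5$ | $D_1 D_4$ | 8 |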
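Table 4. Table of LUTer2.

| $a_m$ | $C(a_m)$ | $a_s$ | $K(a_s)$ | $X_h^2$ | $Y_h^2$ | $D_h^2$ | $h$ |
|---|---|---|---|---|---|---|---|
| $a_6$ | 01 | $a_4$ | 0010 | $x_3$ | $y_1 y_3$ | $D_3$ | 1 |
| | | $a_5$ | 0101 | $\bar{x}_3$ | $y_7$ | $D_2 D_4$ | 2 |
| $a_8$ | 10 | $a_7$ | 0110 | $x_3$ | $y_3$ | $D_2 D_3$ | 3 |
| | | $a_4$ | 0010 | $\bar{x}_3 x_7$ | $y_1 y_7$ | $D_3$ | 4 |
| | | $a_9$ | 1010 | $\bar{x}_3 \bar{x}_7$ | $y_5 y_8$ | $D_1 D_3$ | 5 |
| $a_9$ | 11 | $a_4$ | 0010 | $x_5$ | $y_3 y_7$ | $D_2$ | 6 |
| | | $a_1$ | 0000 | $\bar{x}_5 x_7$ | $y_1$ | | 7 |
| | | $a_8$ | 1001 | $\bar{x}_5 \bar{x}_7$ | $y_7 y_8$ | $D_1 D_4$ | 8 |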
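Table 5. Table of LUTerTY.

| $f_j$ | CoreT | CoreV: 1 | CoreV: 2 |
|---|---|---|---|
| $D_1$ | 1 | 1 | 1 |
| $D_2$ | 1 | 1 | 1 |
| $D_3$ | 0 | 1 | 1 |
| $D_4$ | 1 | 1 | 1 |
| $y_1$ | 1 | 1 | 1 |
| $y_2$ | 0 | 1 | 0 |
| $y_3$ | 0 | 0 | 1 |
| $y_4$ | 1 | 0 | 0 |
| $y_5$ | 1 | 1 | 1 |
| $y_6$ | 0 | 1 | 0 |
| $y_7$ | 0 | 0 | 1 |
| $y_8$ | 1 | 1 | 1 |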
Table 6. Table of LUTerV.

| $a_m$ | $K(a_m)$ | $C(a_m)$ | $V_m$ |
|---|---|---|---|
| $a_2$ | 0100 | 0100 | $v_2$ |
| $a_5$ | 0101 | 1000 | $v_1$ |
| $a_6$ | 1000 | 0001 | $v_4$ |
| $a_7$ | 0110 | 1100 | $v_1 v_2$ |
| $a_8$ | 1001 | 0010 | $v_3$ |
| $a_9$ | 1010 | 0011 | $v_3 v_4$ |
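In Tables 2 to 4 and Table 6, the functions listed in the $D_h$ and $V_m$ columns match the positions of the 1 bits in the corresponding code: $K(a_s)$ = 0101 goes with $D_2 D_4$, and $C(a_m)$ = 1100 goes with $v_1 v_2$. The following Python sketch illustrates that reading of the tables. The helper name `ones_to_vars` is hypothetical, and the rule itself is inferred from the table rows rather than quoted from the authors.

```python
def ones_to_vars(code: str, prefix: str) -> list[str]:
    """List the variables whose (1-based) bit position holds a 1 in the code."""
    return [f"{prefix}{i}" for i, bit in enumerate(code, start=1) if bit == "1"]


# Extended state codes C(a_m), copied from Table 6.
extended_codes = {
    "a2": "0100",
    "a5": "1000",
    "a6": "0001",
    "a7": "1100",
    "a8": "0010",
    "a9": "0011",
}

# V_m column as implied by the 1 bits of each extended code.
v_functions = {state: ones_to_vars(code, "v") for state, code in extended_codes.items()}

for state, variables in v_functions.items():
    print(state, " ".join(variables))
```

The same helper applied to a next-state code, for example `ones_to_vars("0101", "D")`, gives the input memory functions shown for that transition in Table 3.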
Table 7. Redistribution process for FSM $S_1$.
k = 1 k = 2
A MB I ( a m ) γ 1 j = 1 j = 2 A MB I ( a m ) γ 2 j = 1 j = 2
a 1 1 4 a 1 1 3
a 3 1 5 a 3 1 4
a 4 02 a 4 02
a 6 03 a 6 11
a 7 11
a m a 7 a m a 6
A 1 a 2 a 5 a 7 A 2 a 8 a 9 a 6
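Table 8. Characteristics of benchmarks [42].

| Benchmark | $L$ | $N$ | $R_{MB}+L$ | $M/R_{MB}$ | $H$ | Class |
|---|---|---|---|---|---|---|
| bbara | 4 | 2 | 8 | 12/4 | 60 | 1 |
| bbsse | 7 | 7 | 12 | 26/5 | 56 | 1 |
| bbtas | 2 | 2 | 6 | 9/4 | 24 | 0 |
| beecount | 3 | 4 | 7 | 10/4 | 28 | 1 |
| cse | 7 | 7 | 12 | 32/5 | 91 | 1 |
| dk14 | 3 | 5 | 8 | 26/5 | 56 | 1 |
| dk15 | 3 | 5 | 8 | 17/5 | 32 | 1 |
| dk16 | 2 | 3 | 9 | 75/7 | 108 | 1 |
| dk17 | 2 | 3 | 6 | 16/4 | 32 | 0 |
| dk27 | 1 | 2 | 5 | 10/4 | 14 | 0 |
| dk512 | 1 | 3 | 6 | 24/5 | 15 | 0 |
| donfile | 2 | 1 | 7 | 24/5 | 96 | 1 |
| ex1 | 9 | 19 | 16 | 80/7 | 138 | 2 |
| ex2 | 2 | 2 | 7 | 25/5 | 72 | 1 |
| ex3 | 2 | 2 | 6 | 14/4 | 36 | 0 |
| ex4 | 6 | 9 | 11 | 18/5 | 21 | 1 |
| ex5 | 2 | 2 | 6 | 16/4 | 32 | 0 |
| ex6 | 5 | 8 | 9 | 14/4 | 34 | 1 |
| ex7 | 2 | 2 | 12 | 17/5 | 36 | 1 |
| keyb | 7 | 7 | 12 | 22/5 | 170 | 1 |
| kirkman | 12 | 6 | 18 | 48/6 | 370 | 2 |
| lion | 2 | 1 | 5 | 5/3 | 11 | 0 |
| lion9 | 2 | 1 | 6 | 11/4 | 25 | 0 |
| mark1 | 5 | 16 | 10 | 22/5 | 22 | 1 |
| mc | 3 | 5 | 6 | 8/3 | 10 | 0 |
| modulo12 | 1 | 1 | 5 | 12/4 | 24 | 0 |
| opus | 5 | 6 | 10 | 18/5 | 22 | 1 |
| planet | 7 | 19 | 14 | 86/7 | 115 | 2 |
| planet1 | 7 | 19 | 14 | 86/7 | 115 | 2 |
| pma | 8 | 8 | 14 | 49/6 | 73 | 2 |
| s1 | 8 | 7 | 14 | 54/6 | 106 | 2 |
| s1488 | 8 | 19 | 15 | 112/7 | 251 | 2 |
| s1494 | 8 | 19 | 15 | 118/7 | 250 | 2 |
| s1a | 8 | 6 | 15 | 86/7 | 107 | 2 |
| s208 | 11 | 2 | 17 | 37/6 | 153 | 2 |
| s27 | 4 | 1 | 8 | 11/4 | 34 | 1 |
| s386 | 7 | 7 | 12 | 23/5 | 64 | 1 |
| s420 | 19 | 2 | 27 | 137/8 | 137 | 4 |
| s510 | 19 | 7 | 27 | 172/8 | 77 | 4 |
| s8 | 4 | 1 | 8 | 15/4 | 20 | 1 |
| s820 | 18 | 19 | 25 | 78/7 | 232 | 4 |
| s832 | 18 | 19 | 25 | 76/7 | 245 | 4 |
| sand | 11 | 9 | 18 | 88/7 | 184 | 3 |
| shiftreg | 1 | 1 | 5 | 16/4 | 16 | 0 |
| sse | 7 | 7 | 12 | 26/5 | 56 | 1 |
| styr | 9 | 10 | 16 | 67/7 | 166 | 2 |
| tma | 7 | 9 | 13 | 63/6 | 44 | 2 |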
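The $R_{MB}$ values in Table 8 are consistent with the smallest number of bits that can encode $M$ states, that is $\lceil \log_2 M \rceil$, and the $R_{MB}+L$ column then adds the number of FSM inputs. The short Python sketch below checks this on a few rows copied from the table. It assumes that this is how $R_{MB}$ is obtained, and the helper name `min_code_bits` is illustrative only.

```python
def min_code_bits(num_states: int) -> int:
    """Smallest R with 2**R >= num_states, i.e. ceil(log2(M)) for M >= 2."""
    return (num_states - 1).bit_length()


# (benchmark, L, M, R_MB, R_MB + L), copied from Table 8.
rows = [
    ("bbara", 4, 12, 4, 8),
    ("dk16", 2, 75, 7, 9),
    ("lion", 2, 5, 3, 5),
    ("s420", 19, 137, 8, 27),
    ("shiftreg", 1, 16, 4, 5),
]

# For each row, confirm both the code width and the R_MB + L column.
checks = [
    (name, min_code_bits(m) == r_mb and r_mb + inputs == total)
    for name, inputs, m, r_mb, total in rows
]

for name, ok in checks:
    print(name, "consistent" if ok else "differs")
```

Using integer `bit_length` instead of a floating-point logarithm avoids rounding problems when $M$ is an exact power of two.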
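Table 9. Experimental results (the LUT counts).

| Benchmark | Auto | One-Hot | JEDI | $P_T$ | Our Approach | Set |
|---|---|---|---|---|---|---|
| bbara | 21 | 21 | 14 | 15 | 14 | 1 |
| bbsse | 40 | 44 | 31 | 29 | 24 | 1 |
| bbtas | 7 | 7 | 7 | 7 | 7 | 0 |
| beecount | 22 | 22 | 17 | 15 | 13 | 1 |
| cse | 47 | 73 | 43 | 39 | 36 | 1 |
| dk14 | 19 | 30 | 13 | 15 | 13 | 1 |
| dk15 | 18 | 19 | 15 | 9 | 9 | 1 |
| dk16 | 17 | 36 | 14 | 12 | 12 | 1 |
| dk17 | 7 | 14 | 7 | 7 | 7 | 0 |
| dk27 | 4 | 6 | 5 | 7 | 7 | 0 |
| dk512 | 11 | 11 | 10 | 9 | 9 | 0 |
| donfile | 33 | 33 | 26 | 21 | 18 | 1 |
| ex1 | 79 | 83 | 62 | 51 | 51 | 2 |
| ex2 | 11 | 11 | 10 | 10 | 9 | 1 |
| ex3 | 11 | 11 | 11 | 10 | 9 | 0 |
| ex4 | 21 | 19 | 18 | 16 | 13 | 1 |
| ex5 | 11 | 11 | 11 | 10 | 10 | 0 |
| ex6 | 29 | 41 | 27 | 25 | 20 | 1 |
| ex7 | 6 | 7 | 6 | 6 | 6 | 1 |
| keyb | 50 | 68 | 47 | 44 | 39 | 1 |
| kirkman | 54 | 70 | 51 | 47 | 39 | 2 |
| lion | 4 | 7 | 4 | 4 | 4 | 0 |
| lion9 | 8 | 13 | 7 | 7 | 7 | 0 |
| mark1 | 28 | 28 | 25 | 23 | 20 | 1 |
| mc | 7 | 10 | 7 | 7 | 7 | 0 |
| modulo12 | 8 | 8 | 8 | 8 | 8 | 0 |
| opus | 33 | 33 | 27 | 28 | 24 | 1 |
| planet | 138 | 138 | 95 | 87 | 87 | 2 |
| planet1 | 138 | 138 | 95 | 87 | 87 | 2 |
| pma | 102 | 102 | 94 | 86 | 77 | 2 |
| s1 | 73 | 107 | 69 | 65 | 59 | 2 |
| s1488 | 132 | 139 | 116 | 100 | 100 | 2 |
| s1494 | 134 | 140 | 118 | 102 | 102 | 2 |
| s1a | 57 | 89 | 51 | 49 | 49 | 2 |
| s208 | 23 | 42 | 21 | 20 | 17 | 2 |
| s27 | 10 | 22 | 10 | 10 | 10 | 1 |
| s386 | 33 | 46 | 29 | 25 | 20 | 1 |
| s420 | 29 | 50 | 28 | 27 | 27 | 4 |
| s510 | 67 | 67 | 51 | 48 | 48 | 4 |
| s820 | 13 | 13 | 13 | 14 | 14 | 1 |
| s832 | 106 | 100 | 86 | 76 | 76 | 4 |
| s840 | 98 | 97 | 80 | 72 | 65 | 4 |
| sand | 143 | 143 | 125 | 112 | 112 | 3 |
| shiftreg | 3 | 7 | 3 | 5 | 5 | 0 |
| sse | 40 | 44 | 37 | 33 | 29 | 1 |
| styr | 102 | 129 | 90 | 82 | 82 | 2 |
| tma | 52 | 46 | 46 | 40 | 34 | 2 |
| Total | 2099 | 2395 | 1780 | 1621 | 1533 | |
| Percentage, % | 136.92 | 156.23 | 116.11 | 105.74 | 100.00 | |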
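The "Percentage, %" row of Table 9 expresses each method's total LUT count relative to the total of the proposed approach, which is taken as 100%. A minimal Python sketch of that normalization, using the Total row as printed in the table (the variable names are illustrative):

```python
# Total LUT counts per method, copied from the "Total" row of Table 9.
totals = {
    "Auto": 2099,
    "One-Hot": 2395,
    "JEDI": 1780,
    "P_T": 1621,
    "Our Approach": 1533,
}

baseline = totals["Our Approach"]

# Each total as a percentage of the baseline, rounded to two decimals.
percentages = {name: round(100 * value / baseline, 2) for name, value in totals.items()}

# Gain of the proposed approach over each method, in percentage points.
gains = {name: round(pct - 100.0, 2) for name, pct in percentages.items()}

for name in totals:
    print(f"{name}: {percentages[name]:.2f}% (gain {gains[name]:.2f})")
```

The smallest and largest gains among the four compared methods obtained this way, 5.74 and 36.92, are the bounds quoted in the abstract for the Auto, JEDI, and $P_T$ comparisons.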
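Table 10. Experimental results (the minimum cycle time, nanoseconds).

| Benchmark | Auto | One-Hot | JEDI | $P_T$ | Our Approach | Set |
|---|---|---|---|---|---|---|
| bbara | 8.811 | 8.811 | 8.352 | 8.394 | 7.601 | 1 |
| bbsse | 10.096 | 9.642 | 9.213 | 8.763 | 7.924 | 1 |
| bbtas | 8.497 | 8.497 | 8.451 | 8.497 | 8.497 | 0 |
| beecount | 9.605 | 9.605 | 8.941 | 8.568 | 7.740 | 1 |
| cse | 10.558 | 9.840 | 9.343 | 8.570 | 7.764 | 1 |
| dk14 | 8.821 | 9.395 | 8.762 | 8.964 | 8.070 | 1 |
| dk15 | 8.797 | 8.998 | 8.735 | 8.890 | 8.009 | 1 |
| dk16 | 9.491 | 9.320 | 8.672 | 8.327 | 7.539 | 1 |
| dk17 | 8.617 | 9.587 | 8.617 | 8.617 | 8.602 | 0 |
| dk27 | 8.325 | 8.424 | 8.369 | 8.325 | 8.325 | 0 |
| dk512 | 8.566 | 8.566 | 8.477 | 8.566 | 8.566 | 0 |
| donfile | 9.033 | 9.034 | 8.509 | 7.916 | 7.628 | 1 |
| ex1 | 10.425 | 10.955 | 9.454 | 8.496 | 8.496 | 2 |
| ex2 | 8.635 | 8.635 | 8.596 | 8.566 | 7.738 | 1 |
| ex3 | 8.731 | 8.731 | 8.707 | 8.731 | 8.731 | 0 |
| ex4 | 9.214 | 9.315 | 8.874 | 8.745 | 7.902 | 1 |
| ex5 | 9.147 | 9.147 | 9.119 | 9.147 | 9.147 | 0 |
| ex6 | 9.564 | 9.772 | 9.330 | 8.701 | 7.863 | 1 |
| ex7 | 8.598 | 8.578 | 8.584 | 8.582 | 7.751 | 1 |
| keyb | 10.121 | 10.699 | 9.666 | 9.063 | 8.174 | 1 |
| kirkman | 10.971 | 10.392 | 10.280 | 9.621 | 8.300 | 2 |
| lion | 8.539 | 8.501 | 8.541 | 8.595 | 8.595 | 0 |
| lion9 | 8.470 | 8.998 | 8.444 | 8.427 | 8.427 | 0 |
| mark1 | 9.825 | 9.825 | 9.343 | 8.942 | 8.063 | 1 |
| mc | 8.688 | 8.719 | 8.682 | 8.688 | 8.688 | 0 |
| modulo12 | 8.302 | 8.302 | 8.299 | 8.302 | 8.302 | 0 |
| opus | 9.684 | 9.684 | 9.275 | 9.290 | 8.353 | 1 |
| planet | 11.264 | 11.264 | 9.073 | 8.897 | 8.897 | 2 |
| planet1 | 11.264 | 11.264 | 9.073 | 8.897 | 8.897 | 2 |
| pma | 10.634 | 10.634 | 9.681 | 9.215 | 7.963 | 2 |
| s1 | 10.623 | 11.154 | 10.156 | 9.669 | 8.308 | 2 |
| s1488 | 11.013 | 11.372 | 10.155 | 9.114 | 9.114 | 2 |
| s1494 | 10.487 | 10.654 | 9.878 | 9.163 | 9.163 | 2 |
| s1a | 10.313 | 9.462 | 9.704 | 9.385 | 9.385 | 2 |
| s208 | 9.503 | 9.434 | 9.361 | 8.859 | 7.684 | 2 |
| s27 | 8.672 | 8.862 | 8.662 | 8.671 | 7.832 | 1 |
| s386 | 9.676 | 9.494 | 9.311 | 9.205 | 8.298 | 1 |
| s420 | 9.864 | 9.780 | 9.755 | 9.619 | 9.619 | 4 |
| s510 | 9.742 | 9.742 | 9.155 | 8.889 | 8.889 | 4 |
| s820 | 10.691 | 10.641 | 9.775 | 9.317 | 9.317 | 1 |
| s832 | 10.975 | 10.638 | 9.866 | 9.297 | 9.297 | 4 |
| s840 | 9.195 | 9.228 | 9.158 | 9.248 | 8.321 | 4 |
| sand | 12.390 | 12.390 | 11.652 | 9.895 | 9.895 | 3 |
| shiftreg | 8.302 | 7.265 | 7.091 | 8.302 | 8.302 | 0 |
| sse | 10.096 | 9.642 | 9.455 | 9.002 | 8.597 | 1 |
| styr | 11.067 | 11.497 | 10.666 | 9.398 | 9.398 | 2 |
| tma | 9.831 | 10.495 | 9.821 | 9.247 | 7.974 | 2 |
| Total | 453.73 | 454.88 | 431.08 | 417.58 | 395.94 | |
| Percentage, % | 114.60 | 114.89 | 108.88 | 105.46 | 100.00 | |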
Table 11. Experimental results (the maximum operating frequency, MHz).

| Benchmark | Auto | One-Hot | JEDI | $P_T$ | Our Approach | Set |
|---|---|---|---|---|---|---|
| bbara | 113.496 | 113.496 | 119.727 | 119.139 | 131.556 | 1 |
| bbsse | 99.049 | 103.713 | 108.539 | 114.116 | 126.199 | 1 |
| bbtas | 117.687 | 117.687 | 118.336 | 117.687 | 117.687 | 0 |
| beecount | 104.112 | 104.112 | 111.839 | 116.720 | 129.199 | 1 |
| cse | 94.713 | 101.626 | 107.030 | 116.680 | 128.807 | 1 |
| dk14 | 113.364 | 106.439 | 114.134 | 111.556 | 123.908 | 1 |
| dk15 | 113.675 | 111.137 | 114.487 | 112.485 | 124.862 | 1 |
| dk16 | 105.362 | 107.294 | 115.316 | 120.096 | 132.647 | 1 |
| dk17 | 116.049 | 104.308 | 116.049 | 116.049 | 116.249 | 0 |
| dk27 | 120.122 | 118.709 | 119.494 | 120.122 | 120.122 | 0 |
| dk512 | 116.740 | 116.740 | 117.963 | 116.740 | 116.740 | 0 |
| donfile | 110.706 | 110.696 | 117.517 | 126.323 | 131.093 | 1 |
| ex1 | 95.922 | 91.281 | 105.777 | 117.700 | 117.700 | 2 |
| ex2 | 115.808 | 115.808 | 116.340 | 116.744 | 129.234 | 1 |
| ex3 | 114.536 | 114.536 | 114.846 | 114.536 | 114.536 | 0 |
| ex4 | 108.530 | 107.352 | 112.690 | 114.356 | 126.552 | 1 |
| ex5 | 109.327 | 109.327 | 109.661 | 109.327 | 109.327 | 0 |
| ex6 | 104.556 | 102.333 | 107.183 | 114.930 | 127.186 | 1 |
| ex7 | 116.306 | 116.576 | 116.495 | 116.526 | 129.011 | 1 |
| keyb | 98.806 | 93.466 | 103.453 | 110.340 | 122.342 | 1 |
| kirkman | 91.148 | 96.232 | 97.272 | 103.938 | 120.476 | 2 |
| lion | 117.110 | 117.634 | 117.083 | 116.353 | 116.353 | 0 |
| lion9 | 118.065 | 111.136 | 118.421 | 118.668 | 118.668 | 0 |
| mark1 | 101.781 | 101.781 | 107.032 | 111.834 | 124.020 | 1 |
| mc | 115.102 | 114.694 | 115.174 | 115.102 | 115.102 | 0 |
| modulo12 | 120.454 | 120.454 | 120.498 | 120.454 | 120.454 | 0 |
| opus | 103.265 | 103.265 | 107.818 | 107.642 | 119.717 | 1 |
| planet | 88.777 | 88.777 | 110.222 | 112.395 | 112.395 | 2 |
| planet1 | 88.777 | 88.777 | 110.222 | 112.395 | 112.395 | 2 |
| pma | 94.039 | 94.039 | 103.293 | 108.524 | 125.587 | 2 |
| s1 | 94.134 | 89.653 | 98.465 | 103.426 | 120.362 | 2 |
| s1488 | 90.800 | 87.934 | 98.472 | 109.727 | 109.727 | 2 |
| s1494 | 95.357 | 93.861 | 101.236 | 109.135 | 109.135 | 2 |
| s1a | 96.963 | 105.687 | 103.048 | 106.558 | 106.558 | 2 |
| s208 | 105.231 | 106.000 | 106.825 | 112.874 | 130.136 | 2 |
| s27 | 115.314 | 112.842 | 115.449 | 115.324 | 127.676 | 1 |
| s386 | 103.348 | 105.329 | 107.401 | 108.642 | 120.512 | 1 |
| s420 | 101.378 | 102.249 | 102.514 | 103.961 | 103.961 | 4 |
| s510 | 102.648 | 102.648 | 109.226 | 112.493 | 112.493 | 4 |
| s820 | 93.537 | 93.975 | 102.300 | 107.336 | 107.336 | 1 |
| s832 | 91.117 | 94.001 | 101.354 | 107.563 | 107.563 | 4 |
| s840 | 108.755 | 108.364 | 109.196 | 108.133 | 120.184 | 4 |
| sand | 80.711 | 80.711 | 85.821 | 101.059 | 101.059 | 3 |
| shiftreg | 120.454 | 137.645 | 141.028 | 120.454 | 120.454 | 0 |
| sse | 99.049 | 103.713 | 105.760 | 111.085 | 116.315 | 1 |
| styr | 90.359 | 86.979 | 93.754 | 106.411 | 106.411 | 2 |
| tma | 101.719 | 95.284 | 101.819 | 108.141 | 125.413 | 2 |
| Total | 4918.26 | 4910.30 | 5157.58 | 5301.80 | 5605.42 | |
| Percentage, % | 87.74 | 87.60 | 92.01 | 94.58 | 100.00 | |
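The frequencies in Table 11 correspond to the reciprocals of the minimum cycle times in Table 10: a cycle time of $T$ nanoseconds gives a maximum frequency of $1000/T$ MHz. The sketch below checks this on three entries copied from the two tables. For these entries the recomputed and tabulated values differ only in the third decimal place. The function name `max_frequency_mhz` is illustrative.

```python
def max_frequency_mhz(cycle_time_ns: float) -> float:
    """Convert a minimum cycle time in nanoseconds to a frequency in MHz."""
    return 1000.0 / cycle_time_ns


# (cycle time from Table 10 in ns, frequency from Table 11 in MHz)
samples = [
    (8.811, 113.496),   # bbara, Auto
    (10.096, 99.049),   # bbsse, Auto
    (7.601, 131.556),   # bbara, Our Approach
]

# Absolute difference between the recomputed and the tabulated frequency.
deviations = [abs(max_frequency_mhz(t) - f) for t, f in samples]

for (t, f), d in zip(samples, deviations):
    print(f"{t} ns -> {max_frequency_mhz(t):.3f} MHz (table: {f}, diff {d:.4f})")
```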
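Table 12. Experimental results (the consumed power, Watts).

| Benchmark | Auto | One-Hot | JEDI | $P_T$ | Our Approach | Set |
|---|---|---|---|---|---|---|
| bbara | 0.961 | 0.961 | 0.880 | 0.818 | 0.841 | 1 |
| bbsse | 2.651 | 1.637 | 2.144 | 2.028 | 2.072 | 1 |
| bbtas | 0.900 | 0.900 | 0.900 | 0.900 | 0.900 | 0 |
| beecount | 2.011 | 2.011 | 1.401 | 1.389 | 1.392 | 1 |
| cse | 1.389 | 1.450 | 1.322 | 1.306 | 1.312 | 1 |
| dk14 | 3.339 | 3.710 | 3.332 | 3.301 | 3.321 | 1 |
| dk15 | 1.783 | 2.285 | 1.779 | 1.712 | 1.728 | 1 |
| dk16 | 3.334 | 3.109 | 2.879 | 2.801 | 2.801 | 1 |
| dk17 | 2.268 | 2.302 | 2.258 | 2.286 | 2.286 | 0 |
| dk27 | 1.524 | 1.210 | 1.514 | 1.539 | 1.539 | 0 |
| dk512 | 1.852 | 1.852 | 1.701 | 1.743 | 1.743 | 0 |
| donfile | 1.076 | 1.076 | 0.970 | 0.912 | 0.934 | 1 |
| ex1 | 4.564 | 3.430 | 2.804 | 2.612 | 2.612 | 2 |
| ex2 | 0.735 | 0.753 | 0.709 | 0.698 | 0.712 | 1 |
| ex3 | 0.758 | 0.758 | 0.758 | 0.758 | 0.758 | 0 |
| ex4 | 1.980 | 1.659 | 1.605 | 1.589 | 1.605 | 1 |
| ex5 | 0.754 | 0.754 | 0.752 | 0.765 | 0.765 | 0 |
| ex6 | 2.675 | 4.256 | 2.648 | 2.613 | 2.661 | 1 |
| ex7 | 1.359 | 1.548 | 1.361 | 1.342 | 1.392 | 1 |
| keyb | 1.524 | 1.502 | 1.506 | 1.492 | 1.501 | 1 |
| kirkman | 2.204 | 2.355 | 1.950 | 1.846 | 1.852 | 2 |
| lion | 0.909 | 0.996 | 0.914 | 0.923 | 0.923 | 0 |
| lion9 | 1.100 | 1.337 | 1.095 | 1.102 | 1.102 | 0 |
| mark1 | 1.851 | 1.851 | 1.633 | 1.621 | 1.643 | 1 |
| mc | 0.827 | 0.941 | 0.823 | 0.823 | 0.823 | 0 |
| modulo12 | 0.915 | 0.915 | 0.919 | 0.921 | 0.921 | 0 |
| opus | 1.750 | 1.750 | 1.689 | 1.678 | 1.714 | 1 |
| planet | 4.553 | 4.553 | 2.887 | 2.714 | 2.714 | 2 |
| planet1 | 4.553 | 4.553 | 2.887 | 2.714 | 2.714 | 2 |
| pma | 1.818 | 1.818 | 1.701 | 1.686 | 1.717 | 2 |
| s1 | 3.133 | 3.578 | 2.966 | 2.895 | 2.918 | 2 |
| s1488 | 4.430 | 4.544 | 3.996 | 3.801 | 3.801 | 2 |
| s1494 | 3.527 | 3.626 | 3.430 | 3.396 | 3.396 | 2 |
| s1a | 1.770 | 2.458 | 1.656 | 1.602 | 1.602 | 2 |
| s208 | 1.858 | 3.311 | 1.740 | 1.694 | 1.726 | 2 |
| s27 | 1.148 | 2.342 | 1.157 | 1.114 | 1.143 | 1 |
| s386 | 1.682 | 1.824 | 1.552 | 1.501 | 1.543 | 1 |
| s420 | 1.960 | 3.443 | 1.909 | 1.812 | 1.812 | 4 |
| s510 | 2.166 | 2.166 | 1.714 | 1.643 | 1.643 | 4 |
| s820 | 1.128 | 1.197 | 1.124 | 1.112 | 1.112 | 1 |
| s832 | 2.662 | 2.409 | 2.071 | 1.985 | 1.985 | 4 |
| s840 | 2.704 | 2.695 | 2.436 | 2.243 | 2.315 | 4 |
| sand | 1.640 | 1.640 | 1.479 | 1.401 | 1.401 | 3 |
| shiftreg | 0.879 | 0.959 | 0.868 | 0.879 | 0.879 | 0 |
| sse | 1.651 | 1.727 | 1.520 | 1.503 | 1.521 | 1 |
| styr | 4.506 | 5.233 | 3.649 | 3.598 | 3.598 | 2 |
| tma | 2.020 | 1.745 | 1.752 | 1.711 | 1.763 | 2 |
| Total | 96.78 | 103.13 | 84.74 | 82.52 | 83.162 | |
| Percentage, % | 116.38 | 124.02 | 101.90 | 99.24 | 100.00 | |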
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Barkalov, A.; Titarenko, L.; Krzywicki, K. Using a Double-Core Structure to Reduce the LUT Count in FPGA-Based Mealy FSMs. Electronics 2022, 11, 3089. https://doi.org/10.3390/electronics11193089
