Article

Amazon Employees Resources Access Data Extraction via Clonal Selection Algorithm and Logic Mining Approach

by Nur Ezlin Zamri 1, Mohd. Asyraf Mansor 1,*, Mohd Shareduwan Mohd Kasihmuddin 2, Alyaa Alway 1, Siti Zulaikha Mohd Jamaludin 2 and Shehab Abdulhabib Alzaeemi 2

1 School of Distance Education, Universiti Sains Malaysia, Penang 11800, Malaysia
2 School of Mathematical Sciences, Universiti Sains Malaysia, Penang 11800, Malaysia
* Author to whom correspondence should be addressed.
Entropy 2020, 22(6), 596; https://doi.org/10.3390/e22060596
Submission received: 23 March 2020 / Revised: 16 April 2020 / Accepted: 20 April 2020 / Published: 27 May 2020
(This article belongs to the Special Issue Information-Theoretic Data Mining)

Abstract

Amazon.com Inc. seeks alternative ways to improve its manual system of granting employees resources access, within the field of data science. This work constructs a modified Artificial Neural Network (ANN) by incorporating a Discrete Hopfield Neural Network (DHNN) and the Clonal Selection Algorithm (CSA) with 3-Satisfiability (3-SAT) logic to initiate an Artificial Intelligence (AI) model that executes optimization tasks for industrial data. The selection of 3-SAT logic is vital in data mining to represent the entries of the Amazon Employees Resources Access (AERA) data set via information theory. The proposed model employs CSA to improve the learning phase of DHNN by capitalizing on features of CSA such as the hypermutation and cloning processes. This results in the proposed model, an alternative machine learning model to identify the factors that should be prioritized in the approval of employees resources applications. Subsequently, the reverse analysis method (SATRA) is integrated into the proposed model to extract the relationships among AERA entries based on the logical representation. The study is presented by implementing simulated, benchmark and AERA data sets with multiple performance evaluation metrics. Based on the findings, the proposed model outperformed the other existing methods in AERA data extraction.

1. Introduction

Amazon.com Inc. operates internationally by offering consumer products and subscriptions through more than 10 owned retail websites and physical stores in 600 locations across the United States of America (US). As reported in 2019, the company has an increasing number of employees, with more than 600,000 employees worldwide [1]. Thus, with such a large workforce, there is always a risk of highly complicated employee and resource situations [2]. Within any company, new employees require access to a variety of systems, portals or appliances related to the role, designation or unit of the employee. Technology companies like Amazon.com Inc. provide various types of resources, from computing to storage resources, that are accessible by their employees and should be utilized optimally [3]. However, employees often encounter complications prior to fulfilling their daily tasks; for example, computing resources may lack a Wi-Fi connection, or employees may be unable to log in to the Amazon.com Inc. human resources portal. Commonly, new resources applications are processed and reviewed by distinct human administrators. It is worth mentioning that the downside of this common practice is a chain of human involvement that could lead to a higher cost of resource maintenance and could be time-consuming. Therefore, Amazon.com Inc. made public its historical data from 2010–2011 as the Amazon Employees Resources Access (AERA) data set, provisioned by Ken Montanez from Information Security of Amazon.com Inc. in partnership with Kaggle. Their motive is to seek alternative models that will prioritize the needs of employees and minimize manual resources access applications. A study by [4] proposed a forecasting model using random forest (RF), logistic regression (LR) and gradient boosting (GB). However, the suggested approach was restricted to statistical linear classifiers and required a preprocessing step due to the imbalanced entries of AERA. One may question what makes this experiment significant compared with the work by [4]. In this paper, the main objective is to propose an alternative model in the field of data science by incorporating Artificial Neural Networks (ANNs) with metaheuristics and the Satisfiability (SAT) representation. The proposed model acts as a platform of knowledge extraction to handle big data, which could benefit other big companies such as Walmart Inc., Apple Inc. and Samsung Electronics in resources management.
ANN comprises parallel and nonparallel computing networks inspired by the mechanism of the human biological brain [5]. ANN has several comprehensive architectures of feed-forward or feedback networks. Artificial Intelligence (AI) practitioners have utilized ANN as a platform in applications such as entity classification problems [6], conducting analysis [7,8], pattern recognition [9,10], clustering problems [11,12] and circuits [13,14]. One popular feedback ANN is the Hopfield Neural Network (HNN), which was formulated by [15] to solve optimization tasks. The extensive structure of HNN comprises an energy function and the associative property of content addressable memory (CAM). The work of [16] utilized HNN for transmitting binary amplitude modulated signals based on the potential energy function, yielding a lower probability of error. In addition, the work of [17] emphasized HNN as one of the most studied attractor-memory models due to its useful CAM feature for an optimization model. Note that HNN can be split into the continuous HNN (CHNN) and the discrete HNN (DHNN). The structure of DHNN consists of input and output neurons that store bipolar $\{1, -1\}$ or binary $\{1, 0\}$ patterns [18]. In addition, DHNN utilizes the Lyapunov energy function to determine the degree of convergence of the solution [19]. This paper incorporates the Wan Abdullah (WA) method of finding the synaptic weights by comparing the Lyapunov energy function with the cost function [20]. The core impetus of the presented work is the relevancy of utilizing DHNN as a comprehensive AI model and a platform to solve optimization tasks. Although DHNN is a "black box" model, the best way to observe DHNN behaviour is by implementing a systematic symbolic rule during the learning phase and a retrieval phase equation. Hence, one of the alternative ways to represent information theory is by the concept of satisfiability.
The Satisfiability (SAT) representation is a logical and mathematical knowledge representation that plays a significant role in AI. SAT is utilized in various applications and areas such as quantum chemistry [21], approximation models [22], classification [23], chaos computing [24] and fault detection [25]. The SAT structure consists of clauses comprising literals or variables. Why is SAT needed in DHNN? SAT is essential to provide symbolic instruction in an attempt to represent the output of DHNN. Pioneering work by [26] showed the adaptability of Horn-SAT to represent information in executing the DHNN model, which was later improved by [27]. That work improvised the existing model into a neuro-symbolic integration model that gained more than 90% global minimum energy. However, the Horn formula is limited in representing real-life data sets, which indicates that not all real-life problems can be formulated in Horn-SAT [28]. Therefore, several researchers further extended the fundamentals of Horn-SAT by proposing DHNN models with different k-Satisfiability (k-SAT) logical representations [29,30,31]. These works emphasized utilizing k-SAT, Maximum k-SAT (MAXk-SAT) and Maximum 2-SAT (MAX2-SAT) to investigate the ability of DHNN to process k-SAT patterns. In another development, data mining is a process of recognizing sequences or patterns in real-life data sets that involves various platforms. The difference between data mining and logic mining is that logic mining utilizes logic to convey the information to the end user. Contingent upon that, the earliest logic mining method, the reverse analysis (RA) method, was introduced by [32]; it accommodates the combination of RA and logic programming in DHNN to deduce the patterns and relationships of real-life data sets. Subsequently, [33] built on that work to form a knowledge extraction tool, k-Satisfiability-based Reverse Analysis (k-SATRA). k-SATRA carries an important role in logic mining: it displays the true behaviour or pattern of a real-life data set by extracting the optimum logic that represents the relationships of the attributes. The extracted logic represents information aligned with specific classification tasks. An interesting application of k-SATRA is reported in the work of [34], which investigates students' performance in identifying factors related to underachieving students. The work entrenched several real-life data sets and obtained higher accuracy than two other existing educational data mining methods. Other developments utilizing k-SATRA are [35] and [36], which exhibited the ability of 2-SATRA to extract key findings from online games and football matches. The common denominator of these works is the practicability of k-SATRA in extracting knowledge from a real-life data set; the extracted knowledge identifies relationships of attributes that affect the final outcome. However, no current works create a platform bridging logic and data mining methods with specific optimization tasks such as the one encountered by Amazon.com Inc. of detecting which factors should be prioritized in order to grant or revoke employees resources applications. The incorporation of metaheuristics such as the Clonal Selection Algorithm (CSA) in the training phase would better capitalize on the learning environment for an optimal optimization model.
A metaheuristic algorithm is a nonderivative method that searches for near-optimal solutions under specific constraints. The authors of [37] presented various applications of metaheuristics to find high-quality solutions to an increasing number of ill-defined and complex real-world problems. Metaheuristics have garnered much attention, especially from ANN practitioners, because they provide a better learning mechanism for ANN networks by specifying the searching space of solutions and focusing on gradual solution improvement [38]. Conventionally, DHNN deployed the primitive learning rule of exhaustive search (ES), a trial-and-error mechanism to find solutions [39]. ES increases the probability of overfitting [40] and generates less variation of solutions [41]. CSA is an evolutionary algorithm inspired by the natural phenomenon of the biological immune system, which defends the body against external microorganisms. The authors of [42] reviewed recent works by researchers implementing CSA in their proposed networks to deal with constrained optimization tasks, such as pattern recognition [43], scheduling [44], fault detection [45] and dynamic optimization [46]. The mechanisms of CSA are inspired by specific cells that recognize specific antigens and are later selected to proliferate. This results in a learning algorithm that evolves candidate solutions by selection, cloning and somatic hypermutation procedures, which establish variation of solutions. Conjointly, the mechanism of CSA sets a new paradigm for solving optimization tasks. Pioneering work by [47] introduced affinity-based interaction for a modified CSA as a solver, with the tabu search technique, for the Maximum 3-SAT (MAX3-SAT) problem; the suggested model yielded quality solutions. Therefore, to predict the resources access applications for future sets of employees in AERA, this paper capitalizes on the fundamental DHNN by incorporating CSA in the learning phase to overcome conventional metaheuristic drawbacks. The proposed model is set apart from the previous literature by the different role of CSA, which facilitates the learning phase of DHNN for 3-SAT logic, resulting in a single intelligent unit that incorporates a real-life data set to help Amazon.com Inc. resources access management.
To the best of our knowledge, no current work has proposed the incorporation of DHNN with CSA for the 3-SATRA logic-mining method. An optimal model may result in better management by Amazon.com Inc. in providing the best care for its employees. Subsequently, the contributions of this work are stated as follows: (1) to transform AERA into a 3-SAT logical representation that best represents the relationships within AERA; (2) to construct a modified DHNN model with CSA to enhance the learning phase of DHNN; (3) to utilize the 3-SATRA method in our proposed model as an alternative method to extract information from AERA in the form of a logical representation; (4) to demonstrate the capability of our proposed model on a simulated data set, benchmark data sets and the AERA data set in comparison with other existing methods. The comparison is also evaluated by using appropriate performance evaluation metrics. The findings of this paper display the competency of our proposed model, which outperformed other existing methods for all types of data sets. Figure 1 illustrates the implementation of our contributions in this paper.

2. Boolean Satisfiability

Boolean satisfiability logic (SAT) represents the task of determining truth assignments that make a logical rule satisfiable. SAT is a nondeterministic polynomial-time, NP-complete problem, where SAT can be solved in polynomial time by a nondeterministic Turing machine [48]. In this paper, SAT is represented in conjunctive normal form (CNF) and composed of three significant elements [49]:
  • A group of $m$ variables: $a_1, a_2, \ldots, a_m$, where $a_i \in \{1, -1\}$.
  • A group of literals: a literal is a variable $(a_1)$ or the negation of a variable $(\bar{a}_1)$.
  • A group of $n$ clauses: $A_1 \wedge A_2 \wedge \ldots \wedge A_n$.
The above elements can be explicitly represented in the following Equation (1):
$$\varphi_{3\text{SAT}} = \bigwedge_{i=1}^{n} A_i, \quad \text{where } A_i = (a_i \vee b_i \vee c_i) \quad (1)$$
This paper utilizes the 3-Satisfiability (3-SAT) logical rule, $\varphi_{3\text{SAT}}$, in which each clause contains exactly three variables. Equation (2) gives an example of $\varphi_{3\text{SAT}}$. Note that $\varphi_{3\text{SAT}}$ represents the objective or outcome of the logical rule.
$$\varphi_{3\text{SAT}} = (\bar{P} \vee Q \vee R) \wedge (S \vee \bar{T} \vee \bar{U}) \wedge (\bar{V} \vee W \vee X) \quad (2)$$
Table 1 shows example cases for the $\varphi_{3\text{SAT}}$ logical rule. The outcome of each case is found by substituting the values of $\{1, -1\}$ (neuron states) into Equation (2). For instance, case 1 is satisfiable since each clause gives a truth value. Besides that, case 3 is in full consistency since all literals give a truth value. Additionally, the work by [49] states that the algorithm needs to learn more inconsistent interpretations to obtain a satisfied $\varphi_{3\text{SAT}}$, which we describe as the clause satisfaction checking process. To undergo this process, a suitable metaheuristic algorithm is needed to attain $\varphi_{3\text{SAT}} = 1$ [47]. In this paper, the $\varphi_{3\text{SAT}}$ logical rule is employed in our proposed model to govern the network and represent each entry of AERA.
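To make the clause satisfaction check concrete, the following minimal C++ sketch (our illustration, not code from the paper) evaluates Equation (2) under a bipolar assignment; the signed-index clause encoding and the variable ordering $P, \ldots, X$ are our own assumptions.

```cpp
// Minimal sketch: does a bipolar assignment satisfy Equation (2)?
// Each clause stores three signed variable indices: +k for a_k, -k for its negation.
#include <cstdlib>
#include <iostream>
#include <vector>

using Clause = std::vector<int>;

// A literal +k is true when S[k-1] == 1; a literal -k is true when S[k-1] == -1.
bool clauseSatisfied(const Clause& clause, const std::vector<int>& S) {
    for (int lit : clause) {
        int state = S[std::abs(lit) - 1];
        if ((lit > 0 && state == 1) || (lit < 0 && state == -1)) return true;
    }
    return false;  // all three literals are false
}

bool formulaSatisfied(const std::vector<Clause>& phi, const std::vector<int>& S) {
    for (const Clause& c : phi)
        if (!clauseSatisfied(c, S)) return false;
    return true;
}

int main() {
    // Equation (2): (~P v Q v R) ^ (S v ~T v ~U) ^ (~V v W v X),
    // with variables P..X mapped to indices 1..9.
    std::vector<Clause> phi = {{-1, 2, 3}, {4, -5, -6}, {-7, 8, 9}};
    std::vector<int> assignment = {1, 1, 1, 1, -1, -1, -1, 1, 1};
    std::cout << "phi_3SAT = " << (formulaSatisfied(phi, assignment) ? 1 : -1) << "\n";
}
```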

3. 3-Satisfiability in Discrete Hopfield Neural Network

The Discrete Hopfield Neural Network (DHNN) is a variant of the Hopfield Neural Network that is commonly utilized to solve practical optimization problems [50]. DHNN consists of interconnected neurons with no hidden layer. Each neuron in DHNN is bipolar, $S_i \in \{1, -1\}, \forall i$, which exemplifies the interpretation of the defined problem. Several properties of DHNN include associative memory, fault tolerance and energy minimization as the neuron states change. There are two types of neuron updates in DHNN: asynchronous and synchronous. We limit our discussion to asynchronous updates because we only consider one neuron state at a time. Each neuron spin resembles an Ising spin variable model [51], which contributes to updating neurons in each cycle. The general updating rule of DHNN is given as follows:
$$S_i = \begin{cases} 1, & \text{if } \sum_{j=1}^{N} W_{ij} S_j \geq \rho_i \\ -1, & \text{otherwise} \end{cases} \quad (3)$$
where $W_{ij}$ and $\rho_i$ are the synaptic weight and the threshold of the constraints, respectively. It is worth mentioning that we consider $\rho_i = 0$ to ensure the energy of DHNN decreases uniformly [52]. $W_{ij}$ for each neuron connection is formally defined in a matrix $W^{(2)} = [W_{ij}^{(2)}]_{N \times N}$, with the threshold of the neuron updates given by $[\rho_i]_{n \times 1} = [\rho_1, \rho_2, \rho_3, \ldots, \rho_n]^T$. Note that DHNN has no self-looping, $W_{ii}^{(2)} = W_{jj}^{(2)} = 0$ for all neurons, and the connection is symmetrical, $W_{ij}^{(2)} = W_{ji}^{(2)}$, which results in a matrix with a zero diagonal. The updating rule of the general DHNN is important to ensure that the neuron state converges to the optimal solution. In this section, we capitalize on the logical rule $\varphi_{3\text{SAT}}$ in the structure of DHNN by defining the cost function of the network. $\varphi_{3\text{SAT}}$ can be implemented in DHNN by minimizing the cost function $E_{\varphi_{3\text{SAT}}}$:
$$E_{\varphi_{3\text{SAT}}} = \sum_{i=1}^{N_C} \prod_{j=1}^{N} D_{ij} \quad (4)$$
where $N_C$ and $N$ are the number of clauses and the number of literals, respectively. $D_{ij}$ is defined as follows:
$$D_{ij} = \begin{cases} \frac{1}{2}(1 + S_X), & \text{if } X \to \bar{X} \\ \frac{1}{2}(1 - S_X), & \text{otherwise} \end{cases} \quad (5)$$
where $X$ is any possible variable in $\varphi_{3\text{SAT}}$. Note that the lowest value of the cost function is $E_{\varphi_{3\text{SAT}}} = 0$, where all the inconsistencies of $\varphi_{3\text{SAT}}$ are minimized. Hence, the updating rule, or local field, of $\varphi_{3\text{SAT}}$ in DHNN is given as follows:
$$h_i(t) = \sum_{k=1, k \neq j}^{N} \sum_{j=1, j \neq k}^{N} W_{ijk}^{(3)} S_k S_j + \sum_{j=1, j \neq i}^{N} W_{ij}^{(2)} S_j + W_i^{(1)} \quad (6)$$
$$S_i(t+1) = \begin{cases} 1, & h_i(t) \geq 0 \\ -1, & h_i(t) < 0 \end{cases} \quad (7)$$
where $W_{ijk}^{(3)}$, $W_{ij}^{(2)}$ and $W_i^{(1)}$ are the synaptic weights for the third-, second- and first-order connections, respectively. The threshold for the proposed DHNN is $\rho = 0$ and can be flexibly defined by the user. According to [53], the final neuron state, $S_i(t+1)$, can be optimized by the usage of a squashing function such as the Hyperbolic Tangent Activation Function (HTAF). Interested readers may refer to [49,53,54]. Furthermore, Equations (6) and (7) are vital to ensure that the final neuron state always converges to $E_{\varphi_{3\text{SAT}}} \to 0$. Theorem 1 explains the behaviour of the synaptic weight with respect to the final state of the neuron.
Theorem 1.
Let $N = (W, \rho)$, where $\rho$ is the threshold of the DHNN model. Assume $N$ operates in asynchronous mode and $W$ is a symmetric matrix with nonnegative diagonal elements. Then DHNN will always converge to a stable state.
In addition, the Lyapunov energy function L φ 3 SAT that corresponds to the φ 3 SAT rule is given as follows:
$$L_{\varphi_{3\text{SAT}}} = -\frac{1}{3} \sum_{i=1, i \neq j \neq k}^{N} \sum_{j=1, i \neq j \neq k}^{N} \sum_{k=1, i \neq j \neq k}^{N} W_{ijk}^{(3)} S_i S_j S_k - \frac{1}{2} \sum_{i=1, i \neq j}^{N} \sum_{j=1, i \neq j}^{N} W_{ij}^{(2)} S_i S_j - \sum_{i=1}^{N} W_i^{(1)} S_i \quad (8)$$
The value of $L_{\varphi_{3\text{SAT}}}$ indicates the quality of the final state obtained from Equation (8). According to [20], the synaptic weights of DHNN can be obtained by comparing Equations (4) and (8). The energy value of $\varphi_{3\text{SAT}}$, $L_{\varphi_{3\text{SAT}}}^{\min}$, can be predetermined before the learning phase because the energy value from each clause in $\varphi_{3\text{SAT}}$ is always constant. It is worth mentioning that the optimal DHNN always converges to $L_{\varphi_{3\text{SAT}}} \to L_{\varphi_{3\text{SAT}}}^{\min}$, or $|L_{\varphi_{3\text{SAT}}} - L_{\varphi_{3\text{SAT}}}^{\min}| \leq \lambda$, where $\lambda$ is the tolerance value of the Lyapunov energy function. In this paper, the information from the data set will be represented in terms of $\varphi_{3\text{SAT}}$ and embedded into DHNN. The implementation of $\varphi_{3\text{SAT}}$ in DHNN is abbreviated DHNN-3SAT. One of the main obstacles in implementing DHNN-3SAT is finding a set of $S_i$ that corresponds to $E_{\varphi_{3\text{SAT}}} = 0$. By that standard, an optimal learning method is required to effectively minimize $E_{\varphi_{3\text{SAT}}}$.
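The following sketch illustrates, under the same hypothetical signed-index clause encoding as the earlier snippet, how the cost function of Equations (4) and (5) and the asynchronous update of Equations (6) and (7) can be computed; it is an illustration only, not the authors' implementation, and the weight arrays W3, W2 and W1 are assumed to have been derived beforehand by the WA comparison of Equations (4) and (8).

```cpp
// Illustrative sketch: cost function of Equations (4)-(5) computed directly
// from the clause list, plus one asynchronous sweep of Equations (6)-(7).
#include <cstdlib>
#include <vector>

// E_phi = sum over clauses of the product over literals of D_ij, where
// D_ij = (1 + S_X)/2 for a negated literal and (1 - S_X)/2 otherwise.
double costFunction(const std::vector<std::vector<int>>& phi,
                    const std::vector<int>& S) {
    double E = 0.0;
    for (const auto& clause : phi) {
        double term = 1.0;
        for (int lit : clause) {
            double s = S[std::abs(lit) - 1];
            term *= (lit < 0) ? 0.5 * (1.0 + s) : 0.5 * (1.0 - s);
        }
        E += term;  // term is nonzero only when every literal in the clause fails
    }
    return E;  // E = 0 means every clause of phi_3SAT is satisfied
}

// One asynchronous sweep of Equation (7) with threshold rho = 0.
void asynchronousUpdate(std::vector<int>& S,
                        const std::vector<std::vector<std::vector<double>>>& W3,
                        const std::vector<std::vector<double>>& W2,
                        const std::vector<double>& W1) {
    const int N = static_cast<int>(S.size());
    for (int i = 0; i < N; ++i) {
        double h = W1[i];  // local field h_i(t), Equation (6)
        for (int j = 0; j < N; ++j) {
            if (j == i) continue;
            h += W2[i][j] * S[j];
            for (int k = 0; k < N; ++k)
                if (k != i && k != j) h += W3[i][j][k] * S[k] * S[j];
        }
        S[i] = (h >= 0.0) ? 1 : -1;
    }
}
```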

4. Clonal Selection Algorithm

The learning phase of an ANN can be further improved via metaheuristics to provide more global solutions, a better learning mechanism and to ascertain the convergence of ANN models [55]. A work proposed by [56] indicated that these algorithms require less execution time to complete the training process. Generally, metaheuristics comprise two types of searching algorithms, trajectory-based and population-based. This work focuses on population-based nature-inspired evolutionary algorithms (EA). CSA is a class of Artificial Immune System (AIS) algorithms motivated by the natural immune system process that builds particular antibodies against antigens. B-cells ($\beta$) will produce specific antibodies once a new antigen is identified. Through the cloning process, the chosen $\beta$ will proliferate to form clones of $\beta$ and fight against the antigens [57]. The cloned $\beta$ develop into two types of cells, memory cells and plasma cells. Memory cells are recognized as long-lived cells that can react instantly to any illness. As for the plasma cells, they are active and able to secrete specific antibodies for the antigens, but they do not last long.
The findings by Layeb [47] presented a modified CSA with the tabu search method to resolve the satisfiability problem. The affinity computation in [47] utilized an adaptive affinity function, which considers the summation of weights over the clauses and complies with the binary vector form of the MAX-SAT logical representation. In contrast, our proposed CSA model complies with the bipolar representation of $\varphi_{3\text{SAT}}$, with the affinity function formulated in terms of a clause representation that corresponds to $E_{\varphi_{3\text{SAT}}} = 0$. The operations involved in the CSA mechanism, where $\beta$ produces a specific antibody to destroy a specific antigen, signify the adaptive system of the CSA principle. The proliferation, normalization and somatic hypermutation processes ensure a better variation of the $\beta$ population. This paper implements the CSA mechanism to provide an optimal learning model, where CSA helps to achieve the maximum number of satisfied clauses through the affinity, or fitness, of $\beta$. The implementation of CSA in the proposed model (DHNN3-SATCSA) is presented as follows [58]:
Stage 1: Initialization of β
A population of $\beta = 100$ interpretations was initialized:
$$\beta_{ij} = \begin{cases} 1, & \text{rand}[0,1] \geq 0.5 \\ -1, & \text{otherwise} \end{cases} \quad (9)$$
Stage 2: Affinity Evaluation
Compute the affinity of all $\beta$ in the entire population, $f_{\beta_i}$. The value of $f_{\beta_i}$ counts the number of clauses satisfied in $\varphi_{3\text{SAT}}$: $A_i$ indicates whether clause $i$ is learned by CSA, and $N_C$ is the number of clauses in $\varphi_{3\text{SAT}}$.
$$f_{\beta_i} = \sum_{i=1}^{N_C} A_i \quad (10)$$
where
$$A_i = \begin{cases} 1, & \text{True} \\ 0, & \text{False} \end{cases} \quad (11)$$
given that $f_{\beta_i} \in [0, N_C]$.
Stage 3: Proliferation via Cloning
The top five $\beta$ with the highest affinity are chosen to proliferate in the cloning process. In this process, $\beta$ will be duplicated by applying the roulette wheel mechanism [59]. The number of cloned $\beta$, $N_\varepsilon$, is computed using Equation (12):
$$N_\varepsilon = \frac{\lambda f_{\beta_i}}{\sum_{i=1}^{N_C} f_{\beta_i}} \quad (12)$$
where $\lambda$, known as the initial affinity, is the population clone size that the program seeks to implement in the searching space. [58] suggested selecting $\lambda = 200$.
Stage 4: Normalization
Equation (13) shows the list of cloned $\beta$, where $1 \leq i \leq N_\varepsilon$:
$$\beta_i^C = \{\beta_1^C, \beta_2^C, \beta_3^C, \ldots, \beta_{N_\varepsilon}^C\} \quad (13)$$
The normalization of $\beta_i^C$, denoted $\tilde{\beta}_i^C$, is often called immune response maturation throughout the system. It is important to normalize $\beta_i^C$ before proceeding to the next step. Next, we calculate the affinity for each $\tilde{\beta}_i^C$, abbreviated $f_{\tilde{\beta}_i^C}$:
$$f_{\tilde{\beta}_i^C} = \frac{f_{\beta_i^C} - \min|f_{\beta_i^C}|}{\max|f_{\beta_i^C}| - \min|f_{\beta_i^C}|} \quad (14)$$
where
$$\max|f_{\beta_i^C}| - \min|f_{\beta_i^C}| > 0 \quad (15)$$
Note that $\max|f_{\beta_i^C}| \neq \min|f_{\beta_i^C}|$ because the probability of getting $f_{\beta_i^C} = 0$ is almost zero.
Stage 5: Somatic Hypermutation
The somatic hypermutation process is significant since it ensures that $\beta$ achieves the highest affinity, which results in a feasible solution. Equation (16) shows the calculation of the number of mutations, $N_\zeta$, for each $\tilde{\beta}_i^C$:
$$N_\zeta = (f_{\tilde{\beta}_i^C} \cdot \eta) + \theta(1 - f_{\tilde{\beta}_i^C}) \quad (16)$$
where $\eta$ is the number of variables in $\varphi_{3\text{SAT}}$, $\theta = 0.01$ and $\eta \neq 0$ [58]. For every mutation that occurs in $N_\zeta$, one or more bits of $\beta$ will be flipped from 1 to −1 or vice versa.
Finally, the $f_{\beta_i}$ of the mature population will be computed, and the best $\beta$ will be chosen as the candidate cell to be kept in the memory cell. A solution is selected if $f_{\beta_i^C} = N_C$; otherwise, the process repeats from Stages 2 to 5 while $f_{\beta_i^C} \neq N_C$. Figure 2 shows the summary of all the steps involved in CSA.
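To show how Stages 1 to 5 fit together, the following condensed sketch (our illustration, not the authors' program) evolves a population under the parameter values quoted in the text ($\beta = 100$, $\lambda = 200$). The clause encoding reuses the earlier hypothetical signed-index convention, the normalization is simplified to $f/N_C$, and the mutation count is a simplified inverse-affinity rule standing in for Equation (16); the loop runs until a fully satisfying string is found, matching the nonrestricted learning environment described later.

```cpp
// Condensed CSA sketch for the DHNN learning phase (illustrative only).
#include <algorithm>
#include <random>
#include <vector>

// Number of satisfied clauses f_beta (Equation (10)) under the
// signed-literal clause encoding used in the earlier snippets.
static int clauseCount(const std::vector<std::vector<int>>& phi,
                       const std::vector<int>& b) {
    int f = 0;
    for (const auto& clause : phi)
        for (int lit : clause) {
            int s = b[(lit > 0 ? lit : -lit) - 1];
            if ((lit > 0 && s == 1) || (lit < 0 && s == -1)) { ++f; break; }
        }
    return f;
}

std::vector<int> clonalSelection(const std::vector<std::vector<int>>& phi,
                                 int numVars, std::mt19937& rng) {
    std::uniform_real_distribution<double> U(0.0, 1.0);
    std::uniform_int_distribution<int> pick(0, numVars - 1);
    const int POP = 100, TOP = 5, LAMBDA = 200;
    const int NC = static_cast<int>(phi.size());

    // Stage 1: random bipolar initialization (Equation (9)).
    std::vector<std::vector<int>> pop(POP, std::vector<int>(numVars));
    for (auto& b : pop)
        for (int& s : b) s = (U(rng) >= 0.5) ? 1 : -1;

    while (true) {
        // Stage 2: affinity evaluation; sort so the fittest come first.
        std::sort(pop.begin(), pop.end(),
                  [&phi](const std::vector<int>& a, const std::vector<int>& b) {
                      return clauseCount(phi, a) > clauseCount(phi, b);
                  });
        if (clauseCount(phi, pop[0]) == NC) return pop[0];  // f = N_C reached

        // Stage 3: proliferation; clone the TOP strings in proportion to affinity.
        std::vector<std::vector<int>> next;
        double fSum = 0.0;
        for (int i = 0; i < TOP; ++i) fSum += clauseCount(phi, pop[i]);
        for (int i = 0; i < TOP; ++i) {
            int nClones = std::max(1, static_cast<int>(LAMBDA * clauseCount(phi, pop[i]) / fSum));
            next.insert(next.end(), nClones, pop[i]);
        }

        // Stages 4-5: normalized affinity (simplified to f/NC) drives somatic
        // hypermutation; weaker clones receive more bit flips.
        for (auto& c : next) {
            double fNorm = static_cast<double>(clauseCount(phi, c)) / NC;
            int nFlips = 1 + static_cast<int>((1.0 - fNorm) * numVars);
            for (int m = 0; m < nFlips; ++m) {
                int pos = pick(rng);
                c[pos] = -c[pos];
            }
        }
        // Refill the population to POP with fresh random strings.
        while (static_cast<int>(next.size()) < POP) {
            std::vector<int> b(numVars);
            for (int& s : b) s = (U(rng) >= 0.5) ? 1 : -1;
            next.push_back(b);
        }
        next.resize(POP);
        pop.swap(next);
    }
}
```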

5. 3-Satisfiability Based Reverse Analysis Method

Logic mining is a process that utilizes logic programming to extract information from a data set. In this regard, this section explains how the logic mining tool named the 3-Satisfiability-based Reverse Analysis Method (3-SATRA) is implemented in our DHNN3-SATCSA model to extract the relationships of AERA entries. Consider $n$ attributes of the data set, $(S_1, S_2, S_3, \ldots, S_n)$, where $S_i \in \{1, -1\}$. Note that all binary representations must be converted to bipolar states. Since this paper investigates $\varphi_{3\text{SAT}}$, the arrangement of each $A_m$ consists of $S_i, S_j, S_k$, where $i \neq j \neq k$. Note that $N_C$ is the number of clauses in $\varphi_{3\text{SAT}}$. For $A_m$ that leads to $\varphi_{3\text{SAT}}^{\text{learn}} = 1$, we assign
$$A_m = \left(S_i^{\max[n(S_i)]} \vee S_j^{\max[n(S_j)]} \vee S_k^{\max[n(S_k)]}\right) \quad (17)$$
Note that $\max[n(S_i)]$ signifies the highest-frequency state of $S_i$. In this case, each $S_i$ is given as follows:
$$S_i = \begin{cases} S_i, & S_i = 1 \\ \bar{S}_i, & S_i = -1 \end{cases} \quad (18)$$
By using the obtained $A_m$, we can formulate $\varphi_{3\text{SAT}}^{\text{best}}$:
$$\varphi_{3\text{SAT}}^{\text{best}} = \bigwedge_{m=1}^{N_C} A_m \quad (19)$$
For example, we choose $A_1 = (S_1 \vee \bar{S}_2 \vee \bar{S}_3)$ if $S_1^{\max[n(S_1)]} = S_1$, $S_2^{\max[n(S_2)]} = \bar{S}_2$ and $S_3^{\max[n(S_3)]} = \bar{S}_3$. Next, $\varphi_{3\text{SAT}}^{\text{best}}$ is embedded into DHNN. Henceforth, we obtain the states of $S_i$ that correspond to $E_{\varphi_{3\text{SAT}}^{\text{best}}} = 0$. By comparing Equation (4) with Equation (8), the corresponding $W_{ij}$ is obtained. During the testing phase, the induced states, $S_i^B$, are obtained by using Equation (6). Subsequently, the induced logic, $\varphi_i^B$, is constructed based on the rule given in Equation (2). Finally, the chosen induced logic is obtained based on $\varphi_i^B = \varphi_i^{\text{test}}$ (testing data). Figure 3 demonstrates how 3-SATRA is implemented in the DHNN model. In this paper, we represent each neuron with an entry of AERA.
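A minimal sketch of the $\varphi_{3\text{SAT}}^{\text{best}}$ construction of Equations (17)–(19) follows, assuming the same signed-index clause encoding as before and a fixed grouping of nine attributes into three clauses; the row format and helper names are hypothetical.

```cpp
// Sketch: build phi_best from learning rows, per Equations (17)-(19).
#include <vector>

// For attribute i, choose S_i if state 1 dominates the rows that satisfy
// phi_3SAT during learning, and the negation otherwise (Equation (18)).
// The returned formula stores signed indices: +k for S_k, -k for ~S_k.
std::vector<std::vector<int>> buildBestLogic(
        const std::vector<std::vector<int>>& learnRows, int numAttr) {
    std::vector<int> plusCount(numAttr, 0);
    for (const auto& row : learnRows)
        for (int i = 0; i < numAttr; ++i)
            if (row[i] == 1) ++plusCount[i];

    std::vector<std::vector<int>> phiBest;
    for (int i = 0; i < numAttr; i += 3) {   // three attributes per clause A_m
        std::vector<int> clause;
        for (int j = i; j < i + 3; ++j) {
            bool plusDominates =
                2 * plusCount[j] >= static_cast<int>(learnRows.size());
            clause.push_back(plusDominates ? (j + 1) : -(j + 1));
        }
        phiBest.push_back(clause);           // A_m, Equation (17)
    }
    return phiBest;                          // conjunction of A_m, Equation (19)
}
```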

6. Experimental Setup

A standard procedure among ANN practitioners is to investigate the proposed model against other comparative studies. Therefore, the simulation process is divided into three sections. Firstly, the performance of DHNN3-SATCSA is analyzed by using simulated data sets; in this case, the ability of CSA in the learning phase of the proposed model will be compared with other existing methods [58,60]. Secondly, several benchmark data sets will be implemented in DHNN3-SATCSA. The comparison of the retrieval properties of DHNN3-SATCSA will also be evaluated based on $\varphi_i^B = \varphi_i^{\text{test}}$. The third section presents the implementation of AERA in the proposed model. All real-life data sets were converted to bipolar representation, and information extraction was conducted via 3-SATRA incorporated with the DHNN3-SAT models.
In the first section, a DHNN with linearized initial neuron states might result in a biased retrieval state, because the network simply memorizes the final state without producing a new state [61]. Therefore, possible positive and negative biases can be reduced by generating all the neuron states randomly as in Equation (20):
$$S_i(t) = \begin{cases} 1, & \text{rand}[0,1] < 0.5 \\ -1, & \text{otherwise} \end{cases} \quad (20)$$
where $S_i$ is defined as in Equation (3). The simulated data set is initiated by generating randomized clauses and literals for each $\varphi_{3\text{SAT}}$. A similar approach to generating the initial neuron states has been implemented in several studies, such as [19,60]. It is worth mentioning that all simulations will be measured against existing methods by evaluating appropriate performance evaluation metrics. Following several relevant studies that implemented such experimentation [35,49,62], the proposed performance metrics in this experiment are the mean absolute error (MAE), sum of squared errors (SSE), global minima ratio ($\omega$), accuracy in percentage ($\alpha$) and computational time in the SI unit of seconds ($CT$). According to [63], MAE computes the average absolute error of the fitness during the learning phase of our proposed model. The formulation of MAE is as follows:
$$MAE = \sum_{i=1}^{n} \frac{1}{n} |\tau - \upsilon| \quad (21)$$
where $\tau$ and $\upsilon$ are the total number of clauses and the number of satisfied clauses in $\varphi_{3\text{SAT}}$, respectively. In relation to Equation (21), the accumulation of errors in each model can also be effectively evaluated by using SSE. The formulation of SSE is described by the following equation:
$$SSE = \sum_{i=1}^{n} (\upsilon - \tau)^2 \quad (22)$$
On the other hand, we examine the final neuron states of the proposed model via $\omega$:
$$\omega = \frac{1}{ab} \sum_{i}^{n} N_{L_{\varphi_{3\text{SAT}}}} \quad (23)$$
According to [52], if the final neuron state of the proposed model satisfies $E_{\varphi_{3\text{SAT}}} \to 0$, the model will tend to $\omega \to 1$. Hence, the best model attains the lowest values of MAE and SSE with $\omega \to 1$. Notably, $\omega \to 1$ indicates $n(|L_{\varphi_{3\text{SAT}}}^{\min} - L_{\varphi_{3\text{SAT}}}| \leq \lambda) \to ab$, where $a$ and $b$ are the number of trials and neuron combinations, respectively. In another development, we utilize the values of $\alpha$ and $CT$ to investigate the effectiveness and efficiency of 3-SATRA in the testing phase of DHNN3-SATCSA. We describe two formulations:
$$\alpha = \frac{N_{\varphi_i^B}}{N_{\varphi_i^{\text{test}}}} \times 100\% \quad (24)$$
$$CT = \text{Learning Time (s)} + \text{Retrieval Time (s)} \quad (25)$$
Note that $\alpha \to 100$ if $n(\varphi_i^{\text{test}} = \varphi_i^B) \to G$, where $G$ in our case corresponds to the 40% of instances in a data set. In practice, the best model requires $\alpha \to 100$ and the minimum value of $CT$. Learning time and retrieval time denote the total time executed by the DHNN3-SAT models in the learning phase and retrieval phase, respectively. Table 2 and Table 3 list the parameters involved in the hybrid Hopfield Neural Network with Exhaustive Search (DHNN3-SATES) and the hybrid model with the Clonal Selection Algorithm (DHNN3-SATCSA), respectively.
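As a concrete illustration of the metrics above, the following hedged sketch computes Equations (21), (22) and (24); the per-iteration vectors tau and upsilon and the matched-instance counts are assumed inputs, not quantities defined by the paper's code.

```cpp
// Sketch of the learning-error metrics (Equations (21)-(22)) and testing
// accuracy (Equation (24)).
#include <cstdlib>
#include <vector>

// MAE over n learning iterations, where tau[i] is the total number of
// clauses and upsilon[i] the number of satisfied clauses at iteration i.
double meanAbsoluteError(const std::vector<int>& tau,
                         const std::vector<int>& upsilon) {
    double sum = 0.0;
    for (std::size_t i = 0; i < tau.size(); ++i)
        sum += std::abs(tau[i] - upsilon[i]);
    return sum / static_cast<double>(tau.size());
}

// SSE accumulates the squared clause-satisfaction shortfall per iteration.
double sumSquaredError(const std::vector<int>& tau,
                       const std::vector<int>& upsilon) {
    double sum = 0.0;
    for (std::size_t i = 0; i < tau.size(); ++i) {
        double d = static_cast<double>(upsilon[i]) - tau[i];
        sum += d * d;
    }
    return sum;
}

// Testing accuracy: share of test instances whose outcome is matched by the
// induced logic, in percent.
double accuracy(int matchedInstances, int testInstances) {
    return 100.0 * matchedInstances / testInstances;
}
```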
The choice of $\beta$ is important, as a large population size requires a large searching space of solutions, which may increase the computational cost; on the other hand, a small $\beta$ can lead to local minima solutions. According to [64], we should choose $\beta = 100$, as it has repeatedly been shown to achieve good results. The general implementation of the proposed model on a simulated data set is summarized in Figure 4. 3-SATRA is implemented to show the level of connectedness between $W_{ij}$ and the neurons. Overall, simulated and real-life data sets will be implemented in DHNN3-SATCSA. The computational simulation for both data sets was conducted in Dev C++ Version 5.11 for Windows 7, with 2 GB RAM and an Intel Core i3 processor. As for the simulated data set, the Dev C++ program generates the initial bipolar data randomly. Throughout the simulations, the same device is used to avoid any bias. On the whole, all simulations are run with different numbers of neurons ($NN$), within the bound of not exceeding the threshold time of 24 h [35]. Note that the proposed model randomly selects nine attributes for the real-life data set, as well as their arrangement in the $\varphi_{3\text{SAT}}$ logical rule.

7. Results and Discussion

7.1. Simulated Data Set

The first section of the experiment was carried out by using simulated data. This section evaluates the performance of CSA as the learning rule in the DHNN model in comparison with ES. The findings for the simulated data set for both models are presented as follows.
According to Figure 5 and Figure 6, DHNN3-SATCSA accumulated fewer errors compared to DHNN3-SATES due to CSA's ability to learn and train the network effectively. In contrast, ES incorporates random search, which increases the complexity of the learning phase. As illustrated in Figure 7, DHNN3-SATCSA achieved a consistent value of $\omega = 1$ from $NN = 9$ to $NN = 72$, whereas DHNN3-SATES only gained a better value of $\omega$ after processing 62.5% of the total $NN$. ES makes unnecessary projections due to its trial-and-error feature, which does not help the proposed model improve the solutions. CSA can manage a large number of constraints compared to ES; CSA makes this possible by showcasing the ability of $\beta$ in fighting the pathogens and improving the affinity values over the entire bit strings, helping DHNN3-SATCSA search for ideal solutions. In this experiment, we did not consider $\alpha$ because the value of $\omega$ corresponds to the number of global minimum energies achieved by the DHNN3-SAT models; hence, the value of $\omega$ is adequate to represent the effectiveness of the retrieval phase of both models. The main distinction between these models and [47] is the formulation of the fitness function: the cost function in [47] is $E_{\varphi_{3\text{SAT}}} \neq 0$ because the structure of $\varphi_{3\text{SAT}}$ there is not satisfiable. CSA reduces the number of iterations because the CSA optimization operators, particularly somatic hypermutation, allow the solution to attain $E_{\varphi_{3\text{SAT}}} = 0$ faster than ES. In general, CSA will reduce the learning time, which will elongate the relaxation time within the ideal rate and, we believe, result in less neuron oscillation. It is worth noting that the probability of somatic hypermutation flipping the neurons entirely approaches 0; thus, the chance of the solution achieving nonimproving fitness is reduced drastically compared to conventional ES. The Wan Abdullah method is chosen because it is reported to contribute less neuron oscillation compared to other methods such as Hebbian learning [65]. Uncontrollable neuron oscillation via other methods such as Hebbian learning will lead to more local minimum energies, or $|L_{\varphi_{3\text{SAT}}}^{\min} - L_{\varphi_{3\text{SAT}}}| > \lambda$. This comparison is vital to validate the learning capabilities of CSA. A limitation of DHNN3-SATCSA is the use of bipolar neuron states instead of another neuron representation such as the ternary one, $S_i \in \{1, -1, 0\}$. The ternary representation can provide more analysis since it has another vector, 0, which indicates no response or a meaningless result. From another perspective, the proposed model only considers satisfiable SAT logic; other SAT representations such as MAXk-SAT [18] require major restructuring, especially in terms of logical redundancy. Furthermore, this experiment only employs a nonrestricted learning environment, where CSA and ES iterate until $E_{\varphi_{3\text{SAT}}} = 0$. Finally, this work only embeds $\varphi_{3\text{SAT}}$ in CNF form; according to [66], the CNF representation is more compatible with the WA method than the Disjunctive Normal Form (DNF) representation.

7.2. Benchmark Data Sets

For the second part of the experiment, the simulation is carried out over a set of four widely used benchmark real-life data sets [67], listed in Table 4. Note that this section evaluates the performance of the DHNN-3SAT models in handling real-life data sets. The benchmark data sets are reported in this paper because these structured data sets validate the performance of the DHNN-3SAT models.
All attributes (consisting of nine literals for each data set) listed in Table 4 will be embedded into 3-SATRA using the implementation in Figure 3. We chose data sets from different disciplines because each data set has a different clustering behaviour. The objective of each DHNN3-SAT model in this section is to induce the best $\varphi_i^B$ that classifies the outcome of the data sets. In general, the choice of outcome for each data set is given as follows:
  • BDMC: whether a client will subscribe to a term deposit, where 1 and −1 signify nonsubscription and subscription, respectively.
  • CCDP: response to default payment of credit card customers, where 1 and −1 signify nonpaymaster and paymaster, respectively.
  • DRDD: signs of diabetic retinopathy, where 1 and −1 signify that the sign exists and does not exist, respectively.
  • FLST: customer interest, where 1 and −1 signify noninterest and interest towards the product, respectively.
In this section, we only evaluate the performance of the induced $\varphi_i^B$ and disregard the result from the learning error. The instances of the data sets are divided into $\varphi_i^{\text{learn}}$ (60%) and $\varphi_i^{\text{test}}$ (40%), following the procedure of the logic-mining model proposed by Kho [35]. We found that allocating more data for learning than the proposed proportion results in data overfitting. Thus, the best DHNN-3SAT model is measured based on the highest value of $\alpha$.
The values of $\alpha$ for all models are shown in Figure 8, Figure 9, Figure 10 and Figure 11. A higher value of $\alpha$ indicates the optimality of the model in retrieving $\varphi_i^B$. The results of the analyses discussed in Figure 8, Figure 9, Figure 10 and Figure 11 are all based on the assumption that $\varphi_i^B$ for all data sets achieved $\omega = 1$. According to Figure 8, both models demonstrate the same maximum value of $\alpha = 74\%$ at $NN = 72$ on the BDMC data set. Despite the similar value of $\alpha$ for both models at $NN = 72$, DHNN-3SATES reported a higher learning error than DHNN-3SATCSA. The small value of $\alpha$ for DHNN-3SATCSA at $10 \leq NN \leq 60$ is due to overfitting of the solution during the retrieval phase of DHNN-3SAT. As seen in Figure 9, the overall trend of $\alpha$ is distinct, where the two models achieved consistent values of 77% and 36%, respectively, for all $NN$. In Figure 10 and Figure 11, the proposed DHNN-3SATCSA is reported to be less effective when $NN$ is small, although $\alpha$ reached its maximum value at $NN \geq 15$. Despite a similar value of $\alpha$ for both models in Figure 11, DHNN-3SATCSA achieved this $\alpha$ with a lower learning error. As observed in Figure 8, Figure 9, Figure 10 and Figure 11, the proposed DHNN-3SATCSA in 3-SATRA exhibits competitive performance with respect to the learning error and $\alpha$. The innovation of DHNN-3SATCSA lies in the solution diversity of $\beta$, which prevents CSA from getting trapped in local minima energy. In this case, a promising $\beta$ will be improved via the hypermutation strategy during the learning phase of DHNN. In contrast, DHNN-3SATES has no optimization layer and in most cases contributes a suboptimal $\varphi_i^B$ (see Figure 9). We expect that DHNN-3SATES will exceed the threshold computational time when $NN > 88$ due to the structural limitation of ES. Hence, we can further agree that DHNN3-SATCSA is generally a better model in terms of $\alpha$ and in the capability of its mechanism to handle different sizes and natures of real-life data sets. We can further improve the retrieval property of DHNN-3SATCSA by implementing a mutation operator such as in [19].
Table 5 extends the experiment by comparing the benchmark data sets with other existing methods, comprising conventional statistical methods such as the decision tree (DT), naïve Bayes (NB) and support vector machine (SVM). The work of [68] utilized BDMC to predict successful direct marketing campaigns that lead customers to subscribe to a term deposit plan by using DT analysis; our proposed model achieved a better $\alpha$, with a difference of 12.73%. On the other hand, the $\alpha$ attained by [69] for CCDP is substantially lower than that of our proposed model; [69] applied an NB classifier to provide information for the risk management of handling customers with credit risks. The work in [70] applies SVM analysis with a confusion matrix to accentuate feature selection and classification. However, the $\alpha$ gained by the SVM method is 25.3% lower than that of DHNN3-SATCSA. As for FLST, there is no comparable recent work that utilizes this data set.
Note that the proposed model does not consider the effect of attribute permutation. This straightforward $\varphi_i^{\text{learn}}$ implementation helps us to effectively determine the attributes in the induced logic $\varphi_i^B$ whenever we convert it to another logic programming form. It is worth mentioning that this simulation only considers attributes that lead to $\varphi_{3\text{SAT}}^{\text{learn}} = 1$, because the proposed model aims to minimize $E_{\varphi_{3\text{SAT}}}$ to 0. Since there are no redundant attributes in 3-SATRA, the satisfiability aspect of $\varphi_i^B$ can be guaranteed; otherwise, the structure of $\varphi_i^{\text{learn}}$ would have to be modified into a nonsatisfiability logic such as maximum satisfiability [58]. By that standard, we expect that DHNN-3SATCSA would outperform DHNN-3SATES if $E_{\varphi_{3\text{SAT}}} \neq 0$ were considered in 3-SATRA. In addition, the proposed DHNN-3SAT model does not consider a noise function such as in the work of [19]. The result from this section is important, as the $\varphi_i^B$ can be easily analysed by practitioners, as compared to relying entirely on error analysis. Building on our findings for the simulated and benchmark data sets, we further test the competency of DHNN3-SATCSA in entrenching AERA by analysing several performance evaluation metrics.

7.3. Amazon Employees Resources Access Data Set

7.3.1. Performance of DHNN3-SAT in Learning and Testing Phase

From the previous section, we can conclude that the proposed model is suitable for implementation on AERA. Therefore, this section investigates the behaviour of DHNN3-SATCSA in analysing AERA for the benefit of Amazon.com Inc. Both models utilize the 3-SATRA method; however, our main contribution is to investigate the capability of CSA to enhance the learning mechanism of DHNN and thus ensure an optimal learning environment. Relative to the experiment, the key findings of the attained $\varphi_{\text{best}}^B$ will also be presented in this section.
In Figure 12 and Figure 13, both the MAE and SSE of DHNN3-SATCSA attain consistent error values approaching 0, whereas, for DHNN3-SATES, the errors gradually increase. Particularly for DHNN3-SATCSA, the lower accumulation of errors is due to the CSA mechanism improving the quality of solutions in order to attain $E_{\varphi_{3\text{SAT}}} = 0$. However, ES generates larger errors because the ES mechanism is only effective for low $NN$. We illustrate the capability of the retrieval properties of the DHNN3-SAT models based on Figure 14 and Figure 15. Overall, the value of $\alpha$ obtained by DHNN3-SATCSA is higher by at most 3% compared to DHNN3-SATES. We also compare with the $\alpha$ obtained by other existing work such as [4], which also utilized AERA with conventional statistical methods such as LR, GB and RF. Due to the imbalanced entries of AERA, that work described the effort of constructing a prediction model by trying single models on the categorical data, subsequently improved by introducing various modified decision tree methods to finally obtain the desired $\alpha$. A summary of the $\alpha$ achieved by DHNN3-SATCSA and all comparative methods is shown in Table 6. From $NN = 9$ up to $NN = 36$, the $CT$ recorded for both models has a similar rate. However, from $NN = 45$ onwards, DHNN3-SATES requires more $CT$. The apparent reason why DHNN3-SATES needs more time than DHNN3-SATCSA is that the ES mechanism causes the entire bit string of the logical rule to collapse when any of the clauses is not satisfied, so more iterations are required to produce a plausible solution. This is unlike CSA's ability to minimize iterations in the completion of the learning process due to its optimization operators [64].

7.3.2. Key Findings of $\varphi_{\text{best}}^B$

Equation (26) shows the $\varphi_i^B$ attained at the highest $\alpha$ ($\varphi_{\text{best}}^B$) by DHNN3-SATCSA. The generated $\varphi_{\text{best}}^B$ will help Amazon.com Inc. identify insignificant factors and thereby improve its human resources management. Table 7 shows the details of AERA utilized in this experiment.
$$\varphi_{\text{best}}^B = (P \vee \bar{Q} \vee R) \wedge (\bar{S} \vee T \vee \bar{U}) \wedge (\bar{V} \vee W \vee \bar{X}) \quad (26)$$
Equation (26) gives information on which attributes carry a trivial role in Amazon.com Inc. employees resources applications. We recognize the negated literals in Equation (26) as factors that do not affect the problem faced by Amazon.com Inc. For example, $\bar{Q}$ indicates that a manager's role in granting a resources application is unnecessary; it is believed to add more pointless human administration in solving employees' complications regarding their resources. In addition, attributes like $P$ will influence the application process, as the availability of resources needs to be known to meet the needs of all employees; $P$ also provides resources information to other departments, like the operations and maintenance departments, to manage defective equipment and appliances. $R$ and $\bar{S}$ are correlated; however, they clearly show that the different major levels of management, such as top-level, middle-level and first-level, are crucial in deciding which resources are first in line. Examples of roles related to $R$ and $\bar{S}$ are engineers and retailers, respectively; thus, from Equation (26), we can conclude that engineers should be prioritized over retailers as Amazon.com Inc. emerges as a well-known tech giant. Furthermore, Amazon.com Inc. should prioritize $T$ to decide which departments are more vital and need new resources to accomplish their tasks in the company. Amazon.com Inc. has to underline certain standards to maintain the quality of work from the departments that hold a greater role in the core business of Amazon.com Inc. Consequently, a factor like $\bar{U}$ clearly shows that it is insignificant to consider the business title of an employee in order to grant or revoke an employee's resources application; the top management of Amazon.com Inc. could instead develop business personnel in value-added business duties towards other employees. The attributes $\bar{V}$ and $W$ are related to one another; the difference is in the specification of an employee's role, where $\bar{V}$ is the extended version or additional role given to an employee. By referring to the attained $\varphi_{\text{best}}^B$, we can deduce that Amazon.com Inc. should only consider an employee's main role in the company to prioritize the resources applications. Attribute $\bar{X}$ is only essential when Amazon.com Inc. checks which role is open to vacancy and does not affect the resources management of Amazon.com Inc. much.
In line with the no free lunch theorem [72], it is impractical to propose a specific algorithm or model that claims to solve all real-life applications. Thus, new developments in improving metaheuristics and optimization models are continuously needed to handle particular optimization tasks. This work focused on DHNN3-SATCSA transforming AERA into a 3-SAT logic representation, with 3-SATRA generating the optimum $\varphi_{\text{best}}^B$ to extract information from AERA. On the other hand, [73] reports that the computational time of the CSA mechanism may become longer because the number of affinity evaluations increases as the population of $\beta$ increases. Nonetheless, the $\varphi_{\text{best}}^B$ attained from DHNN3-SATCSA may provide Amazon.com Inc. an alternative model to predict the resources applications of future sets of employees. Furthermore, DHNN3-SATCSA could be tested on other types of optimization problems from other companies, such as Walmart's efforts to reduce food waste through distribution processes or Ikea's attempts to scale up their system of product fault detection. The implementation of DHNN3-SATCSA will provide beneficial information to a company that wishes to know which factors are more significant than others, which could lead to better control and management of its production.

8. Conclusions

In conclusion, we believe the findings of this study will broaden fundamental optimization methods, such as statistical methods or conventional evolutionary algorithms. In this experiment, the incorporation of 3-SAT in DHNN was crucial to exhibit the relationships and behaviour of AERA symbolically. In addition, 3-SATRA was developed in this study to extract information from AERA, despite its large size and imbalanced entries. Subsequently, 3-SATRA is vital to generate induced logics, which displayed the insignificant factors in AERA that lead to the problem faced by Amazon.com Inc. Also, the construction of our modified DHNN3-SAT model integrated with a modified CSA was revealed to be useful in improving the traditional learning phase of DHNN. In addition, we demonstrated the competency of our hybrid DHNN model, DHNN3-SATCSA, by entrenching three different data sets: simulated, benchmark and AERA, in comparison with other existing methods. The comparative investigation was executed by employing various performance evaluation metrics, and the findings showed that DHNN3-SATCSA outperformed the other existing methods. In order to construct a possible model that can cater to all optimization tasks, further improvement of the proposed model could be made to its performance and mechanism by implementing a mutation feature in the testing phase of DHNN. Therefore, the exploration of the testing phase in DHNN is worthy of attention, alongside future research addressing the variability of implementing other algorithms to enhance the mechanism of modified DHNN models.

Author Contributions

Conceptualization, methodology, resources, A.A.; validation, writing—original draft preparation, project administration, N.E.Z.; formal analysis, M.S.M.K.; writing—review and editing, S.Z.M.J.; visualization, S.A.A.; funding acquisition, M.A.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Fundamental Research Grant Scheme (FRGS), Ministry of Education Malaysia, grant number 203/PJJAUH/6711751 and Universiti Sains Malaysia.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Clement, J. Number of Amazon.com Employees from 2007 to 2019. Available online: https://www.statista.com/statistics/234488/number-of-amazon-employees/ (accessed on 9 April 2020).
  2. Marques, P.H.; Atouguia, J.; Marques, F.H.; Palhais, C.; Pinto, A.R.; Silva, L.A. (Eds.) A study of associations of occupational accidents to number of employees, and to hours worked. In Occupational Safety and Hygiene II; CRC: London, UK, 2014; pp. 663–667. [Google Scholar]
  3. Gupta, A.K.; Singhal, A. Managing human resources for innovation and creativity. Res.-Technol. Manag. 1993, 36, 41–48. [Google Scholar] [CrossRef]
  4. Tang, S.; Han, J.B.; Zhang, Y. Amazon Employee Access Control System. Available online: https://www.semanticscholar.org/paper/Amazon-Employee-Access-Control-System-Tang-Han/e419470b5d9808a8a8bfee41f4e7d8ffff15eb53 (accessed on 27 January 2020).
  5. Gonzalez-Fernandez, I.; Iglesias-Otero, M.A.; Esteki, M.; Moldes, O.A.; Mejuto, J.C.; Simal-Gandara, J. A critical review on the use of artificial neural networks in olive oil production, characterization and authentication. Crit. Rev. Food Sci. Nutr. 2019, 59, 1913–1926. [Google Scholar] [CrossRef]
  6. Parsaeimehr, E.; Fartash, M.; Torkestani, J.A. An Enhanced Deep Neural Network-Based Architecture for Joint Extraction of Entity Mentions and Relations. Int. J. Fuzzy Log. Intell. Syst. 2020, 20, 69–76. [Google Scholar] [CrossRef]
  7. Sameen, M.I.; Pradhan, B.; Lee, S. Application of convolutional neural networks featuring Bayesian optimization for landslide susceptibility assessment. Catena 2020, 186, 104249. [Google Scholar] [CrossRef]
  8. Su, L.; Deng, L.; Zhu, W.; Zhao, S. Statistical detection of weak pulse signal under chaotic noise based on Elman neural network. Wirel. Commun. Mob. Comput. 2020, 2020, 9653586. [Google Scholar] [CrossRef] [Green Version]
  9. Mansor, M.A.; Kasihmuddin, M.S.M.; Sathasivam, S. Enhanced Hopfield network for pattern satisfiability optimization. Int. J. Intell. Syst. Appl. 2016, 8, 27. [Google Scholar] [CrossRef]
  10. Fu, Y.; Aldrich, C. Flotation froth image recognition with convolutional neural networks. Miner. Eng. 2019, 132, 183–190. [Google Scholar] [CrossRef]
  11. Hou, J.; He, Y.; Yang, H.; Connor, T.; Gao, J.; Wang, Y.; Zeng, Y.; Zhang, J.; Huang, J.; Zheng, B.; et al. Identification of animal individuals using deep learning: A case study of giant panda. Biol. Conserv. 2020, 242, 108414. [Google Scholar] [CrossRef]
  12. Benitez-Garcia, G.; Haris, M.; Tsuda, Y.; Ukita, N. Finger gesture spotting from long sequences based on multi-stream recurrent neural networks. Sensors 2020, 20, 528. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  13. Choi, H.S.; Park, Y.J.; Lee, J.H.; Kim, Y. 3-D synapse array architecture based on charge-trap flash memory for neuromorphic application. Electronics 2020, 9, 57. [Google Scholar] [CrossRef] [Green Version]
  14. Yang, C.; Kim, H.; Adhikari, S.P.; Chua, L.O. A circuit-based neural network with hybrid learning of backpropagation and random weight change algorithms. Sensors 2017, 17, 16. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  15. Hopfield, J.J.; Tank, D.W. “Neural” computation of decisions in optimization problems. Biol. Cybern. 1985, 52, 141–152. [Google Scholar] [PubMed]
  16. Duan, L.; Duan, F.; Chapeau-Blondeau, F.; Abbott, D. Stochastic resonance in Hopfield neural networks for transmitting binary signals. Phys. Lett. A 2020, 384, 126143. [Google Scholar] [CrossRef]
  17. Kong, D.; Hu, S.; Wang, J.; Liu, Z.; Chen, T.; Yu, Q.; Liu, Y. Study of recall time of associative memory in a memristive Hopfield neural network. IEEE Access 2019, 7, 58876–58882. [Google Scholar] [CrossRef]
  18. Kasihmuddin, M.S.M.; Mansor, M.A.; Sathasivam, S. Discrete Hopfield neural network in restricted maximum k-satisfiability logic programming. Sains Malays. 2018, 47, 1327–1335. [Google Scholar] [CrossRef]
  19. Kasihmuddin, M.S.M.; Mansor, M.A.; Basir, M.F.M.; Sathasivam, S. Discrete mutation Hopfield neural network in propositional satisfiability. Mathematics 2018, 7, 1133. [Google Scholar] [CrossRef] [Green Version]
20. Abdullah, W.A.T.W. The logic of neural networks. Phys. Lett. A 1993, 176, 202–206.
21. Soeken, M.; Meuli, G.; Schmitt, B.; Mozafari, F.; Riener, H.; De Micheli, G. Boolean satisfiability in quantum compilation. Philos. Trans. R. Soc. A 2020, 378, 20190161.
22. Gebregiorgis, A.; Tahoori, M.B. Test Pattern Generation for Approximate Circuits Based on Boolean Satisfiability. In Proceedings of the 2019 Design, Automation & Test in Europe Conference & Exhibition (DATE), Florence, Italy, 25–29 March 2019; pp. 1028–1033.
23. Hireche, C.; Drias, H.; Moulai, H. Grid based clustering for satisfiability solving. Appl. Soft Comput. 2020, 88, 106069.
24. Yamashita, H.; Aihara, K.; Suzuki, H. Timescales of Boolean satisfiability solver using continuous-time dynamical system. Commun. Nonlinear Sci. Numer. Simul. 2020, 84, 105183.
25. Weissenbacher, G.; Malik, S. Post-silicon fault localization with satisfiability solvers. In Post-Silicon Validation and Debug; Mishra, P., Farahmandi, F., Eds.; Springer: Cham, Switzerland, 2019; pp. 255–273.
26. Abdullah, W.A.T.W. Logic programming on a neural network. Int. J. Intell. Syst. 1992, 7, 513–519.
27. Sathasivam, S. First order logic in neuro-symbolic integration. Far East J. Math. Sci. 2012, 61, 213–229.
28. Čepek, O.; Kučera, P. Known and new classes of generalized Horn formulae with polynomial recognition and SAT testing. Discret. Appl. Math. 2005, 149, 14–52.
29. Kasihmuddin, M.S.M.B.; Mansor, M.A.B.; Sathasivam, S. Genetic algorithm for restricted maximum k-Satisfiability in the Hopfield network. Int. J. Interact. Multimed. Artif. Intell. 2016, 4, 52–60.
30. Kasihmuddin, M.S.M.; Mansor, M.A.; Sathasivam, S. Hybrid genetic algorithm in the Hopfield network for logic satisfiability problem. Pertanika J. Sci. Technol. 2017, 25, 139–152.
31. Kasihmuddin, M.S.M.; Sathasivam, S.; Mansor, M.A. Hybrid Genetic Algorithm in the Hopfield Network for Maximum 2-Satisfiability Problem. In Proceedings of the 24th National Symposium of Mathematical Sciences: Mathematical Sciences Exploration for the Universal Preservation, Kuala Terengganu, Malaysia, 27–29 September 2016; p. 050001.
32. Sathasivam, S.; Abdullah, W.A.T.W. Logic mining in neural network: Reverse analysis method. Computing 2011, 91, 119–133.
33. Sathasivam, S. Application of Neural Networks in Predictive Data Mining. In Proceedings of the 2nd International Conference on Business and Economic, Kedah, Malaysia, 14–16 March 2011; pp. 371–376.
34. Kasihmuddin, M.S.M.; Mansor, M.A.; Sathasivam, S. Students’ Performance via Satisfiability Reverse Analysis Method with Hopfield Neural Network. In Proceedings of the International Conference on Mathematical Sciences and Technology 2018 (MATHTECH2018): Innovative Technologies for Mathematics and Mathematics for Technological Innovation, Penang, Malaysia, 10–12 December 2018; p. 060035.
35. Kho, L.C.; Kasihmuddin, M.S.M.; Asyraf, M. Logic mining in league of legends. Pertanika J. Sci. Technol. 2020, 28, 211–225.
36. Kho, L.C.; Kasihmuddin, M.S.M.; Mansor, M.A.; Sathasivam, S. Logic mining in football matches. Indones. J. Electr. Eng. Comput. Sci. 2020, 17, 1074–1083.
37. Gendreau, M.; Potvin, J.Y. Handbook of Metaheuristics, 2nd ed.; Springer: New York, NY, USA, 2010.
38. Göçken, M.; Özçalıcı, M.; Boru, A.; Dosdoğru, A.T. Integrating metaheuristics and artificial neural networks for improved stock price prediction. Expert Syst. Appl. 2016, 44, 320–331.
39. Nievergelt, J. Exhaustive Search, Combinatorial Optimization and Enumeration: Exploring the Potential of Raw Computing Power. In Proceedings of the 27th Conference on Current Trends in Theory and Practice of Informatics, Milovy, Czech Republic, 25 November–2 December 2000; pp. 18–35.
40. Reunanen, J. Overfitting in making comparisons between variable selection methods. J. Mach. Learn. Res. 2003, 3, 1371–1382.
41. Hooker, J.N. Unifying Local and Exhaustive Search. In Proceedings of the ENC 2005, Sixth Mexican International Conference on Computer Science, Puebla, Mexico, 26–30 September 2005; pp. 237–243.
42. Luo, W.; Lin, X. Recent Advances in Clonal Selection Algorithms and Applications. In Proceedings of the 2017 IEEE Symposium Series on Computational Intelligence (SSCI), Honolulu, HI, USA, 27 November–1 December 2017; pp. 1–8.
43. Mohammed, M.A. Compare between genetic algorithm and clonal selection algorithm to pattern recognition Latin’s numbers. J. Educ. Sci. 2019, 28, 300–315.
44. Dasgupta, K.; Roy, P.K.; Mukherjee, V. Power flow based hydro-thermal-wind scheduling of hybrid power system using sine cosine algorithm. Electr. Power Syst. Res. 2020, 178, 106018.
45. Silva, G.C.; Carvalho, E.E.; Caminhas, W.M. An artificial immune systems approach to case-based reasoning applied to fault detection and diagnosis. Expert Syst. Appl. 2020, 140, 112906.
46. Zhang, W.; Zhang, W.; Yen, G.G.; Jing, H. A cluster-based clonal selection algorithm for optimization in dynamic environment. Swarm Evol. Comput. 2019, 50, 100454.
47. Layeb, A. A clonal selection algorithm based tabu search for satisfiability problems. J. Adv. Inf. Technol. 2012, 3, 138–146.
48. Cook, S.A. The Complexity of Theorem-Proving Procedures. In Proceedings of the STOC ’71: Third Annual ACM Symposium on Theory of Computing, Shaker Heights, OH, USA, 3–5 May 1971; pp. 151–158.
49. Mansor, M.A.; Kasihmuddin, M.S.M.; Sathasivam, S. Modified artificial immune system algorithm with Elliot Hopfield neural network for 3-satisfiability programming. J. Inform. Math. Sci. 2019, 11, 81–98.
50. Fitzsimmons, M.; Kunze, H. Combining Hopfield neural networks, with applications to grid-based mathematics puzzles. Neural Netw. 2019, 118, 81–89.
51. Katayama, K.; Horiguchi, T. Generalization ability of Hopfield neural network with spin-S Ising neurons. J. Phys. Soc. Jpn. 2000, 69, 2816–2824.
52. Sathasivam, S. Upgrading logic programming in Hopfield network. Sains Malays. 2010, 39, 115–118.
53. Velavan, M.; Yahya, Z.R.B.; Halif, M.N.B.A.; Sathasivam, S. Mean field theory in doing logic programming using Hopfield network. Mod. Appl. Sci. 2016, 10, 154.
54. Kasihmuddin, M.S.B.M.; Sathasivam, S. Accelerating Activation Function in Higher Order Logic Programming. In Proceedings of the 23rd National Symposium of Mathematical Sciences (SKSM23), Johor Bahru, Malaysia, 24–26 November 2015; p. 030006.
55. Moreno-Armendáriz, M.A.; Hagan, M.; Alba, E.; Rubio, J.D.J.; Cruz-Villar, C.A.; Leguizamón, G. Advances in neural networks and hybrid-metaheuristics: Theory, algorithms, and novel engineering applications. Comput. Intell. Neurosci. 2016, 2016, 3263612.
56. Rojas-Delgado, J.; Trujillo-Rasúa, R.; Bello, R. A continuation approach for training artificial neural networks with meta-heuristics. Pattern Recognit. Lett. 2019, 125, 373–380.
57. Castro, L.N.D.; Zuben, F.J.V. The Clonal Selection Algorithm with Engineering Applications. In Proceedings of the GECCO, Las Vegas, NV, USA, 8–12 July 2000; pp. 36–37.
58. Mansor, M.A.B.; Kasihmuddin, M.S.B.M.; Sathasivam, S. Robust artificial immune system in the Hopfield network for maximum k-satisfiability. Int. J. Interact. Multimed. Artif. Intell. 2017, 4, 63–71.
59. Goldberg, D.E.; Deb, K. A comparative analysis of selection schemes used in genetic algorithms. Found. Genet. Algorithms 1991, 1, 69–93.
60. Sathasivam, S.; Mamat, M.; Mansor, M.A.; Kasihmuddin, M.S.M. Hybrid discrete Hopfield neural network based modified clonal selection algorithm for VLSI circuit verification. Pertanika J. Sci. Technol. 2020, 28, 227–243.
61. Ong, P.; Zainuddin, Z. Optimizing wavelet neural networks using modified cuckoo search for multi-step ahead chaotic time series prediction. Appl. Soft Comput. 2019, 80, 374–386.
62. Kathirvel, V.; Mansor, M.A.; Kasihmuddin, M.S.M.; Sathasivam, S. Hybrid imperialistic competitive algorithm incorporated with Hopfield neural network for robust 3 satisfiability logic programming. IAES Int. J. Artif. Intell. 2019, 8, 144.
63. Wang, X.; Wang, J.; Fečkan, M. BP neural network calculus in economic growth modelling of the group of seven. Mathematics 2020, 8, 37.
64. Mansor, M.A.; Kasihmuddin, M.S.M.; Sathasivam, S. Artificial immune system paradigm in the Hopfield network for 3-satisfiability problem. Pertanika J. Sci. Technol. 2017, 25, 1173–1188.
65. Sathasivam, S. Learning rules comparison in neuro-symbolic integration. Int. J. Appl. Phys. Math. 2011, 1, 129.
66. Sathasivam, S. Clauses Representation Comparison in Neuro-Symbolic Integration. In Proceedings of the World Congress on Engineering 2010 Vol 1 (WCE 2010), London, UK, 30 June–2 July 2010; pp. 34–37.
67. Lichman, M. UCI Machine Learning Repository: University of California, School of Information and Computer Science. Available online: http://archive.ics.uci.edu/ml (accessed on 9 April 2020).
68. Rogic, S.; Kascelan, L. Customer Value Prediction in Direct Marketing Using Hybrid Support Vector Machine Rule Extraction Method. In Proceedings of the European Conference on Advances in Databases and Information Systems (ADBIS 2019), Bled, Slovenia, 8–11 September 2019; Springer: Cham, Switzerland, 2019; pp. 283–294.
69. Singh, B.E.R.; Sivasankar, E. Enhancing Prediction Accuracy of Default of Credit Using Ensemble Techniques. In Proceedings of the First International Conference on Artificial Intelligence and Cognitive Computing (AICC 2018), London, UK, 7 December 2018; Springer: Singapore, 2019; pp. 427–436.
70. Oladele, T.O.; Ogundokun, R.O.; Kayode, A.A.; Adegun, A.A.; Adebiyi, M.O. Application of Data Mining Algorithms for Feature Selection and Prediction of Diabetic Retinopathy. In Proceedings of the International Conference on Computational Science and Its Application (ICCSA) 2019, Saint Petersburg, Russia, 1–4 July 2019; Springer: Cham, Switzerland, 2019; pp. 716–730.
71. Amazon.com Employee Access Challenge. Available online: https://www.kaggle.com/c/amazon-employee-access-challenge (accessed on 9 April 2020).
72. Wolpert, D.H.; Macready, W.G. No free lunch theorems for optimization. IEEE Trans. Evol. Comput. 1997, 1, 67–82.
73. Wang, Y.; Li, T. Local feature selection based on artificial immune system for classification. Appl. Soft Comput. 2020, 87, 105989.
Figure 1. Implementation of the proposed model.
Figure 2. Summary of Clonal Selection Algorithm (CSA).
Figure 3. Implementation of 3-Satisfiability Reverse Analysis (3-SATRA) in the Discrete Hopfield Neural Network (DHNN).
Figure 4. The implementation of DHNN3-SAT models in a simulated data set.
Figure 5. Mean absolute error (MAE) value of DHNN3-SAT models.
Figure 6. Sum of square error (SSE) value of DHNN3-SAT models.
Figure 7. ω value of DHNN3-SAT models.
Figure 8. α (%) value of DHNN3-SAT models in the BDMC data set.
Figure 9. α (%) value of DHNN3-SAT models in the CCDP data set.
Figure 10. α (%) value of DHNN3-SAT models in the DRDD data set.
Figure 11. α (%) value of DHNN3-SAT models in the FLST data set.
Figure 12. MAE value of DHNN3-SAT models.
Figure 13. SSE value of DHNN3-SAT models.
Figure 14. α (%) value of DHNN3-SAT models.
Figure 15. CT value of DHNN3-SAT models.
Table 1. Example of cases for the 3-Satisfiability (3-SAT) logical rule, φ3SAT.

Case    φ3SAT Instances                                              Outcome
1       (P, Q, R, S, T, U, V, W, X) = (1, 1, 1, 1, 1, 1, 1, 1, 1)    Satisfiable (φ3SAT = 1)
2       (P, Q, R, S, T, U, V, W, X) = (1, 1, 1, 1, 1, 1, 1, 1, 1)    Unsatisfiable (φ3SAT = −1)
3       (P, Q, R, S, T, U, V, W, X) = (1, 1, 1, 1, 1, 1, 1, 1, 1)    Full consistency (φ3SAT = 1)
4       (P, Q, R, S, T, U, V, W, X) = (1, 1, 1, 1, 1, 1, 1, 1, 1)    Full inconsistency (φ3SAT = −1)
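
For illustration, the following minimal Python sketch checks a bipolar (±1) assignment against a 3-SAT logical rule and returns the bipolar outcome used in Table 1. The clause set below is purely illustrative; it is not the actual φ3SAT used in the experiments.

def literal(value, negated):
    # A bipolar literal evaluates to 1 when satisfied: the variable itself,
    # or its sign flip when the literal is negated.
    return -value if negated else value

def evaluate_3sat(assignment, clauses):
    # Return 1 if every clause contains at least one satisfied literal,
    # and -1 (the unsatisfiable outcome) otherwise.
    for clause in clauses:
        if not any(literal(assignment[v], neg) == 1 for v, neg in clause):
            return -1
    return 1

# Illustrative rule: (P ∨ Q ∨ ¬R) ∧ (¬S ∨ T ∨ U) ∧ (V ∨ ¬W ∨ X)
clauses = [[("P", False), ("Q", False), ("R", True)],
           [("S", True), ("T", False), ("U", False)],
           [("V", False), ("W", True), ("X", False)]]

assignment = dict.fromkeys("PQRSTUVWX", 1)  # cf. Case 1 of Table 1
print(evaluate_3sat(assignment, clauses))   # prints 1 (satisfiable)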
Table 2. List of parameters in DHNN3-SATES [52].

Parameter            Parameter Value/Remarks
a                    100
b                    100
λ                    0.001
CT                   24 h
NN                   9 ≤ NN ≤ 72
Selection rate       0.1
Number of strings    100
Type of selection    Random
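
As a hedged sketch of how the Table 2 budget could drive the learning phase of DHNN3-SATES, the routine below draws random bipolar strings (cf. "Type of selection: Random" and "Number of strings: 100") until one satisfies the rule. It reuses evaluate_3sat and clauses from the sketch above; the function name and loop structure are illustrative, not the paper's exact procedure.

import random

def random_search_learning(clauses, variables="PQRSTUVWX",
                           num_strings=100, seed=0):
    # Draw up to num_strings random bipolar assignments and return the first
    # one that satisfies the 3-SAT rule; None signals a failed learning run.
    rng = random.Random(seed)
    for _ in range(num_strings):
        candidate = {v: rng.choice((-1, 1)) for v in variables}
        if evaluate_3sat(candidate, clauses) == 1:
            return candidate
    return None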
Table 3. List of parameters in DHNN3-SATCSA.

Parameter            Parameter Value/Remarks
n(β)                 100
γ                    200
θ                    0.01
λ                    0.001
CT                   24 h
NN                   9 ≤ NN ≤ 72
Type of selection    Roulette Wheel Selection [59]
Learning method      WA Method [20]
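
Since Table 3 names roulette wheel selection [59] as the selection operator, a minimal sketch of that operator follows. In the CSA setting, the fitness would be the number of satisfied 3-SAT clauses; the identifiers here are illustrative.

import random

def roulette_wheel_select(population, fitnesses, rng=None):
    # Pick one candidate with probability proportional to its fitness,
    # assuming all fitness values are non-negative.
    rng = rng or random.Random(0)
    total = sum(fitnesses)
    if total == 0:
        return rng.choice(population)  # degenerate case: uniform choice
    pick = rng.uniform(0, total)
    running = 0.0
    for candidate, fitness in zip(population, fitnesses):
        running += fitness
        if running >= pick:
            return candidate
    return population[-1]  # guard against floating-point round-off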
Table 4. List of benchmark data sets information.

Bank Direct Marketing Campaign (BDMC)/Marketing (45,211 instances; UCI Machine Learning Repository [67])
P: Age
Q: Job
R: Credit card status
S: Housing loan
T: Personal loan
U: Last contact day of the month
V: Last contact duration
W: Number of days since the client was last contacted in a previous campaign
X: Number of contacts performed before this campaign

Credit Card Default Payment (CCDP)/Finance (30,000 instances; UCI Machine Learning Repository [67])
P: Amount of limit balance
Q: Education
R: Marital status
S: History of repayment status in Month I
T: History of repayment status in Month II
U: Amount of bill statement in Month I
V: Amount of bill statement in Month II
W: Amount of previous payment in Month I
X: Amount of previous payment in Month II

Diabetic Retinopathy Debrecen Disease (DRDD)/Health (1151 instances; UCI Machine Learning Repository [67])
P: Result of quality assessment
Q: Result of pre-screening
R: Feature detection I
S: Feature detection II
T: Feature detection for exudates I
U: Feature detection for exudates II
V: Patient condition according to the Euclidean distance between the center of the macula and the center of the optic disc
W: Diameter of the optic disc
X: Result of the AM/FM-based classification

Facebook Live Sellers in Thailand (FLST)/Marketing (7050 instances; UCI Machine Learning Repository [67])
P: Status type
Q: Number of comments
R: Number of shared posts
S: Number of likes
T: Number of "Love" reactions
U: Number of "Wow" reactions
V: Number of "Haha" reactions
W: Number of "Sad" reactions
X: Number of "Angry" reactions
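
Each raw attribute P to X above must be discretized into a bipolar state before entering the logic mining pipeline. The paper's exact discretization rule is not reproduced in this table, so the thresholding below is only an assumed sketch; the cut-off of 40 is invented for illustration.

def to_bipolar(value, threshold):
    # Map a numeric attribute to a bipolar neuron state.
    return 1 if value > threshold else -1

# e.g., BDMC attribute P (Age) with an assumed cut-off of 40 years
print([to_bipolar(age, 40) for age in (23, 57, 40, 61)])  # [-1, 1, -1, 1]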
Table 5. α of DHNN3-SATCSA in comparison with other existing methods.

Data Set    DHNN3-SATCSA    ES      α/Method
BDMC        74%             74%     61.27%/DT [68]
CCDP        77%             36%     66.32%/NB [69]
DRDD        99%             99%     73.7%/SVM [70]
FLST        88%             88%     -
Table 6. α of DHNN3-SATCSA model in comparison with other existing methods in the AERA data set.

Method          α
DHNN3-SATCSA    94%
ES [65]         91%
LR [4]          87.21%
RF [4]          85.58%
GB [4]          85.14%
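
The α values reported in Tables 5 and 6 read as a test-phase accuracy: the percentage of data entries whose recorded outcome matches the outcome produced by the induced logical rule. A minimal sketch of this computation, assuming bipolar (±1) predictions and labels:

def accuracy_alpha(predicted, actual):
    # Percentage of test entries where the induced outcome matches the label.
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)

print(accuracy_alpha([1, -1, 1, 1], [1, -1, -1, 1]))  # 75.0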
Table 7. List of information on the Amazon Employees Resources Access (AERA) 2010–2011 data set (32,769 instances; Kaggle Machine Learning and Data Science Community [71]).

Attribute                                      Example
P: An ID for each resource                     Types of resources (computers, laptops, software)
Q: Manager employee ID                         Supervised or not supervised employee
R: Company role rollup category ID 1           US Data Analyst
S: Company role rollup category ID 2           US Manufacturing
T: Company role department                     Manufacturing
U: Company role business title                 Junior Data Analyst, Senior Manufacturing Staff
V: Company role family extended description    Security Data Analyst, product fault detection manufacturing staff
W: Company role family description             Security Data Analyst
X: Company role code (unique to each role)     Data Analyst
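
For readers who wish to reproduce the attribute mapping, the sketch below loads the Kaggle release of AERA [71] and renames its columns to the P to X labels of Table 7. The column names (RESOURCE, MGR_ID, ROLE_ROLLUP_1, and so on) are assumptions based on the challenge page, and train.csv must first be downloaded manually from Kaggle.

import pandas as pd

# Assumed Kaggle column names mapped onto the attribute labels of Table 7.
COLUMN_TO_ATTRIBUTE = {
    "RESOURCE": "P", "MGR_ID": "Q", "ROLE_ROLLUP_1": "R",
    "ROLE_ROLLUP_2": "S", "ROLE_DEPTNAME": "T", "ROLE_TITLE": "U",
    "ROLE_FAM_DESC": "V", "ROLE_FAMILY": "W", "ROLE_CODE": "X",
}

aera = pd.read_csv("train.csv").rename(columns=COLUMN_TO_ATTRIBUTE)
print(aera[list("PQRSTUVWX")].head())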
