Robust modelling of binary decisions in Laplacian Eigenmaps-based Echo State Networks
Introduction
In artificial neural network theory, different strategies can be followed to solve data classification, which lies at the basis of any decision-making problem. Even though many efficient algorithms have been introduced in the literature, bio-inspired architectures cannot ignore that decision making in living beings is not the result of static processing: it involves a complex spatial–temporal dynamical process from which decisions emerge. Reservoir Computing (RC) aims to be a simple but powerful set of concepts and techniques for defining neural networks that can capture temporal dependencies by both allowing recurrent connections in the hidden layer and preserving internal state between inputs. Echo State Networks (ESNs) constitute a preeminent example of an architecture belonging to the RC paradigm (Lukoševičius et al., 2012) and play an important role in plenty of applications, ranging from handwriting recognition (Bunke and Varga, 2007), to online learning control (Jordanou et al., 2019), to Autonomous Surface Vehicle (ASV) cooperation control (Liu et al., 2017). As regards nonlinear dynamical system modelling and time series prediction, ESNs have been successfully applied, for instance, to energy consumption estimation (Hu et al., 2020b) and, in deep versions, to wind power generation forecasting (Hu et al., 2020a). Applications in which ESNs extract and predict the nonlinear characteristics of time series while acting concurrently with ARMA models, which estimate the linear part, were dealt with in Tian et al. (2020). ESNs were also successfully applied to chaotic time series prediction (Tian, 0000), using novel optimisation algorithms for the network parameters. The literature provides a thorough list of ESN advantages, among which we can mention the simplicity of the training method, real-time processing and global optimality. More specifically in relation to our contribution, Tanaka et al. (2019) clearly state, among the advantages of classical ESNs, their high-dimensional representation: inputs can be mapped into a high-dimensional space. This property facilitates the separation of originally inseparable inputs in classification tasks and allows reading out spatiotemporal dependencies of inputs in prediction tasks.
The main contribution of our paper exploits and enhances this aspect: it consists in designing an ESN-based structure able to carry out a large number of decision-making tasks in parallel, significantly reducing the network memory requirements by modifying the weight data representation and, at the same time, without losing performance, thanks to the intrinsic robustness of the proposed method to quantisation noise. This challenging objective has been achieved by adopting the reservoir-based processing paradigm, formalised in terms of an Echo State Network and enriched with a series of structural advantages. The first is the inclusion of multiple readout maps on a unique reservoir: in this way, the paper demonstrates that a unique ESN reservoir layer is sufficient to allow the representation of all the binary functions with a fixed number of bits. This paradigm is inspired by the concept of neural reuse in neuropsychology. In addition, a structural reduction method is applied to the readout maps to obtain a reduced output layer which efficiently compresses the reservoir information while maintaining the same network classification accuracy. Moreover, a weight quantisation strategy applied to the readout maps maintains high performance while saving computing resources in view of a hardware implementation. In fact, it is reasonable to imagine digital implementations of ESN-based systems that are required to classify input data.
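As a concrete illustration of the multiple-readout idea, the following Python sketch trains several independent linear readouts, via a single ridge-regression solve, on one shared matrix of reservoir states. All sizes, the regulariser and the random targets are hypothetical placeholders, not the values used in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: N reservoir neurons, T time steps, M parallel readouts.
N, T, M = 100, 500, 16
lam = 1e-6  # ridge regularisation (placeholder value)

# Collected reservoir states (one column per time step) and M binary targets.
X = rng.standard_normal((N, T))                   # stand-in for recorded states
Y = rng.integers(0, 2, size=(M, T)) * 2.0 - 1.0   # +/-1 decision targets

# One ridge-regression solve shared by all readouts: the reservoir dynamics
# are computed once, and each row of W_out realises a different decision map.
W_out = Y @ X.T @ np.linalg.inv(X @ X.T + lam * np.eye(N))

decisions = np.sign(W_out @ X)                    # M parallel binary decisions
```

The key point is that adding a decision task only adds one row to `W_out`; the reservoir, which dominates the computational cost, is reused unchanged.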
However, the internal data representation arising from the structure of the network itself is commonly quite redundant and ill-posed (even numerically), due to the absence of general quantitative criteria for the selection of the network size. Moreover, it is often argued that, for many reasons, digital neural networks ought to be preferred for data classification (Kakkar, 0000). By introducing a proper dimensionality reduction algorithm, the resulting network presents lower complexity and can be implemented on digital devices through a quantisation process, leading to discrete integer weights. Our paper shows that if a nonlinear manifold reduction technique is applied, namely Laplacian Eigenmaps (LEs) of the graph associated with the network state matrix, then quantisation requires fewer bits for representing all the trained weights than those needed for double precision storage, allowing much lower memory consumption. This is an interesting step for the overall system optimisation, which is mandatory in the case of large neural networks and whenever there are constraints in the design phase (for example in FPGA-based applications (Bonabi et al., 2014)). The use of manifold reduction has to be seen both as a mathematical tool for data processing and manipulation, conceived for lowering the number of degrees of freedom, and as a simpler representation of the original state space, devoted to mitigating the so-called curse of dimensionality (Bellman, 2003). This paper exploits this method with particular attention to the problem of manipulating and/or processing -Boolean functions and binary classification, widely discussed in the literature (Cagnoni et al., 2005). The network realised in this paper has been trained over all the possible -Boolean functions, supposing that each uniquely identified function corresponds to a specific action/decision/class.
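The two ingredients above can be sketched in a few lines of Python. The function below is a minimal dense implementation of Laplacian Eigenmaps (k-nearest-neighbour graph with heat-kernel weights, then the eigenvectors of the normalised graph Laplacian), paired with a plain uniform quantiser for the trained readout weights; all parameter names and default values are our own illustrative choices, not the paper's:

```python
import numpy as np

def laplacian_eigenmaps(X, n_components=2, k=5, sigma=1.0):
    """Embed the rows of X (samples x features) via Laplacian Eigenmaps."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # pairwise sq. distances
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:k + 1]                 # k nearest, skip self
        W[i, nbrs] = np.exp(-d2[i, nbrs] / (2 * sigma ** 2))
    W = np.maximum(W, W.T)                                # symmetrise the graph
    D = np.diag(W.sum(1))
    L = D - W                                             # graph Laplacian
    # Solve the generalised problem L v = lam D v via symmetric normalisation.
    Dm = np.diag(1.0 / np.sqrt(W.sum(1)))
    vals, vecs = np.linalg.eigh(Dm @ L @ Dm)              # ascending eigenvalues
    # Drop the trivial constant eigenvector (smallest eigenvalue).
    return Dm @ vecs[:, 1:n_components + 1]

def quantise(w, bits=8):
    """Uniform quantisation of weights to signed integers on `bits` bits."""
    scale = np.abs(w).max() / (2 ** (bits - 1) - 1)
    q = np.round(w / scale).astype(int)
    return q, scale       # dequantised weights: q * scale
```

Storing `q` (e.g. 8-bit integers) plus one scale factor in place of 64-bit doubles is what yields the memory saving discussed above; the quantisation error per weight is bounded by half the scale step.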
In Han and Xu (2018) the hyperparameters of LEs-based ESNs were empirically selected; in our paper a constructive method for data-driven hyperparameter determination, targeted at the proper design of an LEs-based ESN, is introduced, and the application is focused on binary decision mappings. In Arena et al. (2019) the strategy was applied to a liquid-state-like classifier to solve two traditional classification tasks. This paper generalises the strategy to the mapping of any arbitrary -Boolean function, considered as the archetype of any decision-making task.
From a biological perspective, these approaches refer to the so-called neural reuse theory (Anderson, 2010) to indicate those cognitive processes involved when living beings learn new abilities (Anderson, 2016). RC and neural reuse paradigms have already been used jointly to represent neural dynamics in simple brains and to lead to the concurrent emergence of multiple behaviours, from motor skill learning to sequence/sub-sequence learning (Patané et al., 2018). Neural reuse theory suggests that complex behaviours arising from brain processing emerge from the activation of smaller computational units that act modularly. Thus, starting from a very complex nonlinear processing of the signals coming from the environment, even single “read-out” units can be responsible for choosing a complex behaviour or decision. From a computational perspective, the brain seems to adopt a top-down strategy for reducing the complexity of a task. Since reduction transforms a large network into a smaller one while, to some extent, preserving its outputs, it is possible to state that reduction applies the same strategy as neural reuse: essentially, reduction implies a reorganisation of the network. This further outlines the generality of the proposed approach, which can be successfully extended to other neural network topologies.
The paper is organised as follows: Section 2 reports the basic information about ESNs and the LEs algorithm for handling the linear separability problem over a -dimensional binary hypercube. Section 3 reports all the simulation results, the details of the strategy for tuning the LEs algorithm hyperparameters, and the noise rejection capability of the overall system. In Section 4 an application of the method is presented, referring to a typical example of the wall-following task in mobile robotics involving a decision-making strategy. Some remarks about the proposed method, the design choices and the obtained results are reported in Section 5. Finally, Section 6 concludes the paper.
Section snippets
Description of the Echo State Network model
ESNs are typically composed of three layers, the middle one constituting the main computational core: it works as a reservoir of neurons connected to each other through random, sparse and even recurrent connections, although other configurations, like the one described in Malik et al. (2017), are allowed. Regardless of the employed structure, one of the fundamental characteristics of ESNs is the echo state property, which is a basic requirement for output-only training (i.e., an off-line approach
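A common design heuristic connected to the echo state property (necessary in practice, though not sufficient in general) is to rescale the random reservoir matrix so that its spectral radius lies below one. A minimal Python sketch, with hypothetical sizes, connection density and target radius of our own choosing:

```python
import numpy as np

rng = np.random.default_rng(1)
N, n_in = 100, 3
rho_target = 0.9   # spectral radius < 1: standard echo-state heuristic

# Sparse random reservoir (about 10% density), rescaled to rho_target.
W = rng.standard_normal((N, N)) * (rng.random((N, N)) < 0.1)
W *= rho_target / np.max(np.abs(np.linalg.eigvals(W)))

# Random input-to-reservoir weights.
W_in = rng.uniform(-0.5, 0.5, size=(N, n_in))

def step(x, u):
    """One reservoir update: x(t+1) = tanh(W x(t) + W_in u(t))."""
    return np.tanh(W @ x + W_in @ u)
```

With this scaling, two trajectories started from different initial states and driven by the same input sequence typically converge to each other, which is precisely the fading-memory behaviour the echo state property formalises.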
Simulations and results
This section reports both the description and the details of the simulations carried out using MATLAB® to test our framework. Starting from the practical suggestions reported in Lukoševičius (2012) for implementing and simulating an ESN, our simulation parameters are reported in Table 2 and justified in the next paragraphs. First, a brief discussion about how to design the dataset and the problem of encoding the set of expected target signals is required; then the
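On the dataset side, one convenient encoding (a sketch under our own naming conventions, not necessarily the one adopted in the paper) identifies each of the 2^(2^n) Boolean functions on the n-dimensional hypercube with the integer whose binary expansion is its truth table:

```python
import itertools
import numpy as np

n = 3   # hypercube dimension (illustrative choice)

# The 2**n vertices of the binary hypercube, one input pattern per row.
vertices = np.array(list(itertools.product([0, 1], repeat=n)))

def boolean_targets(func_id, n):
    """Truth table of the Boolean function uniquely identified by `func_id`
    (an integer in [0, 2**(2**n))): bit k is the output on vertex k."""
    return np.array([(func_id >> k) & 1 for k in range(2 ** n)])

# Example: the integer identifying the 3-input parity (XOR) function.
parity = np.array([v.sum() % 2 for v in vertices])
func_id = int(sum(int(b) << k for k, b in enumerate(parity)))
```

Each `func_id` then labels one target signal for one readout, so sweeping `func_id` over its full range enumerates every possible binary decision task on the chosen hypercube.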
An application in mobile robotics: the wall-following navigation task
Dimensionality reduction and neural networks also work well with data drawn from real experiments. In this section, further tests have been performed on a common dataset available in the literature, in order to show how reliable the combined action of LEs and ESNs is on real-world experiments.
An interesting issue in mobile robotics concerns the development of strategies allowing a robot to safely move within environments in the presence of obstacles. Generally speaking, these strategies ought
Discussion
Generally, ESNs are characterised by a significant level of randomness, due to both internal and input-to-reservoir connections. As discussed in the last paragraphs, this can be addressed by statistically evaluating the overall behaviour of the system. In a previous paragraph we have already motivated our choice of selecting because such a Boolean function is surely nontrivial and does not require much computational effort for the sake of demonstration. What really matters is how to choose
Conclusion
In this paper, a framework for the supervised classification of generic Boolean functions over -dimensional hypercubes, employing Echo State Networks with Laplacian Eigenmaps for dimensionality reduction, has been developed and discussed. In particular, our application aims to be a simple tool for solving binary classification by adopting ESNs as the main computational core. The main goal, successfully reached in this work, was to demonstrate that a unique ESN reservoir layer is sufficient to
CRediT authorship contribution statement
Paolo Arena: Conceptualization, Methodology, Supervision. Luca Patanè: Conceptualization, Methodology, Software, Writing - review & editing. Angelo Giuseppe Spinosa: Conceptualization, Methodology, Software, Data curation, Writing - original draft, Writing - review & editing.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References (59)
- et al., Data-based analysis of Laplacian eigenmaps for manifold reduction in supervised liquid state classifiers, Inform. Sci. (2019)
- et al., Forecasting energy consumption and wind power generation using deep echo state network, Renew. Energy (2020)
- et al., Effective energy consumption forecasting using enhanced bagged echo state network, Energy (2020)
- et al., Online learning control with echo state networks of an oil production platform, Eng. Appl. Artif. Intell. (2019)
- et al., Recent advances in physical reservoir computing: A review, Neural Netw. (2019)
- et al., A theoretical investigation of several model selection criteria for dimensionality reduction, Pattern Recognit. Lett. (2012)
- et al., Re-visiting the echo state property, Neural Netw. (2012)
- et al., GMFLLM: A general manifold framework unifying three classic models for dimensionality reduction, Eng. Appl. Artif. Intell. (2017)
- Selected Papers of Hirotugu Akaike (1998)
- The Classification of Boolean Functions using the Rademacher-Walsh Transform (2007)
- Neural reuse: A fundamental organizational principle of the brain, Behav. Brain Sci.
- Neural reuse in the organization and development of the brain, Dev. Med. Child Neurol.
- IEEE Standard for Floating-Point Arithmetic (IEEE Std 754-2008)
- Laplacian eigenmaps for dimensionality reduction and data representation, Neural Comput.
- Dynamic Programming
- FPGA implementation of a biological neural network based on the Hodgkin-Huxley neuron model, Front. Neurosci.
- Classification and Regression Trees
- Digital Document Processing: Major Directions and Recent Advances
- Evolving binary classifiers through parallel computation of multiple fitness cases, IEEE Trans. Syst. Man Cybern. B
- Training linear discriminant analysis in linear time
- Training a support vector machine in the primal, Neural Comput.
- Partitioning nominal attributes in decision trees, Data Min. Knowl. Discov.
- An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods
- Multi-Modes Cascade SVMs: Fast Support Vector Machines in Distributed System
- Hessian eigenmaps: Locally linear embedding techniques for high-dimensional data, Proc. Natl. Acad. Sci.
- Working set selection using second order information for training SVM, J. Mach. Learn. Res.
- The use of multiple measurements in taxonomic problems, Ann. Eugen.
- Short-term memory mechanisms in neural network learning of robot navigation tasks: A case study