Robust modelling of binary decisions in Laplacian Eigenmaps-based Echo State Networks

https://doi.org/10.1016/j.engappai.2020.103828

Abstract

This paper presents a framework for supervised binary classification of n-Boolean functions through Echo State Networks endowed with Laplacian Eigenmaps for dimensionality reduction. The proposed method is applied both to improve the classification performance when the learnt weights are quantised in view of a digital implementation and as a computational demonstration of the neural reuse theory when parallel outputs are allowed. Our analysis focuses on the effect of various forms of noise (i.e., normal noise, uniform noise and quantisation noise) when all the possible Boolean functions of n input bits are learnt. External disturbances are applied both to the learnt weights and to the input features, so that we can analyse how resilient the whole architecture is when various forms of parametric noise are injected into the system. The results presented here show that the dimensionality reduction enabled by the Laplacian Eigenmaps-based approach improves robustness to these different sources of noise, reducing memory storage requirements while maintaining high classification performance. Our results are compared with those obtained by other, more common classification techniques in terms of learning performance and computational complexity, also considering a realistic dataset describing a decision-making task in a wall-following navigation session with mobile robots.

Introduction

In artificial neural network theory, different strategies can be followed for solving data classification, which is at the basis of any decision-making problem. Even though many efficient algorithms have been introduced in the literature, bio-inspired architectures cannot ignore the fact that decision making in living beings is not the result of static processing, but involves a complex spatial–temporal dynamical process from which decisions emerge. Reservoir Computing (RC) aims to be a simple but powerful set of concepts and techniques for defining neural networks that can capture temporal dependencies by both allowing recurrent connections in the hidden layer and preserving internal state between inputs. Echo State Networks (ESNs) constitute a preeminent example of architecture belonging to the RC paradigm (Lukoševičius et al., 2012) and play an important role in plenty of applications, ranging from handwriting recognition (Bunke and Varga, 2007), to online learning control (Jordanou et al., 2019), to Autonomous Surface Vehicles (ASV) cooperation control (Liu et al., 2017). As regards nonlinear dynamical systems modelling and time series prediction, ESNs were successfully applied, for instance, to energy consumption estimation (Hu et al., 2020b) and to wind power generation forecasting adopting deep versions of ESNs (Hu et al., 2020a). Applications in which ESNs extract and predict the nonlinear characteristics of time series while ARMA models concurrently estimate the linear part were dealt with in Tian et al. (2020). ESNs were also successfully applied to chaotic time series prediction (Tian, 0000), using novel optimisation algorithms for the network parameters. The literature provides a thorough account of the advantages of ESNs, among which we can mention the simplicity of the training method, real-time processing and global optimality. More specifically in relation to our contribution, Tanaka et al. (2019) clearly list high-dimensional representation among the advantages of classical ESNs: it is possible to map inputs into a high-dimensional space. This property facilitates the separation of originally inseparable inputs in classification tasks and allows reading out spatiotemporal dependencies of inputs in prediction tasks. The main contribution of our paper exploits and enhances this aspect: it consists in designing an ESN-based structure able to carry out a large number of decision-making tasks in parallel, significantly reducing the network memory requirements by modifying the weight data representation while, at the same time, not losing performance, thanks to the intrinsic robustness of the proposed method to quantisation noise. This challenging objective has been achieved by adopting the reservoir-based processing paradigm, formalised in terms of an Echo State Network, which has been enriched with a series of structural advantages. The first is the inclusion of multiple readout maps on a single reservoir: in this way, the paper demonstrates that a single ESN reservoir layer is sufficient to represent all the binary functions of a fixed number of bits. This paradigm is inspired by the concept of neural reuse in neuropsychology. In addition, a structural reduction method is applied to the readout maps to obtain a reduced output layer which efficiently compresses the reservoir information while maintaining the same network classification accuracy.
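As a minimal illustrative sketch of the multiple-readout idea (written in Python for compactness, rather than the MATLAB® environment used for the actual simulations, and with arbitrary placeholder sizes and hyperparameters rather than the values adopted in the paper), a single random reservoir can serve one ridge-regression readout per Boolean function:

```python
import numpy as np

# Sketch only: one shared reservoir, one linear readout per Boolean function.
# N, sparsity, spectral radius and the ridge factor are illustrative choices.
rng = np.random.default_rng(0)
N, n = 100, 3                          # reservoir size, number of input bits
n_funcs = 2 ** (2 ** n)                # all Boolean functions of n bits (256)

W_in = rng.uniform(-1.0, 1.0, (N, n))
W = rng.uniform(-0.5, 0.5, (N, N)) * (rng.random((N, N)) < 0.1)  # sparse
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # spectral radius < 1

def reservoir_state(u, steps=20):
    """Run the reservoir to a (near) steady state for a constant input."""
    x = np.zeros(N)
    for _ in range(steps):
        x = np.tanh(W_in @ u + W @ x)
    return x

# All 2^n binary input patterns and all 2^(2^n) truth tables as targets.
U = np.array([[(p >> i) & 1 for i in range(n)] for p in range(2 ** n)], float)
X = np.array([reservoir_state(u) for u in U])               # states, (8, N)
Y = np.array([[(f >> p) & 1 for p in range(2 ** n)]
              for f in range(n_funcs)], float)              # targets, (256, 8)

# Ridge regression: every readout is trained on the SAME reservoir states.
W_out = Y @ X @ np.linalg.inv(X.T @ X + 1e-6 * np.eye(N))
accuracy = np.mean(((W_out @ X.T) > 0.5) == Y)
```

Here each row of W_out acts as an independent readout, so the reservoir is "reused" across all 256 decision tasks without being retrained.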
Moreover, a weight quantisation strategy applied to the readout maps makes it possible to maintain high performance while saving computing resources in view of a hardware implementation. In fact, it is reasonable to imagine digital implementations of ESN-based systems required to classify some input data. However, the internal data representation arising from the structure of the network itself is commonly quite redundant and ill-posed (even numerically), due to the absence of general quantitative criteria for the selection of the network size. Moreover, it is often argued that, for many reasons, digital neural networks ought to be preferred for data classification (Kakkar, 0000). By introducing a proper dimensionality reduction algorithm, the obtained network presents lower complexity and can be implemented on digital devices through a quantisation process, leading to discrete integer weights. Our paper shows that if a nonlinear manifold reduction technique, namely Laplacian Eigenmaps (LEs) of the graph associated with the network state matrix, is applied, then quantisation requires fewer bits for representing all the trained weights than are needed for double-precision storage, allowing a substantially lower memory consumption. This is an interesting step for the overall system optimisation, which is mandatory in the case of large neural networks and whenever there are constraints on the design phase (for example in FPGA-based applications (Bonabi et al., 2014)). The use of manifold reduction has to be seen both as a mathematical tool for data processing and manipulation, conceived for lowering the number of degrees of freedom, and as a simpler representation of the original state space, aimed at mitigating the so-called curse of dimensionality (Bellman, 2003). This paper exploits this method with particular attention to the problem of manipulating and/or processing n-Boolean functions and binary classification, widely discussed in the literature (Cagnoni et al., 2005). The network realised in this paper has been trained over all the possible n-Boolean functions, supposing that each uniquely identified function corresponds to a specific action/decision/class. In Han and Xu (2018), the hyperparameters of LEs-based ESNs were selected empirically; in this paper, a constructive method for data-driven hyperparameter determination, targeted at a proper design of an LEs-based ESN, is introduced, and the application is focused on binary decision mappings. In Arena et al. (2019), the strategy was applied to a liquid state-like classifier to solve two traditional classification tasks. This paper generalises the strategy to the mapping of any arbitrary n-Boolean function, considered as the archetype of any decision-making task.
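To make the reduction and quantisation steps concrete, the following sketch outlines a basic Laplacian Eigenmaps embedding of a state matrix and a uniform weight quantiser. It is a minimal sketch under stated assumptions (heat-kernel affinities, a k-nearest-neighbour graph, uniform signed-integer quantisation); k, sigma, m and n_bits are placeholders, not the hyperparameters produced by the data-driven procedure described later in the paper:

```python
import numpy as np
from scipy.linalg import eigh
from scipy.spatial.distance import cdist

def laplacian_eigenmaps(X, m=4, k=5, sigma=1.0):
    """Basic LE embedding (Belkin & Niyogi, 2003): heat-kernel k-NN graph,
    unnormalised Laplacian, generalised eigenproblem L v = lambda D v."""
    D2 = cdist(X, X, "sqeuclidean")
    A = np.exp(-D2 / (2.0 * sigma ** 2))          # heat-kernel affinities
    far = np.argsort(D2, axis=1)[:, k + 1:]       # drop all but k neighbours
    for i in range(len(A)):
        A[i, far[i]] = 0.0
    A = np.maximum(A, A.T)                        # symmetrise the graph
    D = np.diag(A.sum(axis=1))
    L = D - A                                     # unnormalised Laplacian
    _, vecs = eigh(L, D)                          # ascending eigenvalues
    return vecs[:, 1:m + 1]                       # skip the trivial mode

def quantise(W, n_bits=8):
    """Uniform quantisation of trained weights to signed n_bits integers."""
    scale = np.abs(W).max() / (2 ** (n_bits - 1) - 1)
    return np.round(W / scale).astype(int), scale
```

Under this scheme, the integer weights plus one scale factor per readout are all that needs to be stored, which is where the saving with respect to double-precision storage comes from.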

From a biological perspective, these approaches refer to the so-called neural reuse theory (Anderson, 2010), which describes the cognitive processes involved when living beings learn new abilities (Anderson, 2016). RC and neural reuse paradigms have already been used jointly to represent neural dynamics in simple brains and to drive the concurrent emergence of multiple behaviours, from motor skill learning to sequence/sub-sequence learning (Patané et al., 2018). Neural reuse theory suggests that complex behaviours arising from brain processing emerge from the activation of smaller computational units that act modularly. Thereby, starting from a very complex nonlinear processing of the signals coming from the environment, even single "read-out" units can be responsible for choosing a complex behaviour or decision. From a computational perspective, the brain seems to adopt a top-down strategy for reducing the complexity of a task. Since reduction transforms a large network into a smaller one while essentially preserving its outputs, it can be argued that reduction applies the same strategy as neural reuse: essentially, reduction implies a reorganisation of the network. This further outlines the generality of the proposed approach, which can be successfully extended to other neural network topologies.

The paper is organised as follows: Section 2 reports the basic information about ESNs and the LEs algorithm for handling the linear separability problem over an n-dimensional binary hypercube. Section 3 reports all the simulation results, the details of the strategy for tuning the LEs algorithm hyperparameters and the noise rejection capability of the overall system. In Section 4, an application of the method is presented, referring to a typical example of a wall-following task in mobile robotics involving a decision-making strategy. Some remarks about the proposed method, the design choices and the obtained results are reported in Section 5. Finally, Section 6 concludes the paper.

Section snippets

Description of the Echo State Network model

ESNs are typically composed of three layers, the middle one constituting the main computational core and working as a reservoir of neurons connected to each other through random, sparse and even recurrent connections, although other configurations, like the one described in Malik et al. (2017), are allowed. Regardless of the employed structure, one of the fundamental characteristics of ESNs is the echo state property, which is a basic requirement for output-only training (i.e., an off-line approach…
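For reference, a standard formulation of the ESN dynamics (following, e.g., Lukoševičius, 2012; the notation below is the conventional one and not necessarily that of the paper) is:

$$\mathbf{x}(t+1)=\tanh\!\left(\mathbf{W}^{\mathrm{in}}\,\mathbf{u}(t+1)+\mathbf{W}\,\mathbf{x}(t)\right),\qquad \mathbf{y}(t)=\mathbf{W}^{\mathrm{out}}\,\mathbf{x}(t),$$

where $\mathbf{u}$, $\mathbf{x}$ and $\mathbf{y}$ are the input, reservoir state and output vectors, $\mathbf{W}^{\mathrm{in}}$ and $\mathbf{W}$ are fixed random matrices and only $\mathbf{W}^{\mathrm{out}}$ is trained; the echo state property is commonly enforced by scaling $\mathbf{W}$ so that its spectral radius satisfies $\rho(\mathbf{W})<1$.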

Simulations and results

This section reports both the description and the details of the simulations carried out using MATLAB® in order to test our framework. Starting from the practical suggestions reported in Lukoševičius (2012) for implementing and simulating an ESN, our simulation parameters are reported in Table 2 and are justified in the next paragraphs. First, a brief discussion about how to design the dataset and the problem of encoding the set of expected target signals is required; then the…
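Although the full protocol is elided in this snippet, the three perturbation forms studied in the paper (normal, uniform and quantisation noise, as listed in the abstract) admit a simple illustrative model; the function names and amplitudes below are placeholder assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

def add_normal_noise(W, sigma=0.01):
    return W + rng.normal(0.0, sigma, W.shape)        # Gaussian perturbation

def add_uniform_noise(W, amp=0.01):
    return W + rng.uniform(-amp, amp, W.shape)        # bounded perturbation

def quantisation_noise(W, n_bits=8):
    scale = np.abs(W).max() / (2 ** (n_bits - 1) - 1)
    return np.round(W / scale) * scale                # round-trip quantisation
```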

An application in mobile robotics: the wall-following navigation task

Dimensionality reduction and neural networks also work well with data drawn from real experiments. In this section, further tests have been performed on a common dataset available in the literature, in order to show how reliable the combined action of LEs and ESNs is on real-world experiments.

An interesting issue in mobile robotics concerns the development of strategies for allowing a robot to move safely within environments in the presence of obstacles. Generally speaking, these strategies ought…
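For readers who wish to reproduce this part, a hedged loading sketch for the UCI wall-following dataset (Freire et al.) follows; the file name and column layout are assumptions based on the public UCI release, in which each row stores the sensor readings followed by a class label:

```python
import numpy as np

def load_wall_following(path="sensor_readings_24.data"):
    """Load the UCI wall-following data: sensor features, integer labels."""
    raw = np.genfromtxt(path, delimiter=",", dtype=str)
    X = raw[:, :-1].astype(float)                     # ultrasound readings
    labels, y = np.unique(raw[:, -1], return_inverse=True)
    return X, y, labels                               # y indexes into labels
```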

Discussion

Generally, ESNs are characterised by a significant level of randomness, due to both internal and input-to-reservoir connections. As discussed in the previous paragraphs, this can be addressed by statistically evaluating the overall behaviour of the system. We have already motivated our choice of n=3: such Boolean functions are surely nontrivial, yet do not require excessive computational effort for the sake of demonstration. What really matters is how to choose…

Conclusion

In this paper, a framework for supervised classification of generic Boolean functions over n-dimensional hypercubes, employing Echo State Networks with Laplacian Eigenmaps for dimensionality reduction, has been developed and discussed. In particular, our application aims to be a simple tool for solving binary classification by adopting ESNs as the main computational core. The main goal, successfully reached in this work, was to demonstrate that a unique ESN reservoir layer is sufficient to…

CRediT authorship contribution statement

Paolo Arena: Conceptualization, Methodology, Supervision. Luca Patanè: Conceptualization, Methodology, Software, Writing - review & editing. Angelo Giuseppe Spinosa: Conceptualization, Methodology, Software, Data curation, Writing - original draft, Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References (59)

  • Anderson, M., Neural reuse: A fundamental organizational principle of the brain. Behav. Brain Sci. (2010).
  • Anderson, M., Neural reuse in the organization and development of the brain. Dev. Med. Child Neurol. (2016).
  • Anon, IEEE Standard for Floating-Point Arithmetic (IEEE Std 754-2008) (2008).
  • Belkin, M., et al., Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput. (2003).
  • Bellman, R., Dynamic Programming (2003).
  • Bonabi, S.Y., et al., FPGA implementation of a biological neural network based on the Hodgkin-Huxley neuron model. Front. Neurosci. (2014).
  • Breiman, L., et al., Classification and Regression Trees (1984).
  • Bunke, H., et al., Digital Document Processing: Major Directions and Recent Advances (2007).
  • Cagnoni, S., et al., Evolving binary classifiers through parallel computation of multiple fitness cases. IEEE Trans. Syst. Man Cybern. B (2005).
  • Cai, D., et al., Training linear discriminant analysis in linear time.
  • Chapelle, O., Training a support vector machine in the primal. Neural Comput. (2007).
  • Cheng, L., Cho, H., Yoon, P., 2014. GPU accelerated vessel segmentation using Laplacian eigenmaps. In: Proceedings of...
  • Coppersmith, D., et al., Partitioning nominal attributes in decision trees. Data Min. Knowl. Discov. (1999).
  • Cristianini, N., et al., An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods (2000).
  • Cui, L., et al., Multi-Modes Cascade SVMs: Fast Support Vector Machines in Distributed System (2017).
  • Donoho, D., et al., Hessian eigenmaps: Locally linear embedding techniques for high-dimensional data. Proc. Natl. Acad. Sci. (2003).
  • Fan, R., et al., Working set selection using second order information for training SVM. J. Mach. Learn. Res. (2005).
  • Fisher, R.A., The use of multiple measurements in taxonomic problems. Ann. Eugen. (1936).
  • Freire, A.L., et al., Short-term memory mechanisms in neural network learning of robot navigation tasks: A case study.
