An extension of the type-1 and singleton fuzzy logic system trained by scaled conjugate gradient methods for multiclass classification problems

doi:10.1016/j.neucom.2020.05.052

Neurocomputing

Volume 411, 21 October 2020, Pages 149-163

https://doi.org/10.1016/j.neucom.2020.05.052

Highlights

• This paper presents an extension of type-1 and singleton fuzzy logic systems that handles multiclass classification problems without resorting to a binary decomposition strategy.
• We provide the equations to compute the gradient of the proposed fuzzy system, facilitating the use of any first-order training method presented in the literature.
• We provide the equations to calculate exactly the product of the Hessian matrix and a directional vector, facilitating the use of second-order information methods and avoiding the costly computation of the full Hessian matrix (i.e., Hessian-free). We also offer a simple form to compute this product using only matrix operations.
• We present a comparison between the proposed approach and a model previously shown in the literature using the steepest descent, scaled conjugate gradient, and scaled conjugate gradient using the differential operator R.
• Based on datasets provided by the UCI Machine Learning Repository, we present several performance analyses in terms of accuracy, mean squared error, convergence speed, number of fuzzy rules, and well-established classification metrics.

Abstract

This paper proposes an extension of the type-1 and singleton fuzzy logic system for dealing with multiclass classification problems. The proposed extension enables a fuzzy classifier to generate more than one output, thereby avoiding the use of binary decomposition strategies when multiclass classification problems are considered. Additionally, with the goal of improving classifier performance, the scaled conjugate gradient training method was applied, as well as its modified version using the differential operator $R \{\cdot\}$. The effectiveness of the proposed extension was evaluated using data from the UCI Machine Learning Repository based on well-established classification metrics. The numerical results reveal a significant reduction in computational complexity when using the proposed extension compared to the traditional decomposition strategy, as well as improved convergence speed when using the scaled conjugate gradient training method.

Introduction

Multiclass classification problems (MCPs) represent a principal branch in the machine learning field and are formulated as the discrimination of patterns in more than two classes (discrimination between only two classes is referred to as binary classification). Such problems exist in numerous research fields, including the biomedical [1], surveillance and security [2], computer vision [3], [4], [5], aeronautical [6], and industrial [7] fields.

In general, models for solving MCPs can be divided into two categories: binary classifiers (BCs) (i.e., classifiers with a single output) and multiclass classifiers (i.e., classifiers with multiple outputs). A BC can solve MCPs using various strategies. The simplest and most commonly applied strategies are called one-versus-one (OvO) [8] and one-versus-all (OvA) [9]. The OvO strategy forms pairs of classes by decomposing the original set of classes into several binary subsets of classes, requiring the use of $ϒ (ϒ - 1) / 2$ classifiers, where $ϒ \in \mathbb{N}^{*}$ is the number of classes in the problem. The OvO strategy utilizes a decision stage that predicts classes using a voting scheme to incorporate the results of all classifiers, where the class with the highest number of votes is chosen. In the OvA strategy, a classification problem is decomposed into $ϒ$ binary problems, where each problem is formulated to distinguish one class from all other classes. Several studies have shown that the OvO strategy is superior to the OvA strategy [10], [11]. However, this advantage is attenuated to some extent because the computational cost of OvO grows with the number of classes.
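The classifier counts and the OvO voting stage described above can be illustrated with a minimal Python sketch. The function names and the pairwise outcomes below are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def num_ovo_classifiers(n_classes: int) -> int:
    """Number of binary classifiers required by one-versus-one."""
    return n_classes * (n_classes - 1) // 2


def num_ova_classifiers(n_classes: int) -> int:
    """Number of binary classifiers required by one-versus-all."""
    return n_classes


def ovo_vote(pairwise_winners: dict) -> int:
    """Pick the class that wins the most pairwise comparisons."""
    votes = Counter(pairwise_winners.values())
    return votes.most_common(1)[0][0]


n_classes = 4
pairs = list(combinations(range(n_classes), 2))
assert len(pairs) == num_ovo_classifiers(n_classes)  # one classifier per pair

# Hypothetical outcomes: each pairwise classifier reports the class it prefers.
winners = {(0, 1): 1, (0, 2): 2, (0, 3): 0, (1, 2): 1, (1, 3): 1, (2, 3): 2}
predicted = ovo_vote(winners)
print(num_ovo_classifiers(n_classes), num_ova_classifiers(n_classes), predicted)
```

For four classes, OvO needs six binary classifiers and OvA needs four, whereas a classifier with multiple outputs needs only one model.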

Regarding MCPs solved by fuzzy logic systems (FLSs) belonging to the first group of models, it is necessary to implement decomposition strategies because FLSs generate only a single output. When using such strategies, researchers have largely attempted to improve the performance of FLSs by focusing on improved training methods. The authors of [12] proposed an adaptation of the inference method in a fuzzy association rule-based classification model with the goal of producing more accurate aggregations to improve the classification performance using the OvO and OvA strategies. The authors of [13] implemented an FLS-based classification technique with a binary decomposition strategy (i.e., OvO and OvA) to handle MCPs. They attempted to identify optimal methods for the decision stage of classification. Additionally, the authors of [14] improved classification performance for MCPs by using the conjugate gradient (CG) training method, which incorporates second-order information to train an FLS. However, these methods require large numbers of fuzzy rules to implement binary decomposition strategies. To avoid this issue, it is necessary to design an FLS with multiple outputs.

In an attempt to overcome the limitation of FLSs for handling MCPs (i.e., the fuzzy classifier has one output and needs to implement a binary decomposition strategy), we propose an extension of type-1 and singleton FLSs (T1-FLSs) called T1-FLS with multiple outputs (T1-FLSMO). The most important aspect of T1-FLSMO is that it can handle MCPs using only a single classifier. Consequently, it can significantly reduce the computational complexity of FLS-based classifiers compared to the use of OvO or OvA strategies. Furthermore, T1-FLSMO was developed based on the FLS presented in [15], where we introduced the scaled CG (SCG) and SCG using the differential operator $R \{\cdot\}$ (SCGR) for training the FLS, both of which use second-order information without computing a Hessian matrix. Therefore, the equations for computing the exact value of the product of the Hessian matrix and a directional vector ($Hv$) using the differential operator $R \{\cdot\}$ are derived for T1-FLSMO to facilitate use of the SCGR training method.
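The difference between approximating $Hv$ and computing it exactly can be sketched on a toy cost function. The example below is not the T1-FLSMO derivation: the cost $f(w) = \sum_i w_i^4 / 4 + (\sum_i w_i)^2 / 2$, the step size `sigma`, and all function names are assumptions made purely for illustration. The approximation uses a one-sided difference of two gradients, in the spirit of SCG [Møller, 1993], and the exact version applies $R\{g\} = \partial g(w + r v) / \partial r \,|_{r=0}$ term by term to the gradient, so the full Hessian is never formed.

```python
def grad(w):
    """Gradient of the toy cost f(w) = sum(w_i**4)/4 + (sum(w_i))**2 / 2."""
    total = sum(w)
    return [wi ** 3 + total for wi in w]


def hv_finite_difference(w, v, sigma=1e-4):
    """Approximate Hv from two gradient evaluations (one-sided difference)."""
    shifted = [wi + sigma * vi for wi, vi in zip(w, v)]
    g0, g1 = grad(w), grad(shifted)
    return [(b - a) / sigma for a, b in zip(g0, g1)]


def hv_r_operator(w, v):
    """Exact Hv: apply R{.} to each term of the gradient, using R{w} = v."""
    r_total = sum(v)                      # R{sum(w)} = sum(v)
    return [3.0 * wi ** 2 * vi + r_total  # R{w_i**3} = 3 * w_i**2 * v_i
            for wi, vi in zip(w, v)]


w = [0.5, -1.0, 2.0]
v = [1.0, 0.0, -1.0]
approx = hv_finite_difference(w, v)
exact = hv_r_operator(w, v)
max_gap = max(abs(a - e) for a, e in zip(approx, exact))
print(exact, max_gap)
```

The finite-difference result depends on the choice of `sigma`, while the $R\{\cdot\}$ result does not; both cost roughly the same as a gradient evaluation.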

The main contributions of this work can be summarized as follows:

• An extension of T1-FLS to facilitate multiple outputs is proposed, enabling it to handle multiclass classification problems and avoid binary decomposition strategies, reducing computational complexity;
• The deduction of equations to calculate the gradient for T1-FLSMO is presented, facilitating the use of first-order information training methods available in the literature [14], [15];
• The deduction of the equations to compute the exact value of $Hv$ using the differential operator $R \{\cdot\}$ for T1-FLSMO is presented, facilitating the use of second-order information training methods;
• A comparison of T1-FLS and T1-FLSMO using the steepest descent (SD) and SCG training methods is presented. Additionally, the SCG and SCGR training methods are implemented for T1-FLSMO, reducing the dependence on heuristic parameter choices present in training methods [15], thereby increasing the performance of the proposed classifier when a limited number of epochs is considered. Moreover, the SCG methods avoid the computation of a full Hessian matrix, which reduces computational complexity when second-order information training methods are adopted;
• Performance analyses in terms of accuracy, mean squared error (MSE), convergence speed, number of fuzzy rules, and well-established classification metrics are presented based on datasets provided by the UCI Machine Learning Repository [16]. We also present comparative results for the proposed extension and T1-FLS using the OvA decomposition strategy;
• A simple form to compute $Hv$ using only matrix operations is presented to clarify the implementation of the SCGR training method for T1-FLSMO or T1-FLS.

The remainder of this paper is organized as follows. Section 2 presents the problem formulation. Section 3 addresses T1-FLSMO, as well as the SCG and SCGR training methods. Section 4 focuses on the analyses of experimental results. Section 5 states our main conclusions regarding the proposed extension; Appendix A outlines the deduction of $Hv$ using the differential operator $R \{\cdot\}$ and Appendix B presents the computation of $Hv$ using the differential operator $R \{\cdot\}$ based solely on matrix operations for T1-FLSMO.

Section snippets

Problem statement

A classification problem can be formulated as a mapping between a vector $x \in \mathbb{R}^{P \times 1}$ and a label, where $P \in \mathbb{N}^{*}$ is the number of features. The input space is divided into “decision regions”, whose boundaries are called “decision boundaries” or “decision surfaces” [17]. In this context, each decision region is assigned to one class. An FLS classifier maps an input vector to decision regions by using IF-THEN rules. When adopting a T1-FLS BC for handling classification problem proposed in [18] and

The proposed type-1 singleton fuzzy logic system multiple output

The proposed extension of T1-FLS is called T1-FLSMO and its inference block is presented in Fig. 1.

The multiple outputs of T1-FLSMO are related to the defuzzifier block, as shown in Fig. 1. One can increase the number of outputs of T1-FLS using height defuzzification, which replaces each rule's output fuzzy set with a singleton ($θ_{l}$) at the point having the maximum membership value in that output set. Next, by calculating the centroid of the T1 set comprised of these singletons, one can create a
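Because this snippet is cut off before the output equation, the following Python sketch only illustrates one plausible reading of height defuzzification with several outputs: each rule $l$ carries one singleton per output $k$, and output $k$ is the firing-strength-weighted average $y_k = \sum_l θ_{l,k} f_l(x) / \sum_l f_l(x)$. The Gaussian membership functions, the product inference, the per-output singleton layout `theta[l][k]`, and all names and values are assumptions, not the paper's stated formulation.

```python
import math


def firing_strengths(x, means, sigmas):
    """Product of Gaussian memberships per rule (assumed membership shape)."""
    strengths = []
    for m_l, s_l in zip(means, sigmas):
        f = 1.0
        for xp, m, s in zip(x, m_l, s_l):
            f *= math.exp(-0.5 * ((xp - m) / s) ** 2)
        strengths.append(f)
    return strengths


def flsmo_outputs(x, means, sigmas, theta):
    """Height defuzzification with one singleton theta[l][k] per rule l and output k."""
    f = firing_strengths(x, means, sigmas)
    denom = sum(f)
    n_outputs = len(theta[0])
    return [sum(f_l * th_l[k] for f_l, th_l in zip(f, theta)) / denom
            for k in range(n_outputs)]


# Two features, three rules, three classes (all values are made up for illustration).
means = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
sigmas = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
theta = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

y = flsmo_outputs([0.9, 1.1], means, sigmas, theta)
predicted_class = max(range(len(y)), key=lambda k: y[k])
print(y, predicted_class)
```

Under these assumptions, a single forward pass yields one score per class, and the predicted class is the index of the largest output, so no OvO or OvA decomposition is needed.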

Experimental results

The performance analyses presented in this section are based on datasets provided by the UCI Machine Learning Repository [16]. Information regarding these datasets is presented in Table 1. For the sake of comparison, we implemented the T1-FLS using the OvA decomposition strategy and investigated training using the SD (SD T1-FLSOvA) and SCG (SCG T1-FLSOvA) methods based on the algorithm presented in Fig. 2. A T1-FLS trained using the SCGR method was not implemented in this study because the results

Conclusion

In this paper, we introduced the T1-FLSMO model, which facilitates use of the T1-FLS for handling MCPs without any decomposition strategy. To improve T1-FLSMO performance, we implemented the SCG and SCGR training methods, which respectively approximate and compute exactly the product of the Hessian matrix and a directional vector (i.e., $Hv$). Furthermore, deductions of the equations for computing $Hv$ exactly using the differential operator $R \{\cdot\}$ were provided.

Numerical results demonstrated that T1-FLSMO

CRediT authorship contribution statement

Renan P. Finotti Amaral: Conceptualization, Methodology, Software, Validation, Formal analysis, Writing - original draft. Ivan F.M. Menezes: Resources, Writing - review & editing, Supervision. Moisés V. Ribeiro: Validation, Formal analysis, Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This work was partially supported by the National Council for Scientific and Technological Development (CNPq – Conselho Nacional de Desenvolvimento Científico e Tecnológico – Brazil). It was also financed by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior – Brasil (CAPES) – Finance Code 001. The authors acknowledge the support provided by the Tecgraf Institute of Technical-Scientific Software Development of PUC-Rio (Tecgraf/PUC-Rio), Brazil. Any opinions, findings, conclusions, or

Renan P. Finotti Amaral received the B.Sc. degree in Mechanical Engineering from the Federal University of Juiz de Fora (UFJF), Brazil, in 2015 and the M.Sc. degree in Electrical Engineering from UFJF in 2017. He is a PhD student in Mechanical Engineering at the Pontifical Catholic University of Rio de Janeiro (PUC-Rio), with a scholarship granted by CNPq (National Council for Scientific and Technological Development). His research interests include Thermosciences,

References (27)

A. Fernández et al., Solving multi-class problems with linguistic fuzzy rule based classification systems based on pairwise learning and preference relations, Fuzzy Sets Syst. (2010)
R.P.F. Amaral et al., Type-1 and singleton fuzzy logic system trained by a fast scaled conjugate gradient methods for dealing with binary classification problems, Neurocomputing (2019)
M.F. Møller, A scaled conjugate gradient algorithm for fast supervised learning, Neural Netw. (1993)
P. Levinger et al., The application of multiclass svm to the detection of knee pathologies using kinetic data: a preliminary study
R.V. Sharan et al., Comparison of multiclass svm classification techniques in an audio surveillance application under mismatched conditions
Y. Lin, F. Lv, S. Zhu, M. Yang, T. Cour, K. Yu, L. Cao, T. Huang, Large-scale image classification: fast feature...
B. Zhao, F. Li, E. P. Xing, Large-scale category structure aware image categorization, in: Advances in Neural...
Z. Akata et al., Good practice in large-scale learning for image classification, IEEE Trans. Pattern Anal. Mach. Intell. (2014)
P.H.S. Calderano, M.G.C. Ribeiro, R.P.F. Amaral, M.M.B.R. Vellasco, R. Tanscheit, E.P. de Aguiar, An enhanced aircraft...
R.A. Campos, R.P.F. Amaral, N. Soares, L.G. da Fonseca, M.L. Lagares Júnior, E.P. de Aguiar, A new model to distinguish...
S. Knerr, L. Personnaz, G. Dreyfus, Single-layer learning revisited: a stepwise procedure for building and training a...
R. Anand et al., Efficient classification for multiclass problems using modular neural networks, IEEE Trans. Neural Netw. (1995)
E.L. Allwein et al., Reducing multiclass to binary: a unifying approach for margin classifiers, J. Mach. Learn. Res. (2000)

Ivan F. M. Menezes received the B.Sc. degree in Civil Engineering from Federal University of Pernambuco (UFPE), Brazil, in 1986, M.Sc. in Civil Engineering from Pontifical Catholic University of Rio de Janeiro (PUC-Rio), Brazil, in 1990, and D.Sc. in Civil Engineering from Pontifical Catholic University of Rio de Janeiro (PUC-Rio), Brazil, in 1995. He is an Assistant Professor in the Mechanical Engineering Department at Pontifical Catholic University of Rio de Janeiro (PUC Rio). His research interests include Computational Mechanics, Structural Optimization, and Numerical Analysis.

Moisés V. Ribeiro received the B.Sc. degree in Electrical Engineering from the Federal University of Juiz de Fora (UFJF), MG, Brazil, and M.Sc. and D.Sc. degrees in Electrical Engineering from the University of Campinas, SP, Brazil, in 1999, 2001 and 2005, respectively. He was a Visiting Scholar at University of California in Santa Barbara, CA, USA, in 2004, Visiting Professor (2005-2007) and Assistant Professor (2007-2015) at UFJF. Since 2015, he has been an Associate Professor at UFJF. He co-founded Smarti9 LTD. and Wari LTD. in 2012 and 2015, respectively. His research interests include signal processing, digital communication, power line communication, smart grids, internet of things and smart city. In these fields, he has authored over 172 peer reviewed papers, 9 book chapters, and filed 13 patents. He was the General Chair of the 2010 IEEE ISPLC, 2013 IWSGC, SBrT 2015, and a Guest Co-Editor for Special Issues in the EURASIP Journal on Advances in Signal Processing and EURASIP Journal of Electrical and Computer Engineering. He has served as the Secretary of the IEEE ComSoc TC-PLC. He was the recipient of Fulbright Visiting Professorship at Stanford University, Stanford, CA, USA, in 2011, and at Princeton University, Princeton, NJ, USA, in 2012. He was awarded Student Awards from 2001 IEEE IECON and 2003 IEEE ISIE.
