Introduction

Spectral clustering has become one of the most popular research fields in the last decades, and it has shown impressive results in practical applications [19, 20, 28]. The main idea of spectral clustering is to cluster the data into different groups by using the spectrum of a similarity matrix that captures the structure of the data [33]. Hence, the first step is to construct a similarity matrix, which plays an important role in the clustering performance. Therefore, in this paper, we introduce a multiobjective evolutionary algorithm into spectral clustering to construct the similarity matrix, and propose a Pareto ensemble-based spectral clustering framework (PESC).

Conventional similarity matrix construction

In recent years, many methods, including traditional methods [44] and other methods [2, 20, 29, 48], have been proposed to build an appropriate similarity matrix. The similarity matrix aims to model the relationships among the samples of a given dataset. Consider a dataset \(\varvec{A}=\{\varvec{a}_1,..., \varvec{a}_i,..., \varvec{a}_N\}\) with N samples, and let \(\varvec{S}\in {\mathbb {R}}^{N\times N}\) be the similarity matrix, where \(s_{ij}\) measures the similarity between samples \(\varvec{a}_i\) and \(\varvec{a}_j\). Taking the neighborhood of a sample \(\varvec{a}_i\) in \(\varvec{S}\) into consideration, the traditional methods can be summarized as follows [20]:

  1.

    Fully connected similarity matrix [12, 44]. This construction method treats every sample as a neighbor of all the other samples, and a nonzero similarity value is assigned to each \(s_{ij}\), resulting in a full (dense) matrix.

  2.

    k-nearest neighbor (kNN) similarity matrix [4, 21]. For a sample \(\varvec{a}_i\), all the similarity entries \(s_{ij}\quad (j=1,...,N)\) are set to 0 except those of its k nearest neighbors. However, the similarity matrix constructed in this way is not symmetric, and there are two strategies to resolve this. The first one is to set \(s_{ij}=s_{ji}\ne 0\) when sample \(\varvec{a}_i\) belongs to sample \(\varvec{a}_j\)’s k-nearest neighbors or sample \(\varvec{a}_j\) belongs to sample \(\varvec{a}_i\)’s k-nearest neighbors. The other is the mutual k-nearest neighbor construction method, which sets \(s_{ij}=s_{ji}\ne 0\) only when samples \(\varvec{a}_i\) and \(\varvec{a}_j\) both belong to each other’s k-nearest neighbors.

  3.

    \(\epsilon \)-neighborhood similarity matrix [12, 44]. If the distance between samples \(\varvec{a}_i\) and \(\varvec{a}_j\) is less than a threshold \(\epsilon \), then \(s_{ij}=1\); otherwise \(s_{ij}=0\).

In these methods, the Gaussian kernel function \(K(\varvec{a}_i, \varvec{a}_j)=\mathrm{exp}(\frac{-\mathrm{dist}_{ij}^2}{2\sigma ^2})\) is a typical measurement used to calculate the value of a nonzero similarity entry \(s_{ij}\). In this kernel function, \(\mathrm{dist}_{ij}\) represents the distance between samples \(\varvec{a}_i\) and \(\varvec{a}_j\) (usually the Euclidean distance is adopted), and \(\sigma \) is the scale factor that controls the width of the neighborhood, which plays an important part in inducing a meaningful neighborhood structure. Both the k-nearest neighbor and the \(\epsilon \)-neighborhood similarity matrices are sparse, so they place strict demands on the values of the parameters k and \(\epsilon \), which are difficult to determine. In recent years, some approaches have been proposed that learn a similarity graph with exactly K (the number of clusters) connected components for graph-based clustering, such as [30, 31]. To obtain this K-connected similarity graph, [31] proposed constructing an initial graph with k neighbors and calculating the values of the nonzero entries with a parameter-free approach; however, the value of k still needs to be set in that work. In both [30, 31], a common issue is that they must solve an optimization problem of the form \(\min _{\varvec{x}} f(\varvec{x})+\gamma g(\varvec{x})\), where \(\gamma \) is a parameter that is sensitive to the dataset.
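
To make the conventional constructions above concrete, the following minimal sketch builds a kNN similarity matrix with Gaussian-kernel weights, using the first symmetrization strategy. It is an illustration only; the function name and the dense distance computation are our own assumptions, not part of any cited method.

```python
import numpy as np

def knn_gaussian_similarity(A, k, sigma):
    """Illustrative kNN similarity matrix with Gaussian-kernel weights."""
    N = A.shape[0]
    # Pairwise Euclidean distances dist_ij
    dist = np.linalg.norm(A[:, None, :] - A[None, :, :], axis=2)
    S = np.zeros((N, N))
    for i in range(N):
        # Keep the k nearest neighbors of sample a_i (excluding a_i itself)
        neighbors = [j for j in np.argsort(dist[i]) if j != i][:k]
        S[i, neighbors] = np.exp(-dist[i, neighbors] ** 2 / (2 * sigma ** 2))
    # First symmetrization strategy: s_ij = s_ji != 0 if either is a kNN relation
    return np.maximum(S, S.T)
```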

The multiobjective evolutionary algorithm based clustering

In recent years, multiobjective evolutionary algorithms (MOEAs) have attracted a lot of attention from researchers for their wide range of real-world applications [5, 7]. Many state-of-the-art MOEAs, such as NSGA-II [8], SPEA2 [53], IMOEA [41], and MOEA/D [15, 51], have been proposed to handle multiobjective optimization problems (MOPs). They have also been used successfully in the field of data mining, for example in clustering [10, 18], classification [11, 25], and feature selection [27, 45]. MOEA-based clustering algorithms focus on using multiple criteria, such as cluster validity indexes, to capture the characteristics of the data [16, 26]. They differ in several aspects, including the type of MOEA, the encoding scheme, the objective functions, the final Pareto optimal solution selection strategy, and even the evolutionary operators. A detailed survey on this issue is given in the literature [26]. To better understand the proposed framework PESC, we present some related work on the encoding schemes, the objective functions, and the final Pareto optimal solution selection strategies.

Considering the adopted encoding schemes, an experimental evaluation of cluster representations (prototype-based, label-based, and graph-based) for multiobjective evolutionary clustering has been conducted in [9]. The advantage of the prototype-based representation is that the encoding length is small, so it takes less time to apply the evolutionary operators; however, this representation is mainly suited to round-shaped clusters. On the contrary, the label-based and graph-based representations impose no restriction on the underlying cluster structure, but their encoding length is large, especially for large-scale datasets.

Cluster validity indexes are usually adopted as the objective functions in MOEAs. Indexes such as overall cluster deviation [14], cluster connectedness [14], Jm [47], XB [50], the silhouette index [37], the intracluster entropy H [35, 36], and cluster separation Sep(C) [35, 36] are the most commonly used objective functions. It is well known that the quality of the clustering strongly depends on the adopted objective functions, but to date there is no systematic analysis of the effect of different objective functions.

The final Pareto optimal solution selection strategies can be classified into three categories: the independent objective-based approach, the knee-based approach, and the ensemble-based approach. In the first approach, another cluster validity index, different from the objective functions, is used to measure the quality of the nondominated solutions, and the one with the best quality on this index is selected as the final solution; such approaches can be found in [22, 23]. The knee-based approach aims to select, from the Pareto front (PF), the solution for which a change in one objective value induces the maximum change in the other one, as in MOCK [13, 14] and StEMO [17].

In this paper, we focus on the ensemble-based final Pareto optimal solution selection strategy. Ensemble learning has proved to be effective in solving machine learning problems, especially in practical applications such as SAR image segmentation [52]. As for ensemble-based clustering, there is no explicit correspondence between the labels delivered by different clusterings, and the difficulty lies in finding a consensus partition from multiple algorithms or partitions [42]. In [40], three cluster ensemble algorithms, namely the cluster-based similarity partitioning algorithm (CSPA), the hypergraph partitioning algorithm (HGPA), and the meta-clustering algorithm (MCLA), were introduced to complete the ensemble task. Since an MOEA can generate a set of nondominated solutions, it provides a suitable basis for ensemble learning. The ensemble-based approach assumes that all the nondominated solutions contain some useful information for detecting the structure of the dataset; therefore, all these solutions should be integrated to obtain a single clustering output. The three cluster-ensemble approaches CSPA, HGPA, and MCLA can easily be applied to MOEA clustering problems. A few researchers have also proposed MOEA-based cluster-ensemble algorithms. In [24], a multiobjective genetic algorithm-based approach for fuzzy clustering of categorical data, which simultaneously optimizes the fuzzy compactness and the fuzzy separation of the clusters (\(MOGA(\pi ,sep)\)), is proposed; to obtain the final clustering result, a majority voting strategy is applied to the nondominated solutions to select the training samples. In practical applications, the combination of the knee-based and ensemble-based approaches for recurrent neural networks has been used successfully in the prediction of computational fluid dynamic simulations [39] and in image identification [1]. In these studies, not all the individuals in the Pareto set are considered suitable solutions; only the Pareto-optimal solutions around the knee point are employed for the ensemble task. Besides that, MOEA has also been applied successfully to neural network ensembles in [3] by simply combining all the classifiers obtained from the MOEA to form the ensemble. In general, all these works have shown that ensemble-based methods perform better than simply selecting one Pareto solution.

Motivation and contributions

As far as we know, the existing MOEA-based clustering algorithms are barely related to spectral clustering, except for CSPA. In our previous work [20], we proposed a sparse representation based spectral clustering framework via MOEAs (denoted as SRMOSC); that work introduces multiobjective optimization into spectral clustering and constructs the similarity matrix using a sparse representation approach by modeling spectral clustering as a constrained multiobjective optimization problem. Unfortunately, as mentioned in that work, its space complexity is high, especially when solving large-scale problems. In addition, as discussed in the sections above, the conventional similarity matrix construction methods usually suffer from the difficulty of parameter tuning. Motivated by these two aspects, we tackle both problems in this paper.

It is generally known that an MOEA can generate a set of nondominated solutions for a multiobjective optimization problem (MOP). Hence, if the similarity matrix construction problem can be regarded as an MOP and all the nondominated solutions can participate in the construction of the similarity matrix, then this problem can be solved in a parallel way, which is more time-efficient than a sequential approach. Inspired by this idea, we propose a Pareto ensemble based spectral clustering framework whose main procedure falls into two phases. The main contributions are summarized as follows:

  1.

    A “divide and conquer” strategy is proposed to solve the similarity matrix construction problem. In PESC, the main procedure is divided into two phases: phase I detects the nonzero entries of the similarity matrix using an MOEA, and phase II determines the values of these nonzero entries via ensemble strategies.

  2.

    In phase I, we introduce dynamic multiobjective optimization into spectral clustering by adopting a similarity measurement and a specifically designed diversity measurement as objective functions. In addition, a specific initialization scheme is designed to speed up convergence in this phase.

  3.

    In phase II, three ensemble approaches are proposed to construct the similarity matrix and determine the values of its nonzero entries.

In contrast to the existing MOEA based spectral clustering algorithm, PESC strikes a balance between the time cost and the clustering accuracy with the “divide and conquer” strategy. Compared with the conventional similarity matrix construction methods, PESC can automatically determine the neighborhood structure of the similarity matrix. It should be noted that, unlike other Pareto-ensemble algorithms, in this paper a single estimated Pareto-optimal solution cannot construct a similarity matrix by itself. Moreover, using the Pareto-ensemble based framework to construct the similarity matrix reduces both the time and the space complexity.

To give a clear introduction and analysis of PESC, the paper is structured as follows. The description of PESC, including its two main phases, is given in “The description of PESC”. The experimental results and analysis are then presented in “Experiments and discussion”. Finally, concluding remarks on PESC are given in “Conclusion”.

The description of PESC

In spectral clustering, the key point is to construct a similarity matrix. Assume \(\varvec{A}=\{\varvec{a}_1, \varvec{a}_2,...,\varvec{a}_N\} \) is the given dataset with N samples and K clusters, and \(\varvec{S} = \begin{pmatrix} s_{11} & \cdots & s_{1N}\\ \vdots & \ddots & \vdots \\ s_{N1} & \cdots & s_{NN} \end{pmatrix}\) is the similarity matrix to be constructed; then the main procedure of spectral clustering is given in Algorithm 1.

[Algorithm 1: The main procedure of spectral clustering]
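
Since Algorithm 1 follows the standard spectral clustering pipeline, a minimal sketch of it is given below. The use of the unnormalized Laplacian \(\varvec{L}=\varvec{D}-\varvec{S}\) and of k-means on the eigenvectors follows the description later in this paper (Eq. (7) and Step 4); the concrete library calls are illustrative assumptions rather than a transcription of Algorithm 1.

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.cluster import KMeans

def spectral_clustering(S, K):
    """Minimal sketch of the spectral clustering pipeline (cf. Algorithm 1)."""
    S = np.maximum(S, S.T)              # enforce symmetry, as in Eq. (7)
    D = np.diag(S.sum(axis=1))          # degree matrix
    L = D - S                           # unnormalized graph Laplacian
    # Eigenvectors associated with the K smallest eigenvalues of L
    _, U = eigh(L, subset_by_index=[0, K - 1])
    # Cluster the rows of the spectral embedding with k-means
    return KMeans(n_clusters=K, n_init=10).fit_predict(U)
```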

PESC aims to construct the similarity matrix \(\varvec{S}\) with a Pareto ensemble approach, and we give a detailed description of it in this section. The main procedure (see Algorithm 2) is divided into two phases: phase I, the nonzero entries determination phase (Steps 1–2), and phase II, the Pareto ensemble based weight determination phase (Step 3). Hence, this section is divided into two subsections, one for each phase.

[Algorithm 2: The main procedure of PESC]

Phase I: Nonzero entries determination

Phase I consists of the initialization and the cycling loop of PESC. We describe this phase in the following subsections, covering the mathematical description of PESC and the evolutionary operators.

Mathematical description of PESC

The proposed algorithm PESC takes advantage of the ability of an MOEA to generate a set of solutions in order to find the possible nonzero entries of the similarity matrix. Assuming that each individual can identify one possible nonzero entry for each sample, the objective functions are formulated as in Eqs. (1) and (2):

$$\begin{aligned}&\min \;\left\{ {\begin{array}{*{20}{l}} {{f_1(\varvec{X})} = \frac{1}{N}\sum \limits _{i = 1}^N {\mathrm{sim}({\varvec{a}_i},{\varvec{a}_{{x_i}}})} }\\ {{f_2(\varvec{X})} = 1-\mathrm{DIV}(\varvec{X}) } \end{array}} \right. \nonumber \\&\mathrm{s.t.} \quad \varvec{X} = \{ {x_1},...,{x_i},...,{x_N}\},\nonumber \\&\qquad \; \, {x_i} \ne i,\nonumber \\&\qquad \;\, {x_i} \in \{ 1,2,...,N\}. \end{aligned}$$
(1)
$$\begin{aligned}&\mathrm{DIV}({\varvec{X}_i}) = \frac{1}{M}\sum \limits _{m = 1}^{M} {\mathrm{div}({\varvec{X}_i}} ,{\varvec{X}_m})\nonumber \\&\mathrm{div}(\varvec{X}_i, \varvec{X}_m) = \frac{1}{N}\sum \limits _{j = 1}^N {\mu \left( {x_j^i},{x_j^m}\right) } \nonumber \\&\mathrm{where} \quad \mu (c,d) = \left\{ {\begin{array}{*{20}{c}} {1,}&{}{c \ne d}\\ {0,}&{}{c = d} \end{array}} \right. \end{aligned}$$
(2)

where \(\varvec{X}=\{x_1,...,x_i,...,x_N \}\) is the decision vector to be optimized and M is the number of solutions in the current population. For any individual \(\varvec{X}=\{x_1,...,x_i,...,x_N \}\), the i-th gene \(x_i=t\) indicates that sample \(\varvec{a}_t\) connects with the i-th sample \(\varvec{a}_i\), and we denote it as sample \(\varvec{a}_{x_i}\) in the following parts. The objective function \(f_1(\varvec{X})\) measures the average similarity between samples \(\varvec{a}_i\) and \(\varvec{a}_{x_i}\), where \(i=\{1,2,...,N\}\), and \(\mathrm{sim}(\varvec{a}, \varvec{b})\) represents the similarity between samples \(\varvec{a}\) and \(\varvec{b}\). Note that different similarity measurements can be adopted in real applications; we use the Euclidean distance in this paper. \(\mathrm{DIV}(\varvec{X})\) measures the diversity of the solution \(\varvec{X}\), and \(f_2(\varvec{X})\) is a dynamic objective function since its value changes with the generation. Note that the constraint \(x_i\ne i\) must be satisfied, meaning that only the similarity between different samples is measured.
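
To make Eqs. (1) and (2) concrete, the sketch below evaluates \(f_1\) and \(f_2\) for a decision vector within a population. The use of the Euclidean distance as \(\mathrm{sim}(\cdot ,\cdot )\) follows the paper; the function and variable names are illustrative assumptions.

```python
import numpy as np

def f1(X, A):
    """f1: average Euclidean distance between each sample a_i and its chosen neighbor a_{x_i}."""
    return np.mean([np.linalg.norm(A[i] - A[X[i]]) for i in range(A.shape[0])])

def div(Xi, Xm):
    """div(X_i, X_m): fraction of genes on which the two decision vectors differ."""
    return np.mean(np.asarray(Xi) != np.asarray(Xm))

def f2(Xi, population):
    """f2 = 1 - DIV(X_i), where DIV(X_i) averages div(X_i, X_m) over the current population."""
    DIV = np.mean([div(Xi, Xm) for Xm in population])
    return 1.0 - DIV
```

Note that \(\mathrm{DIV}(\varvec{X}_i)\) is averaged over the whole current population, including \(\varvec{X}_i\) itself (which contributes \(\mathrm{div}=0\)), consistent with the worked example given in phase II.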

A variety of studies have focused on the diversity of MOEAs [6, 46], and it has been pointed out in many of them that the decision space should be taken into consideration [43] when estimating the diversity of solutions. Decision space-based diversity estimation usually significantly outperforms objective space-based strategies in solving multiobjective optimization problems, as in [34, 38, 49]. In this paper, the diversity measurement is adopted not only in the objective functions, but also in the diversity preservation strategy. According to formula (2), \(\mathrm{DIV}(\varvec{X}_i)\) measures the diversity of the solution \(\varvec{X}_i\) in the decision space, taking into account the characteristics of the similarity matrix construction problem. Theoretically, \(\mathrm{DIV}(\varvec{X}_i)\in [0,1]\). For a solution \(\varvec{X}_i\), the more it differs from the other solutions, the higher its diversity value under this estimation.

Furthermore, unsupervised clustering without any guidance is usually unreliable [32], so it is necessary to extend the above unsupervised clustering model to a semi-supervised learning model. Suppose samples \(\varvec{a}_i\) and \(\varvec{a}_j\) are labeled samples belonging to different clusters, \(\varvec{a}_i\in \varvec{C}_l\), \(\varvec{a}_j\in \varvec{C}_m\), \(l\ne m\). The semi-supervised spectral clustering can then be modeled as Eq. (3) by adding the constraints induced by these labeled samples:

$$\begin{aligned} \begin{aligned} \min \;&\left\{ {\begin{array}{*{20}{l}} {{f_1(\varvec{X})} = \frac{1}{N}\sum \limits _{i = 1}^N {\mathrm{sim}({\varvec{a}_i},{\varvec{a}_{{x_i}}})} }\\ {{f_2(\varvec{X})} = 1-\mathrm{DIV}(\varvec{X}) } \end{array}} \right. \\ \mathrm{s.t.} \quad&\varvec{X} = \{ {x_1},...,{x_i},...,{x_N}\},\\&{x_i} \ne i,\\&{x_i} \ne j,\\&{x_j} \ne i,\\&{x_i} \in \{ 1,2,...,N\}.\\ \end{aligned} \end{aligned}$$
(3)

where the constraints \(x_i\ne j\), \(x_j\ne i\) guarantee that samples with different labels do not connect with each other.
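
One simple way to keep individuals feasible with respect to Eq. (3) is a repair step applied after crossover and mutation. The repair-by-resampling strategy below is only an illustrative assumption; the paper does not prescribe how the constraints are enforced.

```python
import numpy as np

def repair(X, cannot_link, N, rng):
    """Enforce x_i != i and the cannot-link constraints x_i != j, x_j != i of Eq. (3).

    X           : decision vector of length N (modified in place)
    cannot_link : pairs (i, j) of labeled samples belonging to different clusters
    rng         : numpy random generator, e.g. np.random.default_rng()
    """
    forbidden = {i: {i} for i in range(N)}       # x_i != i
    for i, j in cannot_link:
        forbidden[i].add(j)                      # x_i != j
        forbidden[j].add(i)                      # x_j != i
    for i in range(N):
        while X[i] in forbidden[i]:
            X[i] = rng.integers(0, N)            # resample until feasible
    return X
```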

Evolutionary operators

In accordance with the objective functions, we use a graph-based representation encoding scheme in PESC. The encoding length is N, and the size of the search space is \((N-1)^N \). To obtain a set of high-quality initial individuals, a specific initialization scheme is designed (see Algorithm 3). This scheme is based on the assumption that the corresponding entry of the similarity matrix is likely to be a nonzero entry if two samples lie in each other's neighborhood. In each individual \(\varvec{X}_i=\{x^i_1,x^i_2,...,x^i_j,...x^i_N\}\), the value of \(x_j^i\) is decided by the Euclidean distance between sample \(\varvec{a}_j\) and the other samples. In Algorithm 3, the parameter k controls the width of the neighborhood and affects the convergence speed of PESC; it should be larger than log(N) according to [44].

[Algorithm 3: The neighbor-biased initialization scheme of PESC]
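
A minimal sketch of this neighbor-biased initialization is given below. Since Algorithm 3 is not reproduced here, the concrete sampling rule (each gene \(x_j\) is drawn uniformly from the k nearest neighbors of \(\varvec{a}_j\)) is our reading of the scheme and should be taken as an assumption.

```python
import numpy as np

def initialize_population(A, pop, k, rng):
    """Neighbor-biased initialization (cf. Algorithm 3, illustrative sketch)."""
    N = A.shape[0]
    dist = np.linalg.norm(A[:, None, :] - A[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)                 # enforce x_j != j
    knn = np.argsort(dist, axis=1)[:, :k]          # k nearest neighbors of each sample
    population = np.empty((pop, N), dtype=int)
    for p in range(pop):
        for j in range(N):
            population[p, j] = rng.choice(knn[j])  # pick a random near neighbor
    return population

# Usage sketch:
# population = initialize_population(A, pop=100, k=int(np.ceil(np.log(len(A)))) + 1,
#                                    rng=np.random.default_rng(0))
```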

The adopted crossover and mutation operators are extensions of the classic SBX (simulated binary crossover) and polynomial mutation operators, respectively. Both of them contribute to the effectiveness of PESC, as will be demonstrated in the experiment section.

Phase II: Pareto ensemble based weight determination

When the cycling loop finishes, we obtain a set of nondominated solutions \(\{\varvec{X}_1,...,\varvec{X}_m,...,\varvec{X}_M \}\), where \(\varvec{X}_m=\{x_1^m,x_2^m,...,x_N^m\}\). The existing Pareto ensemble methods introduced in “Introduction” are not suitable for PESC since they are based on the label or prototype representations rather than the graph-based representation encoding scheme. Besides, the objective functions adopted in PESC make it inappropriate to use those cluster ensemble schemes. In this section, we focus on how to convert these nondominated solutions into a sparse similarity matrix with the proposed methods.

In general, when we convert \(\{\varvec{X}_1,...,\varvec{X}_m,...,\varvec{X}_M \}\) into a sparse similarity matrix \(\varvec{S}\), the following rule should be satisfied: if \(x_i^m=j\) for some nondominated solution m, then \(s_{ij}\) is a nonzero entry; otherwise, \(s_{ij}\) is a zero entry.

For example, if we have a population of 3 individuals \(\varvec{X}_1=\{2\ 1\ 4\ 3\ 4\}\), \(\varvec{X}_2=\{2\ 1\ 5\ 3\ 4\}\) and \(\varvec{X}_3=\{2\ 3\ 5\ 3\ 3\}\), then the nonzero entries of the converted similarity matrix are \(\{s_{12}, s_{21}, s_{23}, s_{34}, s_{35}, s_{43}, s_{53}, s_{54}\}\).
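
This conversion rule amounts to only a few lines of code. The sketch below reproduces the worked example above (with 0-based instead of 1-based indices); the function name is an illustrative assumption.

```python
def nonzero_entries(solutions):
    """Collect the positions (i, j) of S that become nonzero entries."""
    entries = set()
    for X in solutions:
        for i, j in enumerate(X):   # x_i = j means s_ij is nonzero
            entries.add((i, j))
    return entries

# Worked example from the text, converted to 0-based indices
X1, X2, X3 = [1, 0, 3, 2, 3], [1, 0, 4, 2, 3], [1, 2, 4, 2, 2]
print(sorted(nonzero_entries([X1, X2, X3])))
# -> [(0, 1), (1, 0), (1, 2), (2, 3), (2, 4), (3, 2), (4, 2), (4, 3)]
# i.e. s_12, s_21, s_23, s_34, s_35, s_43, s_53, s_54 in the 1-based notation of the text
```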

However, the values of the nonzero entries cannot be determined by phase I, so we design the following three methods to complete this task.

  1.

    Sparse representation based single-objective optimization (denoted as ES1 in the following experiment section):

    $$\begin{aligned}&\min \quad \Vert \varvec{A} \varvec{S}-\varvec{A}\Vert ^2_2\nonumber \\&\mathrm{s.t.}\quad s_{ij}>0,\quad \mathrm{if}\ x_i^m=j,\nonumber \\&\qquad \;\, s_{ij}=0,\quad \mathrm{else}. \end{aligned}$$
    (4)

This strategy is based on sparse representation, inspired by our previous work [20]; it tries to minimize the reconstruction error \(\Vert \varvec{A} \varvec{S}-\varvec{A}\Vert ^2_2\), where \(\varvec{A}\) is not only the over-complete dictionary but also the measurement matrix. When we use this strategy to obtain the similarity matrix \(\varvec{S}\), the positions of the nonzero entries have already been decided according to the previous rule, as implied by the constraints, so the remaining task is to determine the weight of each nonzero entry. Many evolutionary algorithms can be adopted here, such as particle swarm optimization (PSO) and the genetic algorithm (GA). In our experiment, we adopt a simplified PSO to complete this task because of its easy implementation and fast convergence.

  2.

    Diversity ensemble (denoted as ES2 in the following experiment section):

    $$\begin{aligned}&{s_{ij}} = \frac{1}{M}\sum \limits _{m = 1}^M (1-DIV(\varvec{X}_m)) \cdot \delta \left( x_i^m, j\right) \nonumber \\&\delta \left( x_i^m, j\right) = \left\{ {\begin{array}{*{20}{l}} {1,}&{}\quad {x_i^m = j}\\ {0,}&{}\quad {x_i^m \ne j} \end{array}} \right. \end{aligned}$$
    (5)
    Table 1 The value of \(div(\varvec{X}_i,\varvec{X}_m)\)

where M is the number of nondominated solutions. This similarity construction strategy is related to the diversity of each nondominated solution. Suppose \(s_{ij}\) is a nonzero entry; to determine its value, we first find the nondominated solutions for which \(x_i^m=j\) and calculate their diversity according to (2). Metaphorically speaking, \(\delta (x_i^m, j)\) is in charge of finding the solutions that agree to connect sample \(\varvec{a}_i\) with \(\varvec{a}_j\), and each such solution \(\varvec{X}_m\) contributes a weight proportional to \(1-\mathrm{DIV}(\varvec{X}_m)\) to \(s_{ij}\). Note that when we construct \(\varvec{S}\), only the nondominated solutions are considered (a code sketch of this computation is given after this list). Taking the above \(\varvec{X}_1\), \(\varvec{X}_2\), and \(\varvec{X}_3\) as an example, the values of \(div(\varvec{X}_i,\varvec{X}_m)\) are calculated according to formula (2) and shown in Table 1, and the diversity values \(DIV(\varvec{X}_1)\), \(DIV(\varvec{X}_2)\), \(DIV(\varvec{X}_3)\) are \(\frac{4}{15}\), \(\frac{1}{5}\), \(\frac{1}{3}\), respectively. Therefore, the value of the nonzero entry \(s_{12}\) is determined by individuals \(\varvec{X}_1\), \(\varvec{X}_2\), and \(\varvec{X}_3\) in accordance with formula (5), and \(s_{12}=\frac{1}{3}((1-DIV(\varvec{X}_1))+(1-DIV(\varvec{X}_2))+(1-DIV(\varvec{X}_3)))=\frac{11}{15}\). In the same way, the whole similarity matrix can be obtained, and \(S= \begin{pmatrix} 0 & \frac{11}{15} & 0 & 0 & 0\\ \frac{22}{45} & 0 & \frac{11}{45} & 0 & 0 \\ 0 & 0 & 0 & \frac{11}{45} & \frac{22}{45}\\ 0 & 0 & \frac{11}{15} & 0 & 0\\ 0 & 0 & \frac{22}{45} & \frac{11}{45} & 0 \end{pmatrix} \)

  3.

    Kernel function (denoted as ES3 in the following experiment section):

    $$\begin{aligned} {s_{ij}} = \exp \left( \frac{{ - \mathrm{dist}_{ij}^2}}{{2{\sigma ^2}}}\right) \end{aligned}$$
    (6)

    where \(\mathrm{dist}_{ij}\) is the Euclidean distance between samples \(\varvec{a}_i\) and \(\varvec{a}_j\), and \(\sigma \) is the scale factor that controls the width of the neighborhood.

Comparing the performance of the above strategies, all of them can obtain satisfying results, but ES1 is time-consuming compared with the others, and ES3 is easy to implement but requires a suitable value of the scale factor for each clustering problem. ES2 achieves a balance between time complexity and clustering accuracy. The effect of these three ensemble strategies will be discussed in the experiment section.
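
For concreteness, the sketch below implements the ES2 computation of Eq. (5) and reproduces \(s_{12}=\frac{11}{15}\) from the worked example above (again with 0-based indices); all naming and structuring choices are illustrative assumptions.

```python
import numpy as np

def div(Xi, Xm):
    """div(X_i, X_m): fraction of genes on which two decision vectors differ (Eq. (2))."""
    return np.mean(np.asarray(Xi) != np.asarray(Xm))

def es2_similarity(solutions, N):
    """ES2 (diversity ensemble): build S from the nondominated solutions via Eq. (5)."""
    M = len(solutions)
    # DIV(X_m): average difference of X_m against the whole set, including itself
    DIV = [np.mean([div(Xm, Xl) for Xl in solutions]) for Xm in solutions]
    S = np.zeros((N, N))
    for m, X in enumerate(solutions):
        for i, j in enumerate(X):
            S[i, j] += (1.0 - DIV[m]) / M   # each agreeing solution adds (1 - DIV)/M
    return S

# Worked example: s_12 (S[0, 1] with 0-based indices) equals 11/15
X1, X2, X3 = [1, 0, 3, 2, 3], [1, 0, 4, 2, 3], [1, 2, 4, 2, 2]
print(es2_similarity([X1, X2, X3], N=5)[0, 1])   # 0.7333...
```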

According to Step 2 of the spectral clustering procedure in Algorithm 1, the similarity matrix has to be transformed into a Laplacian matrix \(\varvec{L}\), which is symmetric and positive semi-definite. The following transformation, as in [20], is used to complete this step:

$$\begin{aligned} \begin{array}{l} s_{ij}=\max \{s_{ij},s_{ji}\}\\ {d_{ij}} = \left\{ {\begin{array}{*{20}{l}} {0,}&{}\quad {i \ne j}\\ {\sum \limits _{m = 1}^N {{s_{im}}} },&{}\quad {i = j} \end{array}} \right. \\ \varvec{L}=\varvec{D}-\varvec{S} \end{array} \end{aligned}$$
(7)

\(\varvec{D}\in {\mathbb {R}}^{N\times N}\) is a diagonal matrix whose diagonal element \(d_{ii}\) equals the sum of the i-th row of the weight matrix \(\varvec{S}\) (equivalently, its i-th column, since \(\varvec{S}\) is symmetric after the first step).

Table 2 The characteristics of the adopted UCI datasets
Fig. 1 The effect of the parameters \(p_c\) and \(p_m\) (\(p_m=1-p_c\)) on the clustering accuracy over 30 runs

Complexity analysis

  1.

    Space complexity  Part of the memory in our algorithm is used to store the distance matrix among all the samples, whose space complexity is \(O(N^2)\). The rest is mainly used to store the population, whose space complexity is \(O(pop\cdot N)\).

  2.

    Time complexity  In this algorithm, the main time cost lies in the working cycle of the MOEA. The time complexity of the initialization, crossover, mutation, and evaluation is \(O(pop\cdot N)\) each. The time complexity of the diversity preservation strategy is \(O(n_{div}\cdot N \cdot pop)\), where \(n_{div}\) is the number of individuals involved in the diversity preservation strategy. The time complexity of updating each generation depends on the MOEA adopted; in our experiments, PESC is carried out on the basis of NSGA-II, for which this step is \(O((2pop)^2)\). The time complexity of Step 4 (Algorithm 1) also depends on the method adopted to compute the first K eigenvectors. Hence, the total time complexity of PESC simplifies to \(\max \{O(pop\cdot gen\cdot N), O(N^2)\}\).

Experiments and discussion

PESC is mainly carried out on the basis of NSGA-II. In this section, we show the experimental results and discuss them, including the investigation of the parameters, the overall comparison between PESC and traditional or multiobjective clustering algorithms, and a detailed discussion of PESC.

The investigation of parameters

In the experiment section, 11 UCI supervised classification datasets are adopted to test the efficiency of PESC. The number of clusters is fixed to the real number of classes, the clustering accuracy is measured as the percentage of instances that are correctly classified, and all the results are obtained from 30 independent runs. The main characteristics of these datasets are shown in Table 2, from which we can see that the number of samples ranges from 150 to 5000. We take ‘wine’, ‘balance’, and ‘waveform’ as examples to investigate the parameters \(p_c\), \(p_m\), pop, and gen, as shown in Figs. 1, 2 and 3.

Figure 1 shows the clustering accuracy against the parameter \(p_c\), varied from 0.5 to 0.9 with an interval of 0.1, on the datasets “wine”, “balance”, and “waveform”. We can observe that with \(p_c=0.9\), all three tested datasets achieve relatively high clustering accuracy.

Fig. 2 The effect of parameter pop on the clustering accuracy over 30 runs

Regarding the effect of the parameter pop, the scale of the dataset is a major factor to consider. When the other parameters are fixed (\(p_c=0.9\), the number of evaluations is set to 20,000), the effect of pop on the clustering accuracy, with pop ranging from 10 to 100, is shown in Fig. 2. It can be observed from this figure that the appropriate value of pop is related to the scale of the dataset: to obtain a satisfying clustering result, the larger the dataset, the higher the value of pop usually needs to be. According to [44], the number of neighbors k in the kNN similarity construction method is set to log(N) (N is the number of samples). In contrast with kNN, pop has a similar effect in controlling the size of the neighborhood, but the accuracy of PESC is insensitive to this parameter when its value is large enough, as observed above. When pop is set to a large enough value, the width of the neighborhood is determined adaptively during the evolutionary process. According to the theoretical analysis and the experimental results, the value of pop should be larger than log(N); in our following experiments, it is set to no less than \(\sqrt{N}\) to obtain a satisfying result. Taking the scale of the tested datasets into consideration, pop is set to 100 in our experiments (see Fig. 2).

Fig. 3 The effect of parameter gen on the clustering accuracy over 30 runs

Figure 3 shows the clustering accuracy with respect to the parameter gen. In this figure, \(p_c\)=0.9, pop=100, gen is set to its maximum value 100, and the clustering accuracy is recorded every 5 generations from 0 to 100. From Fig. 3, we can see that PESC has converged on the tested datasets by the 20th generation.

In general, taking the stability and time complexity of PESC into consideration, the parameters pop, gen, \(p_c\), and \(p_m\) (\(p_m=1- p_c\)) are set to 100, 100, 0.9, and 0.1, respectively.

The effect of evolutionary operators

As described in “Evolutionary operators”, we designed a specific initialization strategy biased toward neighborhood samples. To test its efficiency, we compare the proposed initialization strategy with random initialization (see Fig. 4). In the random initialization strategy, \(x_j^i\) (the i-th individual is represented as \(\varvec{X}_i=\{x_1^i,x_2^i,...,x_j^i,...,x_N^i\}\)) is set to a random value selected from 1 to N. In Fig. 4, gen is set to 300 and the clustering accuracy is recorded every five generations; we can see that the convergence speed of the random initialization scheme is low, especially for large-scale datasets. Analyzing Figs. 2 and 4, we can conclude that the proposed initialization scheme generates a set of high-quality solutions and speeds up the convergence.

Fig. 4 The convergence analysis of the random initialization strategy

The similarity matrix comparison

Fig. 5 Visualization of the similarity matrix (column 1), eigenvalues (column 2), and eigenvectors (columns 3–5) obtained from PESC against other algorithms on the dataset ‘wine’

In spectral clustering, the similarity matrix can reveal the relationships between samples. To evaluate the similarity matrix obtained from PESC, a visualization of the similarity matrix and the corresponding eigenvalues and eigenvectors is shown in Fig. 5. In this experiment, we take ‘wine’ for illustration since the relationship between its clusters is very clear (‘wine’ has 178 samples, 13 attributes, and three categories, with samples 1–59 belonging to category ‘1’, samples 60–130 to category ‘2’, and the rest to category ‘3’). In Fig. 5, we compare PESC with the sparse spectral clustering algorithm SRMOSC and the traditional similarity matrix construction methods to discuss their performance. In the similarity matrix, the maximum and minimum weights are shown as white and black pixels, respectively, for visualization purposes.

Comparing the three ensemble strategies proposed in PESC, we can see that they achieve similar performance, but their similarity matrices are quite different. The weights obtained by ES1 are lower than those obtained by ES2 and ES3. However, they share the properties that the number of nonzero entries is smaller than the number of zero entries and that the nonzero entries are mostly distributed as intra-class connections.

In contrast with SRMOSC, ES1 of PESC obtains a similar similarity matrix. Both similarity matrices are sparse, and the values of their nonzero entries are lower than those obtained from the other similarity matrix construction methods. What is more, the time and space complexity of PESC are lower than those of SRMOSC, as shown in the corresponding experiment in the next section.

In contrast with the conventional similarity construction methods (the fully connected, kNN, mutual kNN, and \(\epsilon \)-neighborhood similarity matrix construction methods), the proposed algorithm PESC (rows 1–3) and kNN (row 6) perform better, while the remaining algorithms cannot extract distinguishable information from the eigenvectors to carry out the clustering task exactly. We can also see that the eigenvectors of the fully connected, mutual kNN, and \(\epsilon \)-neighborhood similarity matrices cannot be used to classify the samples into the correct clusters. Furthermore, it is a difficult and time-consuming task to decide the parameter values in the conventional similarity construction methods, whereas PESC can determine the neighborhood of a sample automatically during the evolutionary process.

In general, we can draw the following conclusions from Fig. 5: (1) the similarity matrices obtained from the three proposed ensemble strategies are sparse, and all of them can achieve satisfying clustering results; (2) the nonzero entries are mainly distributed as intra-class connections if the similarity matrix provides effective distinguishable information.

Table 3 Clustering accuracy comparison obtained from PESC against MOEA-based clustering algorithms, including cluster ensemble algorithms, on real-life datasets
Table 4 Clustering accuracy comparison obtained from PESC against conventional similarity construction-based algorithms on real-life datasets
Fig. 6 Time evaluation of PESC against other algorithms

Experimental comparison between PESC and other algorithms

In this section, we carry out an overall comparison between PESC and other algorithms, including multiobjective clustering algorithms (SRMOSC and MOCK), other cluster ensemble strategies or algorithms (\(MOGA(\pi ,sep)\) and CSPA), and conventional similarity matrix construction methods. Among these comparison algorithms, \(MOGA(\pi ,sep)\), CSPA, MOCK, and SRMOSC are all MOEA based clustering algorithms; MOCK can also be regarded as a multiobjective cluster-ensemble algorithm since its initialization is a hybrid of link-based clustering and k-means. Furthermore, the experiments are also extended to semi-supervised clustering with 10% and 20% labeled samples. All these experiments are conducted on 11 UCI datasets over 30 independent runs to demonstrate the effectiveness of PESC.

The parameter setting of the comparison algorithms

The values of pop, gen, \(p_c\), and \(p_m\) are set to 100, 100, 0.7, and 0.3 in SRMOSC and MOCK according to [14, 20]. CSPA is carried out on the basis of a multiobjective genetic algorithm (MOGA) and shares the same objective functions with \(MOGA(\pi , sep)\), so the parameter settings of these two algorithms are the same: pop, gen, \(p_c\), and \(p_m\) are set to 100, 100, 0.8, and 0.2, respectively [24]. When constructing the similarity matrix using the fully connected, kNN, and mutual kNN construction methods, the Gaussian kernel \(K(\varvec{x},\varvec{y})=\mathrm{e}^{-\frac{||\varvec{x}-\varvec{y}||^2}{2\sigma ^2}}\) is adopted to calculate the similarity. We run the experiment with the values \(\{0.001, 0.01, 0.05, 0.1, 0.5, 1, 5, 10, 15\}\) for \(\sigma \) and choose the one that gives the best clustering result as the final value. For the kNN and mutual kNN methods, k is set to log(N). For the \(\epsilon \)-neighborhood method, \(\epsilon \) is selected from the values {0.2, 0.3, 0.4, 0.5, 0.6} as the one that gives the best clustering result.

The comparison results of unsupervised clustering algorithms

Tables 3 and 4 show the results obtained by PESC against other MOEA based clustering algorithms and against the conventional similarity matrix construction algorithms on real-life datasets with no labeled samples, respectively. The best results in these tables are written in bold. In Table 3, \(MOGA(\pi ,sep)\) and CSPA are MOEA based cluster ensemble algorithms. We can see that all three Pareto ensemble strategies proposed in PESC perform well and outperform the other cluster ensemble algorithms, which provides empirical support for the proposed Pareto ensemble strategies on the evaluated problems. In contrast with the sparse representation based multiobjective spectral clustering algorithm SRMOSC, PESC works better on most datasets. Although PESC does not improve the clustering accuracy dramatically over SRMOSC, it is much more time-efficient. Comparing PESC with the multiobjective clustering algorithm MOCK, we can see that PESC outperforms MOCK on most of the tested datasets.

In addition, when we compare PESC with the conventional similarity matrix based spectral clustering algorithms, PESC outperforms the fully connected, mutual kNN, and \(\epsilon \)-neighborhood methods on all the tested datasets significantly, and performs better than kNN. In contrast with these traditional algorithms, PESC overcomes the difficulty of deciding the width of the neighborhood, which is controlled by the parameter values in the conventional methods, such as k in kNN or mutual kNN, \(\epsilon \) in the \(\epsilon \)-neighborhood method, and the scale factor \(\sigma \) in the fully connected similarity matrix construction method.

Table 5 Semi-supervised clustering with 10% labelled data obtained from PESC against other algorithms on real-life datasets
Table 6 Semi-supervised clustering with 20% labelled data obtained from PESC against other algorithms on real-life datasets

Taking the time cost of all the algorithms into consideration, we give a time comparison between PESC and the other algorithms (see Fig. 6) under the same experimental conditions. In this figure, the vertical axis represents the time cost, plotted on a logarithmic scale, and the three proposed ensemble strategies are denoted by “PESC_ES1”, “PESC_ES2”, and “PESC_ES3”, respectively. Considering the clustering accuracy as well, we can draw the following conclusions from this experiment: (1) the three proposed ensemble strategies obtain similar clustering accuracy; however, ES1 is the most time-consuming of the three since it adopts a simplified PSO algorithm to solve the optimization problem, so we prefer ES2 as the ensemble strategy considering both clustering accuracy and time complexity; (2) in contrast with the other MOEA based clustering algorithms (including SRMOSC, MOCK, \(MOGA(\pi ,sep)\), and CSPA), the time cost of PESC is comparable with that of \(MOGA(\pi ,sep)\) and CSPA, and considerably lower than that of SRMOSC and MOCK, which demonstrates that the Pareto ensemble strategies bring improvements not only in time complexity, which is a burden in many MOEA clustering algorithms, but also in clustering accuracy; (3) when we compare PESC with the traditional similarity matrix construction methods, although the time complexity of PESC is higher, it overcomes the difficulty of determining the parameter values when constructing the similarity matrix. In general, we can conclude that PESC strikes a balance between time complexity and clustering accuracy.

The comparison results of semi-supervised clustering algorithms

Tables 5 and 6 show the comparison results between PESC and other semi-supervised clustering algorithms with 10% and 20% labeled samples, respectively. SRMOSC transforms the semi-supervised information into constraints to be satisfied. Semi-MOCK handles semi-supervised information with an objective function called the ‘Adjusted Rand Index’, which is an external measure of clustering quality. In the traditional similarity matrix construction methods, all the entries between labelled samples with the same label are set to the maximum value of the similarity matrix, and the entries between labelled samples with different labels are set to 0. From these tables, we can see that PESC, no matter which ensemble strategy is adopted, achieves better performance than the other multiobjective clustering algorithms on most of the tested datasets.

Conclusion

Multiobjective evolutionary algorithms have attracted growing attention in the field of data mining according to the recent literature. A lot of related work has been proposed in the last decades, but little research has focused on Pareto ensembles or spectral clustering. In this paper, we introduce a multiobjective evolutionary algorithm into spectral clustering and propose a Pareto ensemble based framework for spectral clustering. The main contributions can be summarized as follows.

First, and most important, we model the similarity matrix construction task as a Pareto ensemble problem, which not only overcomes the difficulty of determining the parameter values in the conventional methods, but is also more time-efficient than the existing multiobjective spectral clustering algorithm.

Second, to solve the proposed model effectively, we design a specific diversity preservation strategy and also employ it in one of the proposed ensemble strategies. The experimental results show that this diversity preservation strategy is effective and more time-efficient than the other ensemble strategies. In addition, a specific neighbor-biased initialization strategy is proposed; it helps to improve the convergence speed and to reduce the amount of computation during the evolutionary process.

Furthermore, we extend this model to semi-supervised clustering by transforming the semi-supervised information into constraints of the MOP, and detailed experiments show its efficiency in handling the related problems.

Finally, PESC mainly focuses on the construction of the similarity matrix, so it can easily be extended to related fields, such as subspace learning.

In general, PESC obtains satisfying results not only in clustering accuracy, but also in time and space complexity in contrast with other multiobjective clustering algorithms. The goal of striking a balance between time cost and clustering accuracy is achieved by employing an MOEA and ensemble strategies in this paper.