Skip to main content
Log in

Exploring nonnegative and low-rank correlation for noise-resistant spectral clustering

  • Published:
World Wide Web Aims and scope Submit manuscript

Abstract

Clustering has been extensively explored in pattern recognition and data mining in order to facilitate various applications. Due to the presence of data noise, traditional clustering approaches may become vulnerable and unreliable, thereby degrading clustering performance. In this paper, we propose a robust spectral clustering approach, termed Non-negative Low-rank Self-reconstruction (NLS), which simultaneously a) explores the nonnegative low-rank properties of data correlation as well as b) adaptively models the structural sparsity of data noise. Specifically, in order to discover the intrinsic correlation among data, we devise a self-reconstruction approach to jointly consider the nonnegativity and low-rank property of data correlation matrix. Meanwhile, we propose to model data noise via a structural norm, i.e., p,2-norm, which not only naturally conforms to genuine patterns of data noise in real-world situations, but also provides more adaptivity and flexibility to different noise levels. Extensive experiments on various real-world datasets illustrate the advantage of the proposed robust spectral clustering approach compared to existing clustering methods.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6

Similar content being viewed by others

References

  1. Aha, D.W., Kibler, D.F., Albert, M.K.: Instance-based prediction of real-valued attributes, vol. 5 (1989)

  2. Belhumeur, P.N., Hespanha, J.P., Kriegman, D.J.: Eigenfaces vs. fisherfaces: recognition using class specific linear projection. IEEE T PAMI 19(7), 711–720 (1997)

    Article  Google Scholar 

  3. Belkin, M., Niyogi, P.: Laplacian eigenmaps for dimensionality reduction and data representation. Neural Computation 15(6), 1373–1396 (2003)

    Article  Google Scholar 

  4. Belkin, M., Niyogi, P., Sindhwani, V.: Manifold regularization: a geometric framework for learning from labeled and unlabeled examples. JMLR 7, 2399–2434 (2006)

    MathSciNet  MATH  Google Scholar 

  5. Borjigin, S, Guo, C.: Perturbation analysis for the normalized Laplacian matrices in the multiway spectral clustering method. Sci. China Inf. Sci. 57(11), 112102 (2014). https://doi.org/10.1007/s11432-014-5156-y

    Article  MathSciNet  MATH  Google Scholar 

  6. Cai, J.-F., Candès, E.J., Shen, Z.: A singular value thresholding algorithm for matrix completion. SIAM J Optim 20(4), 1956–1982 (2010)

    Article  MathSciNet  Google Scholar 

  7. Cendrowska, J.: Prism: an algorithm for inducing modular rules. International Journal of Man-Machine Studies 27(4), 349–370 (1987)

    Article  Google Scholar 

  8. Chen, L., Xu, D., Tsang, I.-H., Li, X.: Spectral embedded hashing for scalable image retrieval. IEEE TCYB 44(7), 1180–1190 (2014)

    Google Scholar 

  9. De la Torre, F., Kanade, T.: Discriminative cluster analysis. In: ICML, pp. 241–248 (2006)

  10. Deng, F., Lu, J., Wang, S., Pan, J., Zhang, L.: A distributed PDP model based on spectral clustering for improving evaluation performance. World Wide Web 22(4), 1555–1576 (2019)

    Article  Google Scholar 

  11. Ding, C., Li, T.: Adaptive dimension reduction using discriminant analysis and k-means clustering. In: ICML, pp. 521–528 (2007)

  12. Elhamifar, E., Vidal, R.: Sparse subspace clustering. In: CVPR, pp. 2790–2797 (2009)

  13. Felzenszwalb, P., Huttenlocher, D.: Efficient graph-based image segmentation. IJCV 59(2), 167–181 (2004)

    Article  Google Scholar 

  14. Filippone, M., Camastra, F., Masulli, F., Rovetta, S.: A survey of kernel and spectral methods for clustering. PR 41(1), 176–190 (2008)

    MATH  Google Scholar 

  15. Gordon, S., Greenspan, H., Goldberger, J.: Applying the information bottleneck principle to unsupervised clustering of discrete and continuous image representations. In: CVPR, pp. 370–377 (2003)

  16. Graham, D., Allinson, N.: Characterising virtual eigensignatures for general purpose face recognition. NATO ASI series. Series F: Computer and System Sciences 163, 446–456 (1998)

    Google Scholar 

  17. Hammouda, K., Kamel, M.: Efficient phrase-based document indexing for web document clustering. TKDE 16(10), 1279–1296 (2004)

    Google Scholar 

  18. He, S., Wang, B., Wang, Z., Yang, Y., Shen, F., Huang, Z., Shen, H.T.: Bidirectional discrete matrix factorization hashing for image search. IEEE Transactions on Cybernetics (2019)

  19. Hu, M., Yang, Y., Shen, F., Xie, N., Hong, R., Shen, H.T.: Collective reconstructive embeddings for cross-modal hashing. IEEE Trans. Image Process. 28(6), 2770–2784 (2019)

    Article  MathSciNet  Google Scholar 

  20. Huang, Q., Wang, T., Tao, D., Li, X.: Biclustering learning of trading rules. IEEE TCYB 45(10), 2287–2298 (2015)

    Google Scholar 

  21. Jain, A., Murty, M., Flynn, P.: Data clustering: a review. ACM Comput. Surv. 31(3), 264–323 (1999)

    Article  Google Scholar 

  22. Jia, J., Yu, N., Hua, X.: Annotating personal albums via web mining. In: ACM Multimedia, pp. 459–468 (2008)

  23. Jiang, D., Tang, C., Zhang, A.: Cluster analysis for gene expression data: a survey. TKDE 16(11), 1370–1386 (2004)

    Google Scholar 

  24. Li, J., Wang, J.: Real-time computerized annotation of pictures. TPAMI 30 (6), 985–1002 (2008)

    Article  Google Scholar 

  25. Li, C., Chang, E., Garcia-Molina, H., Wiederhold, G.: Clustering for approximate similarity search in high-dimensional spaces. TKDE 14(4), 792–808 (2002)

    Google Scholar 

  26. Li, J., Lu, K., Huang, Z., Zhu, L., Shen, H.T.: Heterogeneous domain adaptation through progressive alignment. IEEE TNNLS 30(5), 1381–1391 (2018)

    MathSciNet  Google Scholar 

  27. Li, J., Lu, K., Huang, Z., Zhu, L., Shen, H.T.: Transfer independently together: a generalized framework for domain adaptation. IEEE TCYB 49(6), 2144–2155 (2018)

    Google Scholar 

  28. Li, C., Xu, Z., Quiao, C., Luo, T.: Hierarchical clustering driven by cognitive features. Sci. China Inf. Sci. 57(1), 12109 (2014). https://doi.org/10.1007/s11432-013-4858-x

    Article  MathSciNet  Google Scholar 

  29. Liu, G., Lin, Z., Yan, S., Sun, J., Yu, Y., Ma, Y.: Robust recovery of subspace structures by low-rank representation. IEEE TPAMI 35(1), 171–184 (2013)

    Article  Google Scholar 

  30. Liu, G., Xu, H., Tang, J., Liu, Q., Yan, S.: A deterministic analysis for lrr. IEEE TPAMI 38(3), 417–430 (2016)

    Article  Google Scholar 

  31. Lyons, M., Akamatsu, S., Kamachi, M., Gyoba, J.: Coding facial expressions with gabor wavelets. In: Proceedings Third IEEE International Conference on Automatic Face and Gesture Recognition, pp. 200–205 (1998)

  32. Meng, D., Leung, Y., Xu, Z.: Detecting intrinsic loops underlying data manifold. IEEE Trans. Knowl. Data Eng. 25(2), 337–347 (2013)

    Article  Google Scholar 

  33. Nie, F., Xu, D., Tsang, I., Zhang, C.: Spectral embedded clustering. In: IJCAI, pp. 1181–1186 (2009)

  34. Nie, F., Huang, H., Cai, X., Ding, C.H.: Efficient and robust feature selection via joint 2, 1-norms minimization. In: NIPS, pp. 1813–1821 (2010)

  35. Shi, J., Malik, J.: Normalized cuts and image segmentation. TPAMI 22(8), 888–905 (2000)

    Article  Google Scholar 

  36. Strehl, A., Ghosh, J.: Cluster ensembles—a knowledge reuse framework for combining multiple partitions. JMLR 3, 583–617 (2003)

    MathSciNet  MATH  Google Scholar 

  37. Wang, X., Zhang, L., Li, X., Ma, W.: Annotating images by mining image search results. TPAMI 30(11), 1919–1932 (2008)

    Article  Google Scholar 

  38. Wang, F., Zhang, C., Li, T.: Clustering with local and global regularization. TKDE 21(12), 1665–1678 (2009)

    Google Scholar 

  39. Wang, Z., Na, C., Ma, Z., Chen, S., Song, L., Yang, Y.: Exploring nonnegative and low-rank correlation for noise-resistant spectral clustering. In: Apweb-Waim, pp. 12–26 (2019)

    Chapter  Google Scholar 

  40. Wu, M., Scholkopf, B.: A local learning approach for clustering. NIPS 19, 1529–1536 (2007)

    Google Scholar 

  41. Xiao, Y., Zhu, Z., Zhao, Y., Wei, Y., Wei, S., Li, X.: Topographic nmf for data representation. IEEE TCYB 44(10), 1762–1771 (2014)

    Google Scholar 

  42. Yang, Y., Xu, D., Nie, F., Yan, S., Zhuang, Y.: Image clustering using local discriminant models and global integration. TIP 19(10), 2761–2773 (2010)

    MathSciNet  MATH  Google Scholar 

  43. Yang, Y., Shen, H., Nie, F., Ji, R., Zhou, X.: Nonnegative spectral clustering with discriminative regularization. In: AAAI, pp. 555–560 (2011)

  44. Yang, Y., Yang, Y., Shen, H.T., Zhang, Y., Du, X., Zhou, X.: Discriminative nonnegative spectral clustering with out-of-sample extension. IEEE TKDE 25(8), 1760–1771 (2013)

    Google Scholar 

  45. Yang, Y., Ma, Z., Yang, Y., Nie, F., Shen, H.T.: Multitask spectral clustering by exploring intertask correlation. IEEE Trans. Cyber. 45(5), 1083–1094 (2015)

    Article  Google Scholar 

  46. Yang, Y., Shen, F., Shen, H.T., Li, H., Li, X.: Robust discrete spectral hashing for large-scale image semantic indexing. IEEE Transactions on Big Data 1(4), 162–171 (2015)

    Article  Google Scholar 

  47. Yang, Y., Shen, F., Huang, Z., Shen, H.T.: A unified framework for discrete spectral clustering. In: IJCAI, pp. 2273–2279 (2016)

  48. Yang, Y., Shen, F., Huang, Z., Shen, H.T., Li, X.: Discrete nonnegative spectral clustering. IEEE Trans. Knowl. Data Eng. 29(9), 1834–1845 (2017)

    Article  Google Scholar 

  49. Ye, J., Zhao, Z., Liu, H.: Adaptive distance metric learning for clustering. In: CVPR, pp. 1–7 (2007)

  50. Ye, J., Zhao, Z., Wu, M.: Discriminative k-means for clustering. NIPS 20, 1649–1656 (2007)

    Google Scholar 

  51. Yu, S., Shi, J.: Multiclass Spectral Clustering. In: ICCV, pp. 313–319 (2003)

  52. Zelnik-Manor, L., Perona, P.: Self-tuning spectral clustering. NIPS 17, 1601–1608 (2004)

    Google Scholar 

  53. Zhang, Z., Liu, L., Shen, F., Shen, H.T., Shao, L.: Binary multi-view clustering. IEEE TPAMI 41(7), 1774–1782 (July 2019)

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Lin Zuo.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Lin Zuo and Zheng Wang contributed equally to this work.

This article belongs to the Topical Collection: Special Issue on Web and Big Data 2019

Guest Editors: Jie Shao, Man Lung Yiu, and Toyoda Masashi

Appendices

Appendix A: Proof of Lemma 1

Proof

Inspired by [34], we consider the following function

$$ f(a) = p a^{2} - 2 a^{p} + (2 - p), $$
(28)

where p ∈ (0, 2). We expect to show that when a > 0, f(a) ≥ 0. The first and second order derivatives of the function in (28) are \(f^{\prime }(a ) = 2pa - 2p{a^{p - 1}}\) and \(f^{\prime \prime }(a) = 2p - 2p(p - 1){a^{p - 2}}\), respectively. We can see that a = 1 is the only point that satisfies \(f^{\prime }(a)=0\). Also, when 0 < a < 1, \(f^{\prime }(a )<0\) and when a > 1, \(f^{\prime }(a )>0\). This means that f(a) is monotonically decreasing when 0 < a < 1 and monotonically increasing when a > 1. Moreover, we have \(f^{\prime \prime }(1)=2p(2-p)>0\). Therefore, for ∀a > 0, f(a) ≥ f(1) = 0.

Then, by substituting \(a = \frac {\| \tilde {\varepsilon }_{i} \|}{\|\varepsilon _{i}\|}\) into (28), we obtain the conclusion

$$ \begin{array}{@{}rcl@{}} && p\frac{\| \tilde{\varepsilon}_{i} \|^{2}}{\| \varepsilon_{i} \|^{2}} - 2 \frac{\| \tilde{\varepsilon}_{i} \|^{p}}{\| \varepsilon_{i} \|^{p}} + (2 - p) \ge 0,\\ &\Leftrightarrow& p\| \tilde{\varepsilon}_{i} \|^{2} - 2\| \tilde{\varepsilon}_{i} \|^{p}\| \varepsilon_{i} \|^{2 - p} + (2 - p)\| \varepsilon_{i} \|^{2} \ge 0,\\ &\Leftrightarrow& p\| \tilde{\varepsilon}_{i} \|^{2}\| \varepsilon_{i} \|^{p - 2} - 2\| \tilde{\varepsilon}_{i} \|^{p} + (2 - p)\| \varepsilon_{i} \|^{p} \ge 0,\\ &\Leftrightarrow& 2\| \tilde{\varepsilon}_{i} \|^{p} - p\| \tilde{\varepsilon}_{i} \|^{2}\| \varepsilon_{i} \|^{p - 2} \le (2 - p)\| \varepsilon_{i} \|^{p},\\ &\Leftrightarrow& \| \tilde{\varepsilon}_{i} \|^{p} - \frac{p\| \tilde{\varepsilon}_{i} \|^{2}}{2\| \varepsilon_{i} \|^{2-p}} \le \| \varepsilon_{i} \|^{p} - \frac{p\| \varepsilon_{i} \|^{p}}{2\| \varepsilon_{i} \|^{2 - p}}. \end{array} $$

Appendix B: Proof of Theorem 1

Proof

Denote \({\mathcal{L}}(\mathcal {E}) = \frac {1}{2} \left \| \mathcal {E} - \left (X - XW + \frac {P}{\alpha }\right ) \right \|_{F}^{2}\) and \(\lambda ^{\prime } = \lambda /\alpha \). Suppose \(\tilde {\mathcal {E}}\) is the optimized solution of the alternative problem (18), then we obtain the following conclusion:

$$ \begin{array}{@{}rcl@{}} && \mathcal{L}(\tilde{\mathcal{E}}) + \lambda^{\prime} Tr(\tilde{\mathcal{E}}^{T} Z \tilde{\mathcal{E}}) \le \mathcal{L} (\mathcal{E}) + \lambda^{\prime} Tr(\mathcal{E}^{T} Z \mathcal{E})\\ &\Rightarrow& \mathcal{L} (\tilde{\mathcal{E}}) + \lambda^{\prime} \sum\limits_{i = 1}^{n} \frac{p\| \tilde{\varepsilon}_{i} \|^{2}}{2\| \varepsilon_{i} \|^{2 - p}} \le \mathcal{L} (\mathcal{E}) + \lambda^{\prime} \sum\limits_{i = 1}^{n} \frac{p\| \varepsilon_{i} \|^{2}}{2\| \varepsilon_{i} \|^{2 - p}} \\ &\Rightarrow& \mathcal{L} (\tilde{\mathcal{E}}) + \lambda^{\prime} \sum\limits_{i = 1}^{n} \| \tilde{\varepsilon}_{i} \|_{p}^{2} - \lambda^{\prime} \left( \sum\limits_{i = 1}^{n} \| \tilde{\varepsilon}_{i} \|_{p}^{2} - \sum\limits_{i = 1}^{n} \frac{p\| \tilde{\varepsilon}_{i} \|^{2}}{2\| \varepsilon_{i} \|^{2 - p}} \right)\\ &\le& \mathcal{L} (\mathcal{E}) + \lambda^{\prime} \sum\limits_{i = 1}^{n} \| \varepsilon_{i} \|_{p}^{2} - \lambda^{\prime} \left( \sum\limits_{i = 1}^{n} \| \varepsilon_{i} \|_{p}^{2} - \sum\limits_{i = 1}^{n} \frac{p\| \varepsilon_{i} \|^{2}}{2\| \varepsilon_{i} \|^{2 - p}}\right). \end{array} $$

Given the conclusion of Lemma 2, we finally arrive at

$$ \mathcal{L} (\tilde{\mathcal{E}}) + \lambda^{\prime} \sum\limits_{i = 1}^{n} \| \tilde{\varepsilon}_{i} \|_{p}^{2} \le \mathcal{L} (\mathcal{E}) + \lambda^{\prime} \sum\limits_{i = 1}^{n} \| \varepsilon_{i} \|_{p}^{2}. $$

Hence, the value of the objective function in (17) monotonically decreases in each iteration. □

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Wang, Z., Zuo, L., Ma, J. et al. Exploring nonnegative and low-rank correlation for noise-resistant spectral clustering. World Wide Web 23, 2107–2127 (2020). https://doi.org/10.1007/s11280-020-00802-1

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11280-020-00802-1

Keywords

Navigation