HTRPCA: Hypergraph Regularized Tensor Robust Principal Component Analysis for Sample Clustering in Tumor Omics Data

  • Original Research Article
  • Interdisciplinary Sciences: Computational Life Sciences

Abstract

In recent years, clustering analysis of cancer genomics data has gained widespread attention. However, traditional methods are limited by the dimensionality of the matrix representation and cannot fully mine the underlying geometric structure of the data. In addition, noise and outliers are inevitably present in the data. To address both problems, we propose a new method that represents cancer omics data as a tensor and uses a hypergraph to preserve the geometric structure of the original data. The model is called hypergraph regularized tensor robust principal component analysis (HTRPCA). HTRPCA splits the data into two parts: a low-rank component that contains the clean underlying structure among samples, and a sparse component that captures interfering points. The low-rank component is then used for clustering. Because of the hypergraph regularization, the model can retain complex geometric relationships among more than two sample points at a time. Clustering experiments on TCGA datasets demonstrate the effectiveness of HTRPCA and show that it outperforms other advanced methods.
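To make the hypergraph regularization more concrete, the following is a minimal sketch of how a hypergraph Laplacian over samples could be built. It is not taken from the paper: the k-nearest-neighbor hyperedge construction, the unit hyperedge weights, the normalized Laplacian form, and the function name `knn_hypergraph_laplacian` are all assumptions chosen for illustration.

```python
import numpy as np

def knn_hypergraph_laplacian(X, k=3):
    """Illustrative sketch (assumed construction, not the paper's code).

    X: array of shape (n_samples, n_features), with k < n_samples.
    Each sample spawns one hyperedge containing itself and its k nearest
    neighbors. Returns the normalized hypergraph Laplacian
    L = I - Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2} and the incidence matrix H.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances between samples.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # exclude each sample from its own neighbor list

    # Incidence matrix: H[v, e] = 1 if sample v belongs to hyperedge e.
    H = np.zeros((n, n))
    for e in range(n):
        neighbors = np.argsort(d2[e])[:k]
        H[e, e] = 1.0
        H[neighbors, e] = 1.0

    w = np.ones(n)          # unit hyperedge weights (assumption)
    dv = H @ w              # vertex degrees
    de = H.sum(axis=0)      # hyperedge degrees (k + 1 for every edge here)
    dv_inv_sqrt = 1.0 / np.sqrt(dv)

    theta = (dv_inv_sqrt[:, None] * H) @ np.diag(w / de) @ (H.T * dv_inv_sqrt[None, :])
    return np.eye(n) - theta, H

# Toy usage on random data standing in for a samples-by-features omics matrix.
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((8, 5))
L_demo, H_demo = knn_hypergraph_laplacian(X_demo, k=3)
```

Unlike an ordinary graph edge, each hyperedge here links k + 1 samples at once, which is the sense in which a hypergraph can encode relationships among more than two sample points.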

Graphic Abstract

This paper proposes a new method that uses tensors to represent cancer omics data and introduces a hypergraph term to preserve the geometric structure of the original data. The model decomposes the original tensor into a low-rank tensor and a sparse tensor. The low-rank tensor is then used to cluster cancer samples, which verifies the effectiveness of the method.
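The low-rank plus sparse decomposition described above is typically driven by two shrinkage operators. The sketch below shows them under the assumption that the low-rank part is measured with a t-SVD based tensor nuclear norm, as is common in the tensor robust PCA literature. This page does not give the update rules of HTRPCA, so the function names and the choice of operators are illustrative, and the hypergraph term is not included.

```python
import numpy as np

def soft_threshold(x, tau):
    """Elementwise shrinkage: the proximal operator of tau * L1 norm,
    commonly used to update the sparse component."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def tensor_svt(T, tau):
    """Tensor singular value thresholding (assumed t-SVD formulation).

    Transforms the third mode with an FFT, soft-thresholds the singular
    values of every frontal slice in the Fourier domain, and transforms
    back. Commonly used to update the low-rank component.
    """
    n3 = T.shape[2]
    Tf = np.fft.fft(T, axis=2)
    out = np.empty_like(Tf)
    for i in range(n3):
        U, s, Vh = np.linalg.svd(Tf[:, :, i], full_matrices=False)
        # Scale the columns of U by the shrunken singular values.
        out[:, :, i] = (U * np.maximum(s - tau, 0.0)) @ Vh
    # For a real input the result is real up to rounding error.
    return np.fft.ifft(out, axis=2).real

# Toy usage: shrink a random tensor standing in for stacked omics matrices.
rng = np.random.default_rng(0)
T_demo = rng.standard_normal((4, 5, 3))
low_rank_demo = tensor_svt(T_demo, tau=1.0)
sparse_demo = soft_threshold(T_demo - low_rank_demo, tau=0.5)
```

In an alternating scheme, these two operators would be applied in turn until the low-rank and sparse parts together reproduce the input tensor. The low-rank part would then be passed to a clustering routine.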


Funding

This work was supported in part by the National Natural Science Foundation of China under Grant No. 61872220.

Author information

Contributions

YYZ and CNJ proposed the HTRPCA method, performed the experiments, and drafted the manuscript. MLW and JXL contributed to the design of the study and the manuscript. JW and CHZ contributed to the data analysis. JXL contributed to improving the writing of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Jin-Xing Liu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

Zhao, YY., Jiao, CN., Wang, ML. et al. HTRPCA: Hypergraph Regularized Tensor Robust Principal Component Analysis for Sample Clustering in Tumor Omics Data. Interdiscip Sci Comput Life Sci 14, 22–33 (2022). https://doi.org/10.1007/s12539-021-00441-8

  • DOI: https://doi.org/10.1007/s12539-021-00441-8
