Abstract
Three important issues are often encountered in Supervised and Semi-Supervised Classification: class memberships are unreliable for some training units (label noise), a proportion of observations may depart from the main structure of the data (outliers), and new groups in the test set may not have been encountered in the learning phase (unobserved classes). The present work introduces a robust and adaptive Discriminant Analysis rule capable of handling situations in which one or more of these problems occur. Two EM-based classifiers are proposed: the first jointly exploits the training and test sets (transductive approach), while the second extends the parameter estimation using the test set in order to complete the group structure learned from the training set (inductive approach). Experiments on synthetic and real data, artificially adulterated, are provided to underline the benefits of the proposed method.
References
Aitken, A.C.: A series formula for the roots of algebraic and transcendental equations. Proc. R. Soc. Edinb. 45(01), 14–22 (1926). https://doi.org/10.1017/S0370164600024871
Akaike, H.: A new look at the statistical model identification. IEEE Trans. Autom. Control 19(6), 716–723 (1974). https://doi.org/10.1109/TAC.1974.1100705
Banfield, J.D., Raftery, A.E.: Model-based Gaussian and non-Gaussian clustering. Biometrics 49(3), 803 (1993). https://doi.org/10.2307/2532201
Bensmail, H., Celeux, G.: Regularized Gaussian discriminant analysis through eigenvalue decomposition. J. Am. Stat. Assoc. 91(436), 1743–1748 (1996). https://doi.org/10.1080/01621459.1996.10476746
Biernacki, C.: Degeneracy in the maximum likelihood estimation of univariate Gaussian mixtures for grouped data and behaviour of the EM algorithm. Scand. J. Stat. 34(3), 569–586 (2007). https://doi.org/10.1111/j.1467-9469.2006.00553.x
Böhning, D., Dietz, E., Schaub, R., Schlattmann, P., Lindsay, B.G.: The distribution of the likelihood ratio for mixtures of densities from the one-parameter exponential family. Ann. Inst. Stat. Math. 46(2), 373–388 (1994). https://doi.org/10.1007/BF01720593
Bokulich, N.A., Thorngate, J.H., Richardson, P.M., Mills, D.A.: Microbial biogeography of wine grapes is conditioned by cultivar, vintage, and climate. Proc. National Acad. Sci. 111(1), E139–E148 (2014). https://doi.org/10.1073/pnas.1317377110
Bokulich, N.A., Collins, T., Masarweh, C., Allen, G., Heymann, H., Ebeler, S.E., Mills, D.A.: Fermentation behavior suggest microbial contribution to regional. MBio 7(3), 1–12 (2016). https://doi.org/10.1128/mBio.00631-16
Bolyen, E., Rideout, J.R., Dillon, M.R., Bokulich, N.A., Abnet, C.C., Al-Ghalith, G.A., Alexander, H., Alm, E.J., Arumugam, M., Asnicar, F., Bai, Y., Bisanz, J.E., Bittinger, K., Brejnrod, A., Brislawn, C.J., Brown, C.T., Callahan, B.J., Caraballo-Rodríguez, A.M., Chase, J., Cope, E.K., Da Silva, R., Diener, C., Dorrestein, P.C., Douglas, G.M., Durall, D.M., Duvallet, C., Edwardson, C.F., Ernst, M., Estaki, M., Fouquier, J., Gauglitz, J.M., Gibbons, S.M., Gibson, D.L., Gonzalez, A., Gorlick, K., Guo, J., Hillmann, B., Holmes, S., Holste, H., Huttenhower, C., Huttley, G.A., Janssen, S., Jarmusch, A.K., Jiang, L., Kaehler, B.D., Kang, K.B., Keefe, C.R., Keim, P., Kelley, S.T., Knights, D., Koester, I., Kosciolek, T., Kreps, J., Langille, M.G., Lee, J., Ley, R., Liu, Y.X., Loftfield, E., Lozupone, C., Maher, M., Marotz, C., Martin, B.D., McDonald, D., McIver, L.J., Melnik, A.V., Metcalf, J.L., Morgan, S.C., Morton, J.T., Naimey, A.T., Navas-Molina, J.A., Nothias, L.F., Orchanian, S.B., Pearson, T., Peoples, S.L., Petras, D., Preuss, M.L., Pruesse, E., Rasmussen, L.B., Rivers, A., Robeson, M.S., Rosenthal, P., Segata, N., Shaffer, M., Shiffer, A., Sinha, R., Song, S.J., Spear, J.R., Swafford, A.D., Thompson, L.R., Torres, P.J., Trinh, P., Tripathi, A., Turnbaugh, P.J., Ul-Hasan, S., van der Hooft, J.J., Vargas, F., Vázquez-Baeza, Y., Vogtmann, E., von Hippel, M., Walters, W., Wan, Y., Wang, M., Warren, J., Weber, K.C., Williamson, C.H., Willis, A.D., Xu, Z.Z., Zaneveld, J.R., Zhang, Y., Zhu, Q., Knight, R., Caporaso, J.G.: Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2. Nat. Biotechnol. 37(8), 852–857 (2019). https://doi.org/10.1038/s41587-019-0209-9
Bouveyron, C.: Adaptive mixture discriminant analysis for supervised learning with unobserved classes. J. Classif. 31(1), 49–84 (2014). https://doi.org/10.1007/s00357-014-9147-x
Bouveyron, C., Girard, S.: Robust supervised classification with mixture models: learning from data with uncertain labels. Pattern Recognit. 42(11), 2649–2658 (2009). https://doi.org/10.1016/j.patcog.2009.03.027
Calle, M.L.: Statistical Analysis of Metagenomics Data. Genom. Inform. 17(1), e6 (2019). https://doi.org/10.5808/GI.2019.17.1.e6
Cappozzo, A., Greselin, F., Murphy, T.B.: A robust approach to model-based classification based on trimming and constraints. Adv. Data Anal. Classif. (2019). https://doi.org/10.1007/s11634-019-00371-w
Celeux, G., Govaert, G.: Gaussian parsimonious clustering models. Pattern Recognit. 28(5), 781–793 (1995). https://doi.org/10.1016/0031-3203(94)00125-6
Cerioli, A., García-Escudero, L.A., Mayo-Iscar, A., Riani, M.: Finding the number of normal groups in model-based clustering via constrained likelihoods. J. Comput. Graph. Stat. 27(2), 404–416 (2018). https://doi.org/10.1080/10618600.2017.1390469
Cerioli, A., Farcomeni, A., Riani, M.: Wild adaptive trimming for robust estimation and cluster analysis. Scand. J. Stat. 46(1), 235–256 (2019). https://doi.org/10.1111/sjos.12349
Chandola, V., Banerjee, A., Kumar, V.: Anomaly detection. ACM Comput. Surv. 41(3), 1–58 (2009). https://doi.org/10.1145/1541880.1541882
Chiquet, J., Mariadassou, M., Robin, S.: Variational inference for probabilistic Poisson PCA. Ann. Appl. Stat. 12(4), 2674–2698 (2018). https://doi.org/10.1214/18-AOAS1177
Coretto, P., Hennig, C.: Robust improper maximum likelihood: tuning, computation, and a comparison with other methods for Robust Gaussian clustering. J. Am. Stat. Assoc. 111(516), 1648–1659 (2016). https://doi.org/10.1080/01621459.2015.1100996
Day, N.E.: Estimating the components of a mixture of normal distributions. Biometrika 56(3), 463–474 (1969)
Dean, N., Murphy, T.B., Downey, G.: Using unlabelled data to update classification rules with applications in food authenticity studies. J. R. Stat. Soc. Ser. C Appl. Stat. 55(1), 1–14 (2006). https://doi.org/10.1111/j.1467-9876.2005.00526.x
Dempster, A., Laird, N., Rubin, D.: Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. 39(1), 1–38 (1977). https://doi.org/10.2307/2984875
Evangelista, P.F., Embrechts, M.J., Szymanski, B.K.: Taming the curse of dimensionality in kernels and novelty detection. Adv. Soft Comput. 34, 425–438 (2006). https://doi.org/10.1007/3-540-31662-0_33
Fop, M., Mattei, P.A., Murphy, T.B., Bouveyron, C.: Unobserved classes and extra variables in high-dimensional discriminant analysis. In: CASI 2018 Conference proceeding, pp. 70–72 (2018)
Fraley, C., Raftery, A.E.: Model-based clustering, discriminant analysis, and density estimation. J. Am. Stat. Assoc. 97(458), 611–631 (2002). https://doi.org/10.1198/016214502760047131
Gallegos, M.T., Ritter, G.: Using combinatorial optimization in model-based trimmed clustering with cardinality constraints. Comput. Stat. Data Anal. 54(3), 637–654 (2010). https://doi.org/10.1016/j.csda.2009.08.023
García-Escudero, L., Gordaliza, A., Mayo-Iscar, A., San Martín, R.: Robust clusterwise linear regression through trimming. Comput. Stat. Data Anal. 54(12), 3057–3069 (2010). https://doi.org/10.1016/j.csda.2009.07.002
García-Escudero, L.A., Gordaliza, A., Matrán, C., Mayo-Iscar, A.: A general trimming approach to robust cluster analysis. Ann. Stat. 36(3), 1324–1345 (2008). https://doi.org/10.1214/07-AOS515
García-Escudero, L.A., Gordaliza, A., Mayo-Iscar, A.: A constrained robust proposal for mixture modeling avoiding spurious solutions. Adv. Data Anal. Classif. 8(1), 27–43 (2014). https://doi.org/10.1007/s11634-013-0153-3
García-Escudero, L.A., Gordaliza, A., Matrán, C., Mayo-Iscar, A.: Avoiding spurious local maximizers in mixture modeling. Stat. Comput. 25(3), 619–633 (2015). https://doi.org/10.1007/s11222-014-9455-3
García-Escudero, L.A., Gordaliza, A., Greselin, F., Ingrassia, S., Mayo-Iscar, A.: The joint role of trimming and constraints in robust estimation for mixtures of Gaussian factor analyzers. Comput. Stat. Data Anal. 99, 131–147 (2016). https://doi.org/10.1016/j.csda.2016.01.005
García-Escudero, L.A., Gordaliza, A., Greselin, F., Ingrassia, S., Mayo-Iscar, A.: Robust estimation of mixtures of regressions with random covariates, via trimming and constraints. Stat. Comput. 27(2), 377–402 (2017). https://doi.org/10.1007/s11222-016-9628-3
García-Escudero, L.A., Gordaliza, A., Greselin, F., Ingrassia, S., Mayo-Iscar, A.: Eigenvalues and constraints in mixture modeling: geometric and computational issues. Adv. Data Anal. Classif. 12(2), 203–233 (2018a). https://doi.org/10.1007/s11634-017-0293-y
García-Escudero, L.A., Gordaliza, A., Matrán, C., Mayo-Iscar, A.: Comments on “The power of monitoring: how to make the most of a contaminated multivariate sample”. Stat. Methods Appl. 27(4), 661–666 (2018b). https://doi.org/10.1007/s10260-018-00436-8
Gordaliza, A.: Best approximations to random variables based on trimming procedures. J. Approx. Theory 64(2), 162–180 (1991). https://doi.org/10.1016/0021-9045(91)90072-I
Greco, L., Agostinelli, C.: Weighted likelihood mixture modeling and model-based clustering. Stat. Comput. (2019). https://doi.org/10.1007/s11222-019-09881-1
Greselin, F., Punzo, A.: Closed likelihood ratio testing procedures to assess similarity of covariance matrices. Am. Stat. 67(3), 117–128 (2013). https://doi.org/10.1080/00031305.2013.791643
Hawkins, D.M., McLachlan, G.J.: High-breakdown linear discriminant analysis. J. Am. Stat. Assoc. 92(437), 136 (1997). https://doi.org/10.2307/2291457
Hawkins, D.M., Liu, L., Young, S.S.: Robust singular value decomposition. National Institute of Statistical Science Technical Report 122 (2001)
Hickey, R.J.: Noise modelling and evaluating learning from examples. Artif. Intell. 82(1–2), 157–179 (1996). https://doi.org/10.1016/0004-3702(94)00094-8
Hubert, M., Rousseeuw, P.J., Vanden Branden, K.: ROBPCA: a new approach to robust principal component analysis. Technometrics 47(1), 64–79 (2005). https://doi.org/10.1198/004017004000000563
Ingrassia, S.: A likelihood-based constrained algorithm for multivariate normal mixture models. Stat. Methods Appl. 13(2), 151–166 (2004). https://doi.org/10.1007/s10260-004-0092-4
Ingrassia, S., Rocci, R.: Degeneracy of the EM algorithm for the MLE of multivariate Gaussian mixtures and dynamic constraints. Comput. Stat. Data Anal. 55(4), 1715–1725 (2011). https://doi.org/10.1016/j.csda.2010.10.026
Kasabov, N., Pang, S.: Transductive support vector machines and applications in bioinformatics for promoter recognition. In: Proceedings of the 2003 International Conference on Neural Networks and Signal Processing, vol. 1, pp. 1–6. IEEE (2003). https://doi.org/10.1109/ICNNSP.2003.1279199, http://ieeexplore.ieee.org/document/1279199/
Li, M., Xiang, S., Yao, W.: Robust estimation of the number of components for mixtures of linear regression models. Comput. Stat. 31(4), 1539–1555 (2016). https://doi.org/10.1007/s00180-015-0610-x
Markou, M., Singh, S.: Novelty detection: a review-part 1: statistical approaches. Signal Process. 83(12), 2481–2497 (2003). https://doi.org/10.1016/j.sigpro.2003.07.018
McLachlan, G.J., Rathnayake, S.: On the number of components in a Gaussian mixture model. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 4(5), 341–355 (2014). https://doi.org/10.1002/widm.1135
McNicholas, P., Murphy, T., McDaid, A., Frost, D.: Serial and parallel implementations of model-based clustering via parsimonious Gaussian mixture models. Comput. Stat. Data Anal. 54(3), 711–723 (2010). https://doi.org/10.1016/j.csda.2009.02.011
Mezzasalma, V., Sandionigi, A., Bruni, I., Bruno, A., Lovicu, G., Casiraghi, M., Labra, M.: Grape microbiome as a reliable and persistent signature of field origin and environmental conditions in Cannonau wine production. PLOS ONE 12(9), e0184615 (2017). https://doi.org/10.1371/journal.pone.0184615
Mezzasalma, V., Sandionigi, A., Guzzetti, L., Galimberti, A., Grando, M.S., Tardaguila, J., Labra, M.: Geographical and cultivar features differentiate grape microbiota in northern Italy and Spain Vineyards. Front. Microbiol. 9(MAY), 1–13 (2018). https://doi.org/10.3389/fmicb.2018.00946
Mitchell, T.M.: Machine Learning, 1st edn. McGraw-Hill Inc, New York (1997)
Neykov, N.M., Filzmoser, P., Dimova, R.I., Neytchev, P.N.: Robust fitting of mixtures using the trimmed likelihood estimator. Comput. Stat. Data Anal. 52(1), 299–308 (2007). https://doi.org/10.1016/j.csda.2006.12.024
Nguyen, M.H., de la Torre, F.: Optimal feature selection for support vector machines. Pattern Recognit. 43(3), 584–591 (2010). https://doi.org/10.1016/j.patcog.2009.09.003
Peel, D., McLachlan, G.J.: Robust mixture modelling using the t distribution. Stat. Comput. 10(4), 339–348 (2000). https://doi.org/10.1023/A:1008981510081
Quiñonero-Candela, J., Sugiyama, M., Schwaighofer, A., Lawrence, N.D.: Dataset Shift in Machine Learning. The MIT Press, Cambridge (2009)
R Core Team: R: A Language and Environment for Statistical Computing (2018). https://www.r-project.org/
Rand, W.M.: Objective criteria for the evaluation of clustering methods. J. Am. Stat. Assoc. 66(336), 846 (1971). https://doi.org/10.2307/2284239
Rousseeuw, P.J., Driessen, K.V.: A fast algorithm for the minimum covariance determinant estimator. Technometrics 41(3), 212–223 (1999). https://doi.org/10.1080/00401706.1999.10485670
Schölkopf, B., Williamson, R., Smola, A., Shawe-Taylor, J., Platt, J.: Support vector method for novelty detection. Adv. Neural Inf. Process. Syst. 12, 582–588 (2000)
Schwarz, G.: Estimating the dimension of a model. Ann. Stat. 6(2), 461–464 (1978). https://doi.org/10.1214/aos/1176344136
Pang, S., Kasabov, N.: Inductive vs transductive inference, global vs local models: SVM, TSVM, and SVMT for gene expression classification problems. In: 2004 IEEE International Joint Conference on Neural Networks (IEEE Cat. No.04CH37541), vol. 2, pp. 1197–1202. IEEE (2004). https://doi.org/10.1109/IJCNN.2004.1380112, http://ieeexplore.ieee.org/document/1380112/
Tax, D.M.J., Duin, R.P.W.: Outlier detection using classifier instability. In: Amin, A., Dori, D., Pudil, P., Freeman, H. (eds.) Advances in Pattern Recognition, pp. 593–601. Springer, Berlin (1998)
Todorov, V., Filzmoser, P.: An object-oriented framework for Robust multivariate analysis. J. Stat. Softw. 32(3), 1–47 (2009). https://doi.org/10.18637/jss.v032.i03
Vanden Branden, K., Hubert, M.: Robust classification in high dimensions based on the SIMCA Method. Chemom. Intell. Lab. Syst. 79(1–2), 10–21 (2005). https://doi.org/10.1016/j.chemolab.2005.03.002
Vapnik, V.N.: The Nature of Statistical Learning Theory, vol. 3. Springer, New York (2000). https://doi.org/10.1007/978-1-4757-3264-1
Waldron, L.: Data and statistical methods to analyze the human microbiome. mSystems 3(2), 1–4 (2018). https://doi.org/10.1128/mSystems.00194-17
Zhu, X., Wu, X.: Class noise vs. attribute noise: a quantitative study. Artif. Intell. Rev. 22(3), 177–210 (2004). https://doi.org/10.1007/s10462-004-0751-8
Acknowledgements
The authors are grateful to Anna Sandionigi, Lorenzo Guzzetti, Maurizio Casiraghi and Massimo Labra for fruitful discussions and for sharing their domain knowledge on grapevine microbiome analyses for the detection of provenances and varieties. In particular, the authors thank Anna Sandionigi for her decisive help in performing the routines described in Sect. 5.2.2 and for her support throughout the must samples analysis. We would also like to thank the Editor, Associate Editor and Referees, whose suggestions and comments enhanced the quality of the paper. Brendan Murphy’s work was supported by the Science Foundation Ireland Insight Research Centre (12/RC/2289_P2) and Vistamilk Research Centre (16/RC/3835).
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix A: Inductive covariance matrices estimation
This appendix provides closed-form solutions for the estimation of the covariance matrices \(\varvec{\varSigma }_h\), \( h=G+1,\ldots ,E\), of the unobserved classes via the inductive approach. Our main reference is the seminal paper of Celeux and Govaert (1995), where patterned covariance matrices were first defined and algorithms for their ML estimation were proposed. In the robust discovery phase, only the parameters of the \(H=E-G\) densities of the hidden classes need to be estimated; the patterned models available for them depend on the model chosen in the Learning Phase (see Fig. 5). Let \({\varvec{W}}_h=\sum _{m=1}^{M^{*}} \varphi (\mathbf {y}^{*}_m){\hat{z}}^{*}_{mh}\left[ \left( \mathbf {y}^{*}_{m}-\hat{\varvec{\mu }}_{h}\right) \left( \mathbf {y}^{*}_{m}-\hat{\varvec{\mu }}_{h}\right) ^{\prime }\right] \) and let \({\varvec{W}}_h={\varvec{L}}_h\varvec{\varDelta }_h{\varvec{L}}^{'}_h\) be its eigenvalue decomposition. Further, let \(n_h=\sum _{m=1}^{M^{*}} \varphi (\mathbf {y}^{*}_m){\hat{z}}^{*}_{mh}\) for \(h=G+1,\ldots , E\). Lastly, a bar denotes the estimates obtained in the robust learning phase for the G known groups: these are held fixed throughout the discovery phase. The formulae needed for the parameter updates are as follows:
- VII model: \(\varvec{\varSigma }_h=\lambda _h {\varvec{I}}\)
$$\begin{aligned} {\hat{\lambda }}_h= \frac{\hbox {tr}({\varvec{W}}_h)}{p \, n_h}, \qquad h=G+1,\ldots ,E. \end{aligned}$$
- VEI model: \(\varvec{\varSigma }_h=\lambda _h \bar{{\varvec{A}}}\)
$$\begin{aligned} {\hat{\lambda }}_h= \frac{\hbox {tr}({\varvec{W}}_h \bar{{\varvec{A}}}^{-1})}{p \, n_h}, \qquad h=G+1,\ldots ,E. \end{aligned}$$
- EVI model: \(\varvec{\varSigma }_h={\bar{\lambda }} {\varvec{A}}_h\)
$$\begin{aligned} \hat{{\varvec{A}}}_h= \frac{\hbox {diag}({\varvec{W}}_h)}{|\hbox {diag}({\varvec{W}}_h)|^{1/p}}, \qquad h=G+1,\ldots ,E. \end{aligned}$$
- VVI model: \(\varvec{\varSigma }_h=\lambda _h {\varvec{A}}_h\)
$$\begin{aligned} {\hat{\lambda }}_h&= \frac{|\hbox {diag}({\varvec{W}}_h)|^{1/p}}{n_h}, \qquad h=G+1,\ldots ,E. \\ \hat{{\varvec{A}}}_h&= \frac{\hbox {diag}({\varvec{W}}_h)}{|\hbox {diag}({\varvec{W}}_h)|^{1/p}}, \qquad h=G+1,\ldots ,E. \end{aligned}$$
- VEE model: \(\varvec{\varSigma }_h=\lambda _h \bar{{\varvec{D}}}\bar{{\varvec{A}}}\bar{{\varvec{D}}}^{'}\)
Let \(\bar{{\varvec{C}}}=\bar{{\varvec{D}}}\bar{{\varvec{A}}}\bar{{\varvec{D}}}^{'}\); then
$$\begin{aligned} {\hat{\lambda }}_h= \frac{\hbox {tr}({\varvec{W}}_h \bar{{\varvec{C}}}^{-1})}{p \, n_h}, \qquad h=G+1,\ldots ,E. \end{aligned}$$
- EVE model: \(\varvec{\varSigma }_h={\bar{\lambda }} \bar{{\varvec{D}}}{\varvec{A}}_h\bar{{\varvec{D}}}^{'}\)
$$\begin{aligned} \hat{{\varvec{A}}}_h= \frac{\hbox {diag}(\bar{{\varvec{D}}}^{'}{\varvec{W}}_h\bar{{\varvec{D}}})}{|\hbox {diag}(\bar{{\varvec{D}}}^{'}{\varvec{W}}_h\bar{{\varvec{D}}})|^{1/p}}, \qquad h=G+1,\ldots ,E. \end{aligned}$$
- EEV model: \(\varvec{\varSigma }_h={\bar{\lambda }} {\varvec{D}}_h\bar{{\varvec{A}}}{\varvec{D}}_h^{'}\)
$$\begin{aligned} \hat{{\varvec{D}}}_h= {\varvec{L}}_h, \qquad h=G+1,\ldots ,E. \end{aligned}$$
- VVE model: \(\varvec{\varSigma }_h=\lambda _h \bar{{\varvec{D}}}{\varvec{A}}_h\bar{{\varvec{D}}}^{'}\)
Let \({\varvec{R}}_h=\lambda _h {\varvec{A}}_h\); then
$$\begin{aligned} \hat{{\varvec{R}}}_h= \frac{1}{n_h}\hbox {diag}(\bar{{\varvec{D}}}^{'}{\varvec{W}}_h\bar{{\varvec{D}}}), \qquad h=G+1,\ldots ,E. \end{aligned}$$
and, subsequently,
$$\begin{aligned} {\hat{\lambda }}_h&= |\hat{{\varvec{R}}}_h|^{1/p}, \qquad h=G+1,\ldots ,E.\\ \hat{{\varvec{A}}}_h&= \frac{1}{{\hat{\lambda }}_h}\hat{{\varvec{R}}}_h, \qquad h=G+1,\ldots ,E. \end{aligned}$$
- VEV model: \(\varvec{\varSigma }_h=\lambda _h {\varvec{D}}_h\bar{{\varvec{A}}}{\varvec{D}}_h^{'}\)
$$\begin{aligned} \hat{{\varvec{D}}}_h&= {\varvec{L}}_h, \qquad h=G+1,\ldots ,E.\\ {\hat{\lambda }}_h&= \frac{\hbox {tr}({\varvec{W}}_h \hat{{\varvec{D}}}_h \bar{{\varvec{A}}}^{-1}\hat{{\varvec{D}}}_h^{'})}{p \, n_h}, \qquad h=G+1,\ldots ,E. \end{aligned}$$
- EVV model: \(\varvec{\varSigma }_h={\bar{\lambda }} {\varvec{D}}_h{\varvec{A}}_h{\varvec{D}}_h^{'}\)
Let \({\varvec{C}}_h= {\varvec{D}}_h{\varvec{A}}_h{\varvec{D}}_h^{'}\); then
$$\begin{aligned} \hat{{\varvec{C}}}_h= \frac{{\varvec{W}}_h}{|{\varvec{W}}_h|^{1/p}}, \qquad h=G+1,\ldots ,E. \end{aligned}$$
\(\hat{{\varvec{A}}}_h\) and \(\hat{{\varvec{D}}}_h\) are obtained through the eigenvalue decomposition of \(\hat{{\varvec{C}}}_h\), \(h=G+1,\ldots ,E\).
- VVV model: \(\varvec{\varSigma }_h=\lambda _h {\varvec{D}}_h{\varvec{A}}_h{\varvec{D}}_h^{'}\)
$$\begin{aligned} \hat{\varvec{\varSigma }}_h=\frac{1}{n_h}{\varvec{W}}_h, \qquad h=G+1,\ldots ,E. \end{aligned}$$
\({\hat{\lambda }}_h\), \(\hat{{\varvec{A}}}_h\) and \(\hat{{\varvec{D}}}_h\) are obtained through the eigenvalue decomposition of \(\hat{\varvec{\varSigma }}_h\), \(h=G+1,\ldots ,E\).
Lastly, it is easy to see that whenever the model in the discovery phase is EII, EEI or EEE, no extra parameters need to be estimated for the covariance matrices of the hidden groups.
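To make the updates above more concrete, the following NumPy sketch computes \({\varvec{W}}_h\) and \(n_h\) for one hidden group and applies the VII, VVI and VVV formulae. It is only an illustration under stated assumptions: \(\varphi (\cdot )\) is taken to be a 0/1 trimming indicator, and the function and argument names (`weighted_scatter`, `keep`, `z_h`, and so on) are hypothetical and not taken from the authors' software.

```python
import numpy as np


def weighted_scatter(Y, z_h, keep, mu_h):
    """Return W_h and n_h for one hidden group h.

    Y    : (M, p) test units y*_m
    z_h  : (M,)   posterior probabilities z*_mh
    keep : (M,)   assumed 0/1 trimming indicator phi(y*_m)
    mu_h : (p,)   current mean estimate of group h
    """
    w = keep * z_h                    # phi(y*_m) * z*_mh
    R = Y - mu_h                      # centred units
    W_h = (R * w[:, None]).T @ R      # sum_m w_m (y_m - mu)(y_m - mu)'
    n_h = w.sum()
    return W_h, n_h


def update_VII(W_h, n_h):
    """VII: Sigma_h = lambda_h I, lambda_h = tr(W_h) / (p n_h)."""
    p = W_h.shape[0]
    return np.trace(W_h) / (p * n_h)


def update_VVI(W_h, n_h):
    """VVI: Sigma_h = lambda_h A_h with A_h diagonal and |A_h| = 1."""
    d = np.diag(W_h)                          # diagonal entries of W_h
    det_root = np.exp(np.mean(np.log(d)))     # |diag(W_h)|^(1/p)
    return det_root / n_h, np.diag(d / det_root)


def update_VVV(W_h, n_h):
    """VVV: Sigma_h = W_h / n_h, then split into lambda_h, A_h, D_h."""
    Sigma = W_h / n_h
    eigval, D = np.linalg.eigh(Sigma)         # ascending eigenvalues
    order = np.argsort(eigval)[::-1]          # reorder to descending
    eigval, D = eigval[order], D[:, order]
    lam = np.exp(np.mean(np.log(eigval)))     # |Sigma_h|^(1/p)
    A = np.diag(eigval / lam)                 # normalised so that |A_h| = 1
    return Sigma, lam, A, D
```

The determinant roots are computed on the log scale, which is a numerical convenience rather than part of the formulae. The constrained models (for example VEI or VEE) would follow the same pattern, with the barred quantities from the learning phase passed in as fixed arguments.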
Cite this article
Cappozzo, A., Greselin, F. & Murphy, T.B. Anomaly and Novelty detection for robust semi-supervised learning. Stat Comput 30, 1545–1571 (2020). https://doi.org/10.1007/s11222-020-09959-1
DOI: https://doi.org/10.1007/s11222-020-09959-1