Anomaly and Novelty detection for robust semi-supervised learning

Published in Statistics and Computing

Abstract

Three important issues are often encountered in supervised and semi-supervised classification: class memberships can be unreliable for some training units (label noise), a proportion of observations may depart from the main structure of the data (outliers), and new groups may appear in the test set that were not encountered in the learning phase (unobserved classes). The present work introduces a robust and adaptive discriminant analysis rule capable of handling situations in which one or more of these problems occur. Two EM-based classifiers are proposed: the first jointly exploits the training and test sets (transductive approach), while the second extends the parameter estimation using the test set, completing the group structure learned from the training set (inductive approach). Experiments on synthetic and real data, artificially adulterated, illustrate the benefits of the proposed method.



Acknowledgements

The authors are grateful to Anna Sandionigi, Lorenzo Guzzetti, Maurizio Casiraghi and Massimo Labra for fruitful discussions and domain-knowledge sharing for our Grapevine microbiome analyses for the detection of provenances and varieties. In particular, the authors thank Anna Sandionigi for her decisive help in performing the routines described in Sect. 5.2.2 and for her support throughout the must samples analysis. We would also like to thank the Editor, Associate Editor and Referees, whose suggestions and comments enhanced the quality of the paper. Brendan Murphy’s work was supported by the Science Foundation Ireland Insight Research Centre (12/RC/2289_P2) and Vistamilk Research Centre (16/RC/3835).

Author information

Corresponding author

Correspondence to Andrea Cappozzo.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Supplementary material 1 (zip 586 KB)

Appendix A: Inductive covariance matrices estimation

This appendix provides closed-form solutions for the estimation of the covariance matrices \(\varvec{\varSigma }_h\), \( h=G+1,\ldots ,E\) of the unobserved classes via the inductive approach; our main reference here is the seminal paper of Celeux and Govaert (1995), where patterned covariance matrices were first defined and algorithms for their ML estimation were proposed. In the robust discovery phase only the parameters for the \(H=E-G\) extra densities need to be estimated, according to the patterned models available given the one considered in the learning phase (see Fig. 5). Denote by \({\varvec{W}}_h=\sum _{m=1}^{M^{*}} \varphi (\mathbf {y}^{*}_m){\hat{z}}^{*}_{mh}\left[ \left( \mathbf {y}^{*}_{m}-\hat{\varvec{\mu }}_{h}\right) \left( \mathbf {y}^{*}_{m}-\hat{\varvec{\mu }}_{h}\right) ^{\prime }\right] \) the weighted scatter matrix of group h, and let \({\varvec{W}}_h={\varvec{L}}_h\varvec{\varDelta }_h{\varvec{L}}^{'}_h\) be its eigenvalue decomposition. Further, set \(n_h=\sum _{m=1}^{M^{*}} \varphi (\mathbf {y}^{*}_m){\hat{z}}^{*}_{mh}\) for \(h=G+1,\ldots , E\). Lastly, denote with a bar the estimates obtained in the robust learning phase for the G known groups: these are fixed and are not updated. The formulae needed for the parameter updates are as follows:

  • VII model: \(\varvec{\varSigma }_h=\lambda _h {\varvec{I}}\)

    $$\begin{aligned} {\hat{\lambda }}_h= \frac{\hbox {tr}({\varvec{W}}_h)}{p \, n_h}, \qquad h=G+1,\ldots ,E. \end{aligned}$$
  • VEI model: \(\varvec{\varSigma }_h=\lambda _h \bar{{\varvec{A}}}\)

    $$\begin{aligned} {\hat{\lambda }}_h= \frac{\hbox {tr}({\varvec{W}}_h \bar{{\varvec{A}}}^{-1})}{p \, n_h}, \qquad h=G+1,\ldots ,E. \end{aligned}$$
  • EVI model: \(\varvec{\varSigma }_h={\bar{\lambda }} {\varvec{A}}_h\)

    $$\begin{aligned} \hat{{\varvec{A}}}_h= \frac{\hbox {diag}({\varvec{W}}_h)}{|\hbox {diag}({\varvec{W}}_h)|^{1/p}}, \qquad h=G+1,\ldots ,E. \end{aligned}$$
  • VVI model: \(\varvec{\varSigma }_h=\lambda _h {\varvec{A}}_h\)

    $$\begin{aligned} {\hat{\lambda }}_h= \frac{|\hbox {diag}({\varvec{W}}_h)|^{1/p}}{n_h}, \qquad h=G+1,\ldots ,E. \\ \hat{{\varvec{A}}}_h= \frac{\hbox {diag}({\varvec{W}}_h)}{|\hbox {diag}({\varvec{W}}_h)|^{1/p}}, \qquad h=G+1,\ldots ,E. \end{aligned}$$
  • VEE model: \(\varvec{\varSigma }_h=\lambda _h \bar{{\varvec{D}}}\bar{{\varvec{A}}}\bar{{\varvec{D}}}^{'}\)

    Let \(\bar{{\varvec{C}}}=\bar{{\varvec{D}}}\bar{{\varvec{A}}}\bar{{\varvec{D}}}^{'}\) and

    $$\begin{aligned} {\hat{\lambda }}_h= \frac{\hbox {tr}({\varvec{W}}_h \bar{{\varvec{C}}}^{-1})}{p \, n_h}, \qquad h=G+1,\ldots ,E. \end{aligned}$$
  • EVE model: \(\varvec{\varSigma }_h={\bar{\lambda }} \bar{{\varvec{D}}}{\varvec{A}}_h\bar{{\varvec{D}}}^{'}\)

    $$\begin{aligned} \hat{{\varvec{A}}}_h= \frac{\hbox {diag}(\bar{{\varvec{D}}}^{'}{\varvec{W}}_h\bar{{\varvec{D}}})}{|\hbox {diag}(\bar{{\varvec{D}}}^{'}{\varvec{W}}_h\bar{{\varvec{D}}})|^{1/p}}, \qquad h=G+1,\ldots ,E. \end{aligned}$$
  • EEV model: \(\varvec{\varSigma }_h={\bar{\lambda }} {\varvec{D}}_h\bar{{\varvec{A}}}{\varvec{D}}_h^{'}\)

    $$\begin{aligned} \hat{{\varvec{D}}}_h= {\varvec{L}}_h, \qquad h=G+1,\ldots ,E. \end{aligned}$$
  • VVE model: \(\varvec{\varSigma }_h=\lambda _h \bar{{\varvec{D}}}{\varvec{A}}_h\bar{{\varvec{D}}}^{'}\)

    Let \({\varvec{R}}_h=\lambda _h {\varvec{A}}_h\)

    $$\begin{aligned} \hat{{\varvec{R}}}_h= \frac{1}{n_h}\hbox {diag}(\bar{{\varvec{D}}}^{'}{\varvec{W}}_h\bar{{\varvec{D}}}), \qquad h=G+1,\ldots ,E. \end{aligned}$$

    and, subsequently

    $$\begin{aligned} {\hat{\lambda }}_h= & {} |\hat{{\varvec{R}}}_h|^{1/p}, \qquad h=G+1,\ldots ,E.\\ \hat{{\varvec{A}}}_h= & {} \frac{1}{{\hat{\lambda }}_h}\hat{{\varvec{R}}}_h, \qquad h=G+1,\ldots ,E. \end{aligned}$$
  • VEV model: \(\varvec{\varSigma }_h=\lambda _h {\varvec{D}}_h\bar{{\varvec{A}}}{\varvec{D}}_h^{'}\)

    $$\begin{aligned} \hat{{\varvec{D}}}_h= & {} {\varvec{L}}_h, \qquad h=G+1,\ldots ,E.\\ {\hat{\lambda }}_h= & {} \frac{\hbox {tr}({\varvec{W}}_h \hat{{\varvec{D}}}_h \bar{{\varvec{A}}}^{-1}\hat{{\varvec{D}}}_h^{'})}{p \, n_h}, \qquad h=G+1,\ldots ,E. \end{aligned}$$
  • EVV model: \(\varvec{\varSigma }_h={\bar{\lambda }} {\varvec{D}}_h{\varvec{A}}_h{\varvec{D}}_h^{'}\)

    Let \({\varvec{C}}_h= {\varvec{D}}_h{\varvec{A}}_h{\varvec{D}}_h^{'}\)

    $$\begin{aligned} \hat{{\varvec{C}}}_h= \frac{{\varvec{W}}_h}{|{\varvec{W}}_h|^{1/p}}, \qquad h=G+1,\ldots ,E. \end{aligned}$$

    \(\hat{{\varvec{A}}}_h\) and \(\hat{{\varvec{D}}}_h\) are obtained through the eigenvalue decomposition of \(\hat{{\varvec{C}}}_h\), \(h=G+1,\ldots ,E\).

  • VVV model: \(\varvec{\varSigma }_h=\lambda _h {\varvec{D}}_h{\varvec{A}}_h{\varvec{D}}_h^{'}\)

    $$\begin{aligned} \hat{\varvec{\varSigma }}_h=\frac{1}{n_h}{\varvec{W}}_h, \qquad h=G+1,\ldots ,E. \end{aligned}$$

    \({\hat{\lambda }}_h\), \(\hat{{\varvec{A}}}_h\) and \(\hat{{\varvec{D}}}_h\) are obtained through the eigenvalue decomposition of \(\hat{\varvec{\varSigma }}_h\), \(h=G+1,\ldots ,E\).

Lastly, it is easy to see that whenever the model in the discovery phase is EII, EEI or EEE, no extra parameters need to be estimated for the covariance matrices of the hidden groups.
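As an illustration, the quantities above translate directly into NumPy. The sketch below is ours, not the paper's accompanying software: the function names (`scatter_and_size`, `sigma_VII`, `sigma_VVI`, `sigma_VVV`) are hypothetical, and trimming is encoded through the 0/1 indicators \(\varphi (\mathbf {y}^{*}_m)\). It computes \({\varvec{W}}_h\) and \(n_h\) for one hidden group, then applies three of the updates: VII, VVI, and VVV with its eigendecomposition.

```python
import numpy as np

def scatter_and_size(Y, phi, z_hat, mu_hat):
    """Weighted scatter W_h and effective size n_h for one hidden group h.

    Y: (M, p) test observations; phi: (M,) 0/1 trimming indicators;
    z_hat: (M,) posterior probabilities for group h; mu_hat: (p,) group mean.
    """
    w = phi * z_hat                        # per-observation weights
    R = Y - mu_hat                         # centred observations
    W_h = (R * w[:, None]).T @ R           # sum_m w_m (y_m - mu)(y_m - mu)'
    return W_h, w.sum()

def sigma_VII(W_h, n_h):
    # lambda_h = tr(W_h) / (p n_h);  Sigma_h = lambda_h I
    p = W_h.shape[0]
    lam = np.trace(W_h) / (p * n_h)
    return lam * np.eye(p)

def sigma_VVI(W_h, n_h):
    # lambda_h = |diag(W_h)|^{1/p} / n_h;  A_h = diag(W_h) / |diag(W_h)|^{1/p}
    d = np.diag(np.diag(W_h))
    det_root = np.linalg.det(d) ** (1.0 / W_h.shape[0])
    lam = det_root / n_h
    A = d / det_root                       # normalised so |A_h| = 1
    return lam * A

def sigma_VVV(W_h, n_h):
    # Sigma_h = W_h / n_h; lambda_h, A_h, D_h recovered by eigendecomposition
    Sigma = W_h / n_h
    eigval, D = np.linalg.eigh(Sigma)
    lam = np.prod(eigval) ** (1.0 / len(eigval))   # |Sigma_h|^{1/p}
    A = np.diag(eigval / lam)                      # scaled so |A_h| = 1
    return Sigma, lam, A, D
```

For the constrained models (e.g. VEI, VEE, EVE) the same pattern applies, with the barred quantities from the learning phase passed in as fixed arguments.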

Cite this article

Cappozzo, A., Greselin, F. & Murphy, T.B. Anomaly and Novelty detection for robust semi-supervised learning. Stat Comput 30, 1545–1571 (2020). https://doi.org/10.1007/s11222-020-09959-1
