
Outlier removal in biomaterial image segmentations using a non-stationary Bayesian learning

  • Industrial and commercial application
  • Published: Pattern Analysis and Applications

A Correction to this article was published on 26 July 2021


Abstract

Segmentation of dried amnion biomaterial tends to produce invalid (outlier) contour point detections due to the texture and colour inhomogeneity of the biomaterial. In this paper, a novel implementation of a non-stationary Bayesian learning process for removing outlier contour points from amnion segmentations is presented. This outlier removal method is independent of the algorithm used for contour detection. The Bayesian process uses a non-stationary kernel to learn a complex-shaped function that maps image features in a region of interest around each contour point to a discrete output; based on this output, each contour point is classified as valid or invalid (an outlier). The hyper-parameters of the non-stationary kernel are learned by maximising the marginal likelihood, which combines the data likelihood with the prior over the kernel parameters. Moreover, a novel combination of gradient-ascent and harmony-search heuristic methods is presented to find the optimal hyper-parameters. To validate the method, experiments are conducted to detect and ignore invalid contour points on amnion biomaterial images, and the proposed method is compared with a logistic-regression classifier as the baseline. The results show that the proposed method significantly improves contour detection by removing outliers and, hence, reduces the waste of uncut biomaterial.


Abbreviations

ROI:

Region of interest

ML:

Machine learning

GP:

Gaussian process

HOG:

Histogram of oriented gradients

LED:

Local edge descriptor

HM:

Harmony memory

HMRC:

Harmony memory considering rate

ROC:

Receiver operating characteristic

AUC:

Area under the ROC curve

TP:

True positive

FP:

False positive

PDF:

Probability density function

References

  1. Allen CL, Clare G, Stewart EA, Branch MJ, McIntosh OD, Dadwhal M, Dua HS, Hopkinson A (2013) Augmented dried versus cryopreserved amniotic membrane as an ocular surface dressing. PLoS ONE 8(10):e78441

  2. Dua HS, Said DG, Messmer EM, Rolando M, Benitez-del-Castillo JM, Hossain PN, Shortt AJ, Geerling G, Nubile M, Figueiredo FC, Rauz S, Mastropasqua L, Rama P, Baudouin C (2018) Neurotrophic keratopathy. Prog Retin Eye Res 66:107–131

  3. Dobreva MP, Pereira PN, Deprest J, Zwijsen A (2010) On the origin of amniotic stem cells: of mice and men. Int J Dev Biol 54:761–777

  4. Taha AA, Hanbury A (2015) Metrics for evaluating 3D medical image segmentation: analysis, selection and tool. BMC Med Imaging 15:29

  5. Chan TF, Vese LA (2001) Active contours without edges. IEEE Trans Image Process 10:266–277

  6. Felzenszwalb PF, Huttenlocher DP (2004) Efficient graph-based image segmentation. Int J Comput Vis 59:167–181

  7. Boykov Y, Funka-Lea G (2006) Graph cuts and efficient N-D image segmentation. Int J Comput Vis 70:109–131

  8. Shih FY, Cheng S (2005) Automatic seeded region growing for color image segmentation. Image Vis Comput 23:877–886

  9. Ayala HVH, dos Santos FM, Mariani VC, Coelho LS (2015) Image thresholding segmentation based on a novel beta differential evolution approach. Expert Syst Appl 42:2136–2142

  10. Sharma N, Aggarwal LM (2010) Automated medical image segmentation techniques. J Med Phys 35:3–14

  11. Zhu H, Meng F, Cai J, Lu S (2016) Beyond pixels: a comprehensive survey from bottom-up to semantic image segmentation and cosegmentation. J Vis Commun Image Represent 34:12–27

  12. Sagayam KM, Hemanth DJ, Ramprasad YN, Menon R (2017) Optimization of hand motion recognition system based on 2D HMM approach using ABC algorithm. In: Hybrid intelligent techniques for pattern analysis and understanding, pp 167–192

  13. Nirala S, Mishra D, Sagayam KM, Ponraj DN, Vasanth XA, Henesey L, Ho CC (2018) Image fusion in remote sensing based on sparse sampling method and PCNN techniques. In: Machine learning for big data analysis, p 149

  14. Sagayam KM, Bruntha PM, Sridevi M, Sam MR, Kose U, Deperlioglu O (2020) A cognitive perception on content-based image retrieval using an advanced soft computing paradigm. In: Gandhi T et al (eds) Advanced machine vision paradigms for medical image analysis. Elsevier

  15. Ren X, Malik J (2003) Learning a classification model for segmentation. In: Proc 9th IEEE Int Conf on Computer Vision

  16. Chuang KS, Tzeng HL, Chen S, Wu J, Chen TJ (2006) Fuzzy c-means clustering with spatial information for image segmentation. Comput Med Imaging Graph 30:9–15

  17. Ricci E, Perfetti R (2007) Retinal blood vessel segmentation using line operators and support vector classification. IEEE Trans Med Imaging 26:1357–1365

  18. Pan C, Park DS, Yang Y, Yoo HM (2012) Leukocyte image segmentation by visual attention and extreme learning machine. Neural Comput Appl 21:1217–1227

  19. Drozdzal M, Chartrand G, Vorontsov E, Shakeri M, Di Jorio L, Tang A, Romero A, Bengio Y, Pal C, Kadoury S (2018) Learning normalized inputs for iterative estimation in medical image segmentation. Med Image Anal 44:1–13

  20. Chen LC, Papandreou G, Kokkinos I, Murphy K, Yuille AL (2018) DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution and fully connected CRFs. IEEE Trans Pattern Anal Mach Intell 40:834–848

  21. Iglovikov V, Shvets A (2018) TernausNet: U-Net with VGG11 encoder pre-trained on ImageNet for image segmentation. arXiv preprint arXiv:1801.05746

  22. Long J, Shelhamer E, Darrell T (2017) Fully convolutional networks for semantic segmentation. IEEE Trans Pattern Anal Mach Intell 39:640–651

  23. Pan SJ, Yang Q (2010) A survey on transfer learning. IEEE Trans Knowl Data Eng 22:1345–1359

  24. Weiss K, Khoshgoftaar TM, Wang DD (2016) A survey of transfer learning. J Big Data 3:9

  25. Bengio Y (2012) Deep learning of representations for unsupervised and transfer learning. J Mach Learn Res 27:17–37

  26. Ghahramani Z (2015) Probabilistic machine learning and artificial intelligence. Nature 521:452–459

  27. Bishop CM (2006) Pattern recognition and machine learning. Springer, Singapore

  28. Kemmler M, Rodner E, Wacker ES, Denzler J (2013) One-class classification with Gaussian processes. Pattern Recognit 46:3507–3518

  29. Son Y, Lee S, Park S, Lee J (2018) Learning representative exemplars using one-class Gaussian process regression. Pattern Recognit 74:185–197

  30. Li P, Chen S (2018) Hierarchical Gaussian processes model for multi-task learning. Pattern Recognit 74:134–144

  31. Liu L, Shao L, Zheng F, Li X (2014) Realistic action recognition via sparsely-constructed Gaussian process. Pattern Recognit 47:3819–3827

  32. Ruiz P, Morales-Alvarez P, Molina R, Katsaggelos AK (2019) Learning from crowds with variational Gaussian processes. Pattern Recognit 88:298–311

  33. Suzuki S, Abe K (1985) Topological structural analysis of digitized binary images by border following. Comput Vis Graph Image Process 30:32–46

  34. Barber D (2012) Bayesian reasoning and machine learning. Cambridge University Press, Cambridge

  35. Rasmussen CE, Williams CKI (2006) Gaussian processes for machine learning. The MIT Press, Cambridge, MA

  36. Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. In: Proc IEEE CVPR, vol 1, pp 886–893

  37. Syam WP, Rybalcenko K, Gaio A, Crabtree J, Leach RK (2019) Methodology for the development of in-line optical surface measuring instruments with a case study for additive surface finishing. Opt Lasers Eng 121:271–288

  38. Danzl R, Helmli F, Scherer S (2011) Focus variation – a robust technology for high resolution optical 3D surface metrology. Strojniski Vestnik – J Mech Eng 57:245–256

  39. Cover TM, Thomas JA (1991) Elements of information theory. John Wiley & Sons, New York

  40. Park DK, Jeon YS, Won CS (2000) Efficient use of local edge histogram descriptor. In: Proc ACM Workshops on Multimedia, pp 51–54

  41. Li P, Chen S (2019) Gaussian process approach for metric learning. Pattern Recognit 87:17–28

  42. Ghahramani Z (2013) Bayesian non-parametrics and the probabilistic approach to modelling. Philos Trans R Soc A 371:20110553

  43. Scholkopf B, Smola AJ (2002) Learning with kernels. MIT Press, Cambridge, MA

  44. Moroni G, Syam WP, Petro S (2014) Performance improvement for optimization of the non-linear geometric fitting problem in manufacturing metrology. Meas Sci Technol 25:085008

  45. Geem ZW, Kim JH, Loganathan GV (2001) A new heuristic optimization algorithm: harmony search. Simulation 76:60–68

  46. Tavazoei MS, Haeri M (2007) An optimization algorithm based on chaotic behavior and fractal nature. J Comput Appl Math 206:1070–1081

  47. Fawcett T (2006) An introduction to ROC analysis. Pattern Recognit Lett 27:861–874

  48. Press WH, Teukolsky SA, Vetterling WT, Flannery BP (1992) Numerical recipes in C. Cambridge University Press, Cambridge


Acknowledgements

This work was supported by Innovate UK (iUK) [grant number 104042] and in collaboration with NuVision Biotherapies Ltd.

Author information

Corresponding author

Correspondence to Wahyudin P. Syam.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Derivation of the posterior distribution estimation for learning

By rewriting Eq. (4), we have:

$$P\left(\boldsymbol{\theta}\mid D;M\right)\propto P\left(D\mid\boldsymbol{\theta};M\right)P\left(\boldsymbol{\theta};M\right)\leftrightarrow P\left(\boldsymbol{\theta}\mid\mathbf{X},\mathbf{y};M\right)\propto P\left(\mathbf{y}\mid\mathbf{X},\boldsymbol{\theta};M\right)P\left(\boldsymbol{\theta};M\right)$$
(21)

Both the likelihood \(P\left(\mathbf{y}\mid\mathbf{X},\boldsymbol{\theta};M\right)\) and the prior \(P\left(\boldsymbol{\theta};M\right)\) are assumed to be Gaussian, and the model \(M\) is assumed to be the linear model \(f(\mathbf{X}^{T}\boldsymbol{\theta})\). Hence:

$$P\left(\boldsymbol{\theta}\mid\mathbf{X},\mathbf{y};M\right)\propto \exp\left(-\frac{1}{2\sigma^{2}}\left(\mathbf{y}-\mathbf{X}^{T}\boldsymbol{\theta}\right)^{T}\left(\mathbf{y}-\mathbf{X}^{T}\boldsymbol{\theta}\right)\right)\exp\left(-\frac{1}{2}\boldsymbol{\theta}^{T}\boldsymbol{\Sigma}^{-1}\boldsymbol{\theta}\right)$$
(22)

where \(\boldsymbol{\Sigma}\) is a positive semi-definite prior covariance matrix whose dimension matches that of \(\boldsymbol{\theta}\). Hence:

$$\propto \exp\left(-\frac{1}{2\sigma^{2}}\left(\mathbf{y}-\mathbf{X}^{T}\boldsymbol{\theta}\right)^{T}\left(\mathbf{y}-\mathbf{X}^{T}\boldsymbol{\theta}\right)-\frac{1}{2}\boldsymbol{\theta}^{T}\boldsymbol{\Sigma}^{-1}\boldsymbol{\theta}\right)$$
(23)
$$\propto \exp\left(-\frac{1}{2\sigma^{2}}\mathbf{y}^{T}\mathbf{y}+\frac{1}{2\sigma^{2}}\mathbf{y}^{T}\mathbf{X}^{T}\boldsymbol{\theta}+\frac{1}{2\sigma^{2}}\boldsymbol{\theta}^{T}\mathbf{X}\mathbf{y}-\frac{1}{2\sigma^{2}}\boldsymbol{\theta}^{T}\mathbf{X}\mathbf{X}^{T}\boldsymbol{\theta}-\frac{1}{2}\boldsymbol{\theta}^{T}\boldsymbol{\Sigma}^{-1}\boldsymbol{\theta}\right)$$
(24)

Completing the square with respect to \(\boldsymbol{\theta}\) yields:

$$\propto \exp\left(-\frac{1}{2}\boldsymbol{\theta}^{T}\left(\sigma^{-2}\mathbf{X}\mathbf{X}^{T}+\boldsymbol{\Sigma}^{-1}\right)\boldsymbol{\theta}+\frac{1}{\sigma^{2}}\boldsymbol{\theta}^{T}\mathbf{X}\mathbf{y}-\frac{1}{2\sigma^{2}}\mathbf{y}^{T}\mathbf{y}\right).$$
(25)

From Eq. (25), one can obtain:

$$P\left(\boldsymbol{\theta}\mid\mathbf{X},\mathbf{y};M\right)\sim N\left(\overline{\boldsymbol{\theta}}=\sigma^{-2}\left(\sigma^{-2}\mathbf{X}\mathbf{X}^{T}+\boldsymbol{\Sigma}^{-1}\right)^{-1}\mathbf{X}\mathbf{y},\ \mathrm{cov}\left(\boldsymbol{\theta}\right)=\left(\sigma^{-2}\mathbf{X}\mathbf{X}^{T}+\boldsymbol{\Sigma}^{-1}\right)^{-1}\right).$$
(26)
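
To make Eq. (26) concrete, the following is a minimal NumPy sketch that computes the posterior mean and covariance on synthetic data. The variable names (X, y, sigma, and Sigma_p for \(\boldsymbol{\Sigma}\)) mirror the appendix; the dimensions, noise level and prior covariance are illustrative assumptions, not values from the paper.

```python
# Minimal sketch of Eq. (26): Gaussian posterior over the weights theta of
# the linear model f(X^T theta). Synthetic data; shapes follow the appendix.
import numpy as np

rng = np.random.default_rng(0)
d, n = 3, 50                          # feature dimension, number of samples
X = rng.normal(size=(d, n))           # training inputs, one column per sample
theta_true = np.array([1.0, -2.0, 0.5])
sigma = 0.1                           # observation noise standard deviation
y = X.T @ theta_true + sigma * rng.normal(size=n)

Sigma_p = np.eye(d)                   # prior covariance of theta (assumed)
A = X @ X.T / sigma**2 + np.linalg.inv(Sigma_p)   # posterior precision
cov_theta = np.linalg.inv(A)                      # posterior covariance, Eq. (26)
mean_theta = cov_theta @ X @ y / sigma**2         # posterior mean, Eq. (26)
```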

Appendix B: Derivation of the distribution estimation for prediction

By rewriting Eq. (6), we have:

$$P\left(\mathbf{y}^{*}\mid\mathbf{X}^{*},\mathbf{X},\mathbf{y};M\right)=\int P\left(\mathbf{y}^{*}\mid\mathbf{X}^{*},\boldsymbol{\theta};M\right)P\left(\boldsymbol{\theta}\mid\mathbf{X},\mathbf{y};M\right)d\boldsymbol{\theta}$$
(27)

The marginalisation is carried out as follows (neglecting the normalising constants of the Gaussians):

$$P\left(\mathbf{y}^{*}\mid\mathbf{X}^{*},\mathbf{X},\mathbf{y};M\right)=\int \exp\left(-\frac{1}{2}\left(\mathbf{y}^{*}-\mathbf{X}^{*T}\overline{\boldsymbol{\theta}}\right)^{T}\left(\mathrm{cov}\,\mathbf{y}^{*}\right)^{-1}\left(\mathbf{y}^{*}-\mathbf{X}^{*T}\overline{\boldsymbol{\theta}}\right)\right)\exp\left(-\frac{1}{2}\boldsymbol{\theta}^{T}\left(\sigma^{-2}\mathbf{X}\mathbf{X}^{T}+\boldsymbol{\Sigma}^{-1}\right)\boldsymbol{\theta}+\frac{1}{\sigma^{2}}\boldsymbol{\theta}^{T}\mathbf{X}\mathbf{y}-\frac{1}{2\sigma^{2}}\mathbf{y}^{T}\mathbf{y}\right)d\boldsymbol{\theta}$$
(28)

where \(\mathrm{cov}\,\mathbf{y}^{*}=\mathbf{X}^{*T}\left(\sigma^{-2}\mathbf{X}\mathbf{X}^{T}+\boldsymbol{\Sigma}^{-1}\right)^{-1}\mathbf{X}^{*}\), which follows from the covariance of the posterior of \(\boldsymbol{\theta}\). Considering only the exponent terms:

$$\leftrightarrow -\frac{1}{2}\left(\mathbf{y}^{*}-\mathbf{X}^{*T}\overline{\boldsymbol{\theta}}\right)^{T}\left(\mathbf{X}^{*T}\left(\sigma^{-2}\mathbf{X}\mathbf{X}^{T}+\boldsymbol{\Sigma}^{-1}\right)^{-1}\mathbf{X}^{*}\right)^{-1}\left(\mathbf{y}^{*}-\mathbf{X}^{*T}\overline{\boldsymbol{\theta}}\right)-\frac{1}{2}\boldsymbol{\theta}^{T}\left(\sigma^{-2}\mathbf{X}\mathbf{X}^{T}+\boldsymbol{\Sigma}^{-1}\right)\boldsymbol{\theta}+\frac{1}{\sigma^{2}}\boldsymbol{\theta}^{T}\mathbf{X}\mathbf{y}-\frac{1}{2\sigma^{2}}\mathbf{y}^{T}\mathbf{y}$$
(29)

To simplify the form, let \(\mathbf{A}=\sigma^{-2}\mathbf{X}\mathbf{X}^{T}+\boldsymbol{\Sigma}^{-1}\), \(\mathbf{B}=\frac{1}{\sigma^{2}}\mathbf{X}\mathbf{y}\) and \(C=\mathbf{y}^{T}\mathbf{y}\). Hence:

$$\leftrightarrow -\frac{1}{2}\left(\mathbf{y}^{*}-\mathbf{X}^{*T}\overline{\boldsymbol{\theta}}\right)^{T}\left(\mathbf{X}^{*T}\mathbf{A}^{-1}\mathbf{X}^{*}\right)^{-1}\left(\mathbf{y}^{*}-\mathbf{X}^{*T}\overline{\boldsymbol{\theta}}\right)-\frac{1}{2}\boldsymbol{\theta}^{T}\mathbf{A}\boldsymbol{\theta}+\boldsymbol{\theta}^{T}\mathbf{B}-\frac{1}{2\sigma^{2}}C$$
(30)

Completing the square with respect to \(\boldsymbol{\theta}\) (in preparation for marginalising over \(\boldsymbol{\theta}\)), Eq. (30) becomes:

$$\leftrightarrow -\frac{1}{2}\left(\boldsymbol{\theta}-\mathbf{A}^{-1}\mathbf{B}\right)^{T}\mathbf{A}\left(\boldsymbol{\theta}-\mathbf{A}^{-1}\mathbf{B}\right)+\frac{1}{2}\mathbf{B}^{T}\mathbf{A}^{-1}\mathbf{B}-\frac{1}{2}\left(\mathbf{y}^{*}-\mathbf{X}^{*T}\overline{\boldsymbol{\theta}}\right)^{T}\left(\mathbf{X}^{*T}\mathbf{A}^{-1}\mathbf{X}^{*}\right)^{-1}\left(\mathbf{y}^{*}-\mathbf{X}^{*T}\overline{\boldsymbol{\theta}}\right)-\frac{1}{2\sigma^{2}}C$$
(31)

By inserting Eq. (31) into Eq. (28) and integrating over \(\boldsymbol{\theta}\), we obtain:

$$=\sqrt{\left|\left(2\pi\right)^{n}\mathbf{A}^{-1}\right|}\,\exp\left(-\frac{1}{2}\left(\mathbf{y}^{*}-\mathbf{X}^{*T}\overline{\boldsymbol{\theta}}\right)^{T}\left(\mathbf{X}^{*T}\mathbf{A}^{-1}\mathbf{X}^{*}\right)^{-1}\left(\mathbf{y}^{*}-\mathbf{X}^{*T}\overline{\boldsymbol{\theta}}\right)\right)\exp\left(\frac{1}{2}\mathbf{B}^{T}\mathbf{A}^{-1}\mathbf{B}-\frac{1}{2\sigma^{2}}C\right)\int N\left(\boldsymbol{\theta}\mid\mathbf{A}^{-1}\mathbf{B},\mathbf{A}^{-1}\right)d\boldsymbol{\theta}$$
(32)

Collecting the terms that depend on \(\mathbf{y}^{*}\) and treating all others as constant:

$$P\left(\mathbf{y}^{*}\mid\mathbf{X}^{*},\mathbf{X},\mathbf{y};M\right)\propto \exp\left(-\frac{1}{2}\left(\mathbf{y}^{*}-\mathbf{X}^{*T}\overline{\boldsymbol{\theta}}\right)^{T}\left(\mathbf{X}^{*T}\mathbf{A}^{-1}\mathbf{X}^{*}\right)^{-1}\left(\mathbf{y}^{*}-\mathbf{X}^{*T}\overline{\boldsymbol{\theta}}\right)\right)$$
(33)

Finally, the predictive distribution is (writing again \(\mathbf{A}=\sigma^{-2}\mathbf{X}\mathbf{X}^{T}+\boldsymbol{\Sigma}^{-1}\) and \(\overline{\boldsymbol{\theta}}=\sigma^{-2}\left(\sigma^{-2}\mathbf{X}\mathbf{X}^{T}+\boldsymbol{\Sigma}^{-1}\right)^{-1}\mathbf{X}\mathbf{y}\)):

$$P\left(\mathbf{y}^{*}\mid\mathbf{X}^{*},\mathbf{X},\mathbf{y};M\right)\sim N\left(\overline{\mathbf{y}^{*}}=\sigma^{-2}\mathbf{X}^{*T}\left(\sigma^{-2}\mathbf{X}\mathbf{X}^{T}+\boldsymbol{\Sigma}^{-1}\right)^{-1}\mathbf{X}\mathbf{y},\ \mathrm{cov}\left(\mathbf{y}^{*}\right)=\mathbf{X}^{*T}\left(\sigma^{-2}\mathbf{X}\mathbf{X}^{T}+\boldsymbol{\Sigma}^{-1}\right)^{-1}\mathbf{X}^{*}\right)$$
(34)
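
Under the same illustrative assumptions, the predictive distribution in Eq. (34) follows in two lines from the posterior computed in the Appendix A sketch; X_star here is a hypothetical batch of test inputs, not data from the paper.

```python
# Predictive mean and covariance of Eq. (34), reusing rng, d, mean_theta and
# cov_theta from the Appendix A sketch. Columns of X_star are test inputs.
X_star = rng.normal(size=(d, 5))
mean_y_star = X_star.T @ mean_theta           # X*^T (sigma^-2 A^-1 X y)
cov_y_star = X_star.T @ cov_theta @ X_star    # X*^T A^-1 X*
```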

Appendix C: Derivation of the distribution estimation for prediction with Kernel

As previously defined, let \(\mathbf{A}=\sigma^{-2}\mathbf{X}\mathbf{X}^{T}+\boldsymbol{\Sigma}^{-1}\). With matrix algebra, one can then obtain:

$$\sigma^{-2}\mathbf{X}\left(\mathbf{X}^{T}\boldsymbol{\Sigma}\mathbf{X}+\sigma^{2}\mathbf{I}\right)=\mathbf{A}\boldsymbol{\Sigma}\mathbf{X}$$
(35)
$$\mathbf{A}^{-1}\left[\sigma^{-2}\mathbf{X}\left(\mathbf{X}^{T}\boldsymbol{\Sigma}\mathbf{X}+\sigma^{2}\mathbf{I}\right)\right]\left(\mathbf{X}^{T}\boldsymbol{\Sigma}\mathbf{X}+\sigma^{2}\mathbf{I}\right)^{-1}=\mathbf{A}^{-1}\left[\mathbf{A}\boldsymbol{\Sigma}\mathbf{X}\right]\left(\mathbf{X}^{T}\boldsymbol{\Sigma}\mathbf{X}+\sigma^{2}\mathbf{I}\right)^{-1}$$
(36)
$$\sigma^{-2}\mathbf{A}^{-1}\mathbf{X}=\boldsymbol{\Sigma}\mathbf{X}\left(\mathbf{X}^{T}\boldsymbol{\Sigma}\mathbf{X}+\sigma^{2}\mathbf{I}\right)^{-1}$$
(37)

By inserting Eq. (37) into the mean in Eq. (34), the mean becomes:

$$\sigma^{-2}\mathbf{X}^{*T}\left(\sigma^{-2}\mathbf{X}\mathbf{X}^{T}+\boldsymbol{\Sigma}^{-1}\right)^{-1}\mathbf{X}\mathbf{y}=\mathbf{X}^{*T}\boldsymbol{\Sigma}\mathbf{X}\left[\mathbf{X}^{T}\boldsymbol{\Sigma}\mathbf{X}+\sigma^{2}\mathbf{I}\right]^{-1}\mathbf{y}$$
(38)

For the covariance in Eq. (34), using the matrix inversion lemma [48], the term \(\left(\sigma^{-2}\mathbf{X}\mathbf{X}^{T}+\boldsymbol{\Sigma}^{-1}\right)^{-1}\) can be re-written as (setting \(\mathbf{Z}=\boldsymbol{\Sigma}^{-1}\), \(\mathbf{U}=\mathbf{V}=\mathbf{X}\) and \(\mathbf{W}=\sigma^{-2}\mathbf{I}\)):

$$\left(\sigma^{-2}\mathbf{X}\mathbf{X}^{T}+\boldsymbol{\Sigma}^{-1}\right)^{-1}=\boldsymbol{\Sigma}-\boldsymbol{\Sigma}\mathbf{X}\left[\sigma^{2}\mathbf{I}+\mathbf{X}^{T}\boldsymbol{\Sigma}\mathbf{X}\right]^{-1}\mathbf{X}^{T}\boldsymbol{\Sigma}$$
(39)
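
Since identities such as Eq. (39) are easy to get wrong by a transpose, a quick numerical check can be reassuring; this sketch reuses the illustrative X, Sigma_p, sigma and n defined in the Appendix A sketch.

```python
# Numerical check of the matrix inversion lemma in Eq. (39).
lhs = np.linalg.inv(X @ X.T / sigma**2 + np.linalg.inv(Sigma_p))
rhs = Sigma_p - Sigma_p @ X @ np.linalg.inv(
    sigma**2 * np.eye(n) + X.T @ Sigma_p @ X) @ X.T @ Sigma_p
assert np.allclose(lhs, rhs)
```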

Hence, the covariance in Eq. (34) becomes:

$$\mathbf{X}^{*T}\left(\sigma^{-2}\mathbf{X}\mathbf{X}^{T}+\boldsymbol{\Sigma}^{-1}\right)^{-1}\mathbf{X}^{*}=\mathbf{X}^{*T}\boldsymbol{\Sigma}\mathbf{X}^{*}-\mathbf{X}^{*T}\boldsymbol{\Sigma}\mathbf{X}\left[\mathbf{X}^{T}\boldsymbol{\Sigma}\mathbf{X}+\sigma^{2}\mathbf{I}\right]^{-1}\mathbf{X}^{T}\boldsymbol{\Sigma}\mathbf{X}^{*}$$
(40)

In Eq. (38) and Eq. (40), every product of inputs of the form \(\mathbf{X}^{*T}\boldsymbol{\Sigma}\mathbf{X}\) becomes \(\langle\mathbf{X}^{*},\mathbf{X}\rangle=\mathbf{K}\left(\mathbf{X}^{*},\mathbf{X}\right)\), and every \(\mathbf{X}^{T}\boldsymbol{\Sigma}\mathbf{X}\) becomes \(\langle\mathbf{X},\mathbf{X}\rangle=\mathbf{K}\left(\mathbf{X},\mathbf{X}\right)\). By this substitution, the input products are carried out in a kernel space defined by a non-linear function of the pairs \(\mathbf{X}^{*},\mathbf{X}\), or \(\mathbf{X},\mathbf{X}\), or \(\mathbf{X}^{*},\mathbf{X}^{*}\). Hence, the mean \(\overline{\mathbf{y}^{*}}\) and covariance \(\mathrm{cov}\left(\mathbf{y}^{*}\right)\) of the prediction become:

$$\overline{\mathbf{y}^{*}}=\mathbf{K}\left(\mathbf{X}^{*},\mathbf{X}\right)\left[\mathbf{K}\left(\mathbf{X},\mathbf{X}\right)+\sigma^{2}\mathbf{I}\right]^{-1}\mathbf{y}$$
(41)
$$\mathrm{cov}\left(\mathbf{y}^{*}\right)=\mathbf{K}\left(\mathbf{X}^{*},\mathbf{X}^{*}\right)-\mathbf{K}\left(\mathbf{X}^{*},\mathbf{X}\right)\left[\mathbf{K}\left(\mathbf{X},\mathbf{X}\right)+\sigma^{2}\mathbf{I}\right]^{-1}\mathbf{K}\left(\mathbf{X},\mathbf{X}^{*}\right)$$
(42)
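
As a final illustration, Eqs. (41) and (42) can be evaluated as in the sketch below. The RBF kernel and the Cholesky-based solve are assumptions made purely for the sketch (any positive semi-definite kernel and linear solver would serve); this is not the non-stationary kernel or implementation used in the paper.

```python
# Sketch of GP prediction via Eqs. (41)-(42). Columns of X and X_star are
# training and test inputs; an RBF kernel is assumed for illustration.
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    # K[i, j] = exp(-||a_i - b_j||^2 / (2 l^2)) for columns a_i of A, b_j of B
    sq = ((A[:, :, None] - B[:, None, :]) ** 2).sum(axis=0)
    return np.exp(-0.5 * sq / length_scale**2)

def gp_predict(X, y, X_star, sigma, kernel=rbf_kernel):
    K = kernel(X, X)                              # K(X, X)
    K_star = kernel(X_star, X)                    # K(X*, X)
    # Cholesky factorisation of [K(X, X) + sigma^2 I] for a stable solve
    L = np.linalg.cholesky(K + sigma**2 * np.eye(K.shape[0]))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mean = K_star @ alpha                         # Eq. (41)
    v = np.linalg.solve(L, K_star.T)
    cov = kernel(X_star, X_star) - v.T @ v        # Eq. (42)
    return mean, cov
```

Solving through the Cholesky factor avoids forming the explicit inverse of \(\mathbf{K}\left(\mathbf{X},\mathbf{X}\right)+\sigma^{2}\mathbf{I}\), which is the numerically preferred route for Eqs. (41) and (42).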


About this article


Cite this article

Syam, W.P., Benardos, P., Britchford, E. et al. Outlier removal in biomaterial image segmentations using a non-stationary Bayesian learning. Pattern Anal Applic 24, 1805–1824 (2021). https://doi.org/10.1007/s10044-021-00979-9
