Abstract
We present a new supervised image classification method applicable to a broad class of image deformation models. The method makes use of the previously described Radon Cumulative Distribution Transform (R-CDT) for image data, whose mathematical properties are exploited to express the image data in a form that is more suitable for machine learning. While certain operations such as translation, scaling, and higher-order transformations are challenging to model in native image space, we show that the R-CDT can capture some of these variations and thus render the associated image classification problems easier to solve. The method, which uses a nearest-subspace algorithm in R-CDT space, is simple to implement, non-iterative, has no hyper-parameters to tune, is computationally efficient and label efficient, and achieves accuracy competitive with state-of-the-art neural networks for many types of classification problems. Beyond test accuracy, we show improvements over neural network-based methods in computational efficiency (the method can be implemented without GPUs), in the number of training samples needed, and in out-of-distribution generalization. The Python code for reproducing our results is available from Shifat-E-Rabbi et al. at https://github.com/rohdelab/rcdt_ns_classifier.
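The nearest-subspace step named in the abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation (that code is in the repository cited above). It assumes each image has already been mapped to a feature vector (for this method, its R-CDT, which is not computed here), and the function names `fit_class_subspaces` and `predict_nearest_subspace` are hypothetical. Each class is represented by the span of its training vectors, and a test vector is assigned to the class whose subspace is closest in squared Euclidean distance.

```python
import numpy as np


def fit_class_subspaces(features, labels):
    """Return {label: orthonormal basis (d x k)} for the span of each class.

    `features` is an (n_samples, d) array of transformed images (assumed
    to be precomputed); `labels` is an (n_samples,) array of class labels.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    bases = {}
    for label in np.unique(labels):
        class_vectors = features[labels == label].T  # shape (d, n_class)
        u, s, _ = np.linalg.svd(class_vectors, full_matrices=False)
        # Keep only directions with numerically nonzero singular values.
        tol = s.max() * max(class_vectors.shape) * np.finfo(s.dtype).eps
        bases[label] = u[:, s > tol]
    return bases


def predict_nearest_subspace(features, bases):
    """Assign each row of `features` to the class with the closest subspace."""
    features = np.asarray(features, dtype=float)
    class_labels = list(bases)
    # Squared distance = squared norm of the residual after projection.
    distances = np.stack(
        [
            np.sum((features - features @ bases[c] @ bases[c].T) ** 2, axis=1)
            for c in class_labels
        ],
        axis=1,
    )
    return np.array(class_labels)[np.argmin(distances, axis=1)]


if __name__ == "__main__":
    # Toy data: class 0 lives in span(e1, e2), class 1 in span(e3, e4).
    rng = np.random.default_rng(0)
    basis0, basis1 = np.eye(6)[:, :2], np.eye(6)[:, 2:4]
    train = np.vstack([rng.normal(size=(5, 2)) @ basis0.T,
                       rng.normal(size=(5, 2)) @ basis1.T])
    train_labels = np.array([0] * 5 + [1] * 5)
    model = fit_class_subspaces(train, train_labels)
    test = np.vstack([rng.normal(size=(3, 2)) @ basis0.T,
                      rng.normal(size=(3, 2)) @ basis1.T])
    print(predict_nearest_subspace(test, model))
```

Fitting needs one SVD per class and prediction needs only matrix products, which is consistent with the non-iterative, GPU-free character described above. The rank tolerance used here is a numerical convenience of this sketch, not a parameter taken from the paper.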
Notes
We are using a slightly different definition of the CDT than in [31]. The properties of the CDT outlined here hold in both definitions.
Rigorously speaking, if \(\widehat{\mathbb {V}}^{(p)}\) is a closed subspace, then \(d^2( \widehat{s},\widehat{\mathbb {V}}^{(p)})>0\) if and only if \(\widehat{s}\notin \widehat{\mathbb {V}}^{(p)} \). In practice, \(\widehat{\mathbb {V}}^{(p)}\) will be a finite dimensional space and hence, the closedness condition is satisfied.
The same grid is chosen for all images. m, n are positive integers.
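As a worked illustration of the second note above, the following is a sketch under stated assumptions rather than a formula taken from the paper: assume \(d^2\) denotes the squared distance induced by an inner product \(\langle \cdot ,\cdot \rangle \), and let \(\{b_1,\ldots ,b_k\}\) be an orthonormal basis of the finite dimensional subspace \(\widehat{\mathbb {V}}^{(p)}\) (the symbols \(b_j\) and \(k\) are introduced here only for illustration). Then

\[
d^2( \widehat{s},\widehat{\mathbb {V}}^{(p)}) = \Big \Vert \widehat{s}-\sum _{j=1}^{k}\langle \widehat{s},b_j\rangle \, b_j\Big \Vert ^2 = \Vert \widehat{s}\Vert ^2-\sum _{j=1}^{k}\langle \widehat{s},b_j\rangle ^2 .
\]

The right-hand side is zero exactly when \(\widehat{s}\) equals its own projection onto \(\widehat{\mathbb {V}}^{(p)}\), that is, when \(\widehat{s}\in \widehat{\mathbb {V}}^{(p)}\), which is the equivalence stated in the note.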
References
Shifat-E-Rabbi, M., Yin, X., Rubaiyat, A.H.M., Li, S., Kolouri, S., Aldroubi, A., Nichols, J.M., Rohde, G.K.: Python code implementing the Radon cumulative distribution transform subspace model for image classification. https://github.com/rohdelab/rcdt_ns_classifier
Sertel, O., Kong, J., Shimada, H., Catalyurek, U.V., Saltz, J.H., Gurcan, M.N.: Computer-aided prognosis of neuroblastoma on whole-slide images: classification of stromal development. Pattern Recognit. 42(6), 1093–1103 (2009)
Basu, S., Kolouri, S., Rohde, G.K.: Detecting and visualizing cell phenotype differences from microscopy images using transport-based morphometry. Proc. Natl. Acad. Sci. 111(9), 3448–3453 (2014)
Kundu, S., Kolouri, S., Erickson, K.I., Kramer, A.F., McAuley, E., Rohde, G.K.: Discovery and visualization of structural biomarkers from MRI using transport-based morphometry. Neuroimage 167, 256–275 (2018)
Schulz, J.B., Borkert, J., Wolf, S., Schmitz-Hübsch, T., Rakowicz, M., Mariotti, C., Schoels, L., Timmann, D., Warrenburg, B., Dürr, A., Pandolfo, M., Kang, J., Mandly, A.G., Nagele, T., Grisoli, M., Boguslawska, R., Bauer, P., Klockgether, T., Hauser, T.: Visualization, quantification and correlation of brain atrophy with clinical symptoms in spinocerebellar ataxia types 1, 3 and 6. Neuroimage 49(1), 158–168 (2010)
Hadid, A., Heikkila, J.Y., Silvén, O., Pietikainen, M.: Face and eye detection for person authentication in mobile phones. In: 2007 First ACM/IEEE International Conference on Distributed Smart Cameras, pp. 101–108 (2007)
Shifat-E-Rabbi, M., Yin, X., Fitzgerald, C.E., Rohde, G.K.: Cell image classification: a comparative overview. Cytometry A 97A(4), 347–362 (2020)
Rawat, W., Wang, Z.: Deep convolutional neural networks for image classification: a comprehensive review. Neural Comput. 29(9), 2352–2449 (2017)
Lu, D., Weng, Q.: A survey of image classification methods and techniques for improving classification performance. Int. J. Remote Sens. 28(5), 823–870 (2007)
Prewitt, J.M.S., Mendelsohn, M.L.: The analysis of cell images. Ann. N. Y. Acad. Sci. 128(3), 1035–1053 (1966)
Orlov, N., Shamir, L., Macura, T., Johnston, J., Eckley, D.M., Goldberg, I.G.: WND-CHARM: multi-purpose image classification using compound image transforms. Pattern Recognit. Lett. 29(11), 1684–1693 (2008)
Ponomarev, G.V., Arlazarov, V.L., Gelfand, M.S., Kazanov, M.D.: ANA HEp-2 cells image classification using number, size, shape and localization of targeted cell regions. Pattern Recognit. 47(7), 2360–2366 (2014)
Bandos, T.V., Bruzzone, L., Camps-Valls, G.: Classification of hyperspectral images with regularized linear discriminant analysis. IEEE Trans. Geosci. Remote Sens. 47(3), 862–873 (2009)
Muldoon, T.J., Thekkek, N., Roblyer, D.M., Maru, D., Harpaz, N., Potack, J., Anandasabapathy, S., Richards-Kortum, R.R.: Evaluation of quantitative image analysis criteria for the high-resolution microendoscopic detection of neoplasia in Barrett’s esophagus. J. Biomed. Opt. 15(2), 026027 (2010)
Zhang, J., Marszałek, M., Lazebnik, S., Schmid, C.: Local features and kernels for classification of texture and object categories: a comprehensive study. Int. J. Comput. Vis. 73(2), 213–238 (2007)
Perronnin, F., Sánchez, J., Mensink, T.: Improving the Fisher kernel for large-scale image classification. In: European Conference on Computer Vision, pp. 143–156 (2010)
Bosch, A., Zisserman, A., Munoz, X.: Image classification using random forests and ferns. In: 2007 IEEE 11th International Conference on Computer Vision, pp. 1–8. IEEE (2007)
Du, P., Samat, A., Waske, B., Liu, S., Li, Z.: Random forest and rotation forest for fully polarized SAR image classification using polarimetric and spatial features. ISPRS J. Photogramm. Remote Sens. 105, 38–53 (2015)
LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436–444 (2015)
Shin, H.-C., Roth, H.R., Gao, M., Lu, L., Xu, Z., Nogues, I., Yao, J., Mollura, D., Summers, R.M.: Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE Trans. Med. Imaging 35(5), 1285–1298 (2016)
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2818–2826 (2016)
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A.: Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9 (2015)
Wolberg, G.: Image morphing: a survey. Vis. Comput. 14(8), 360–372 (1998)
Kolouri, S., Park, S.R., Rohde, G.K.: The Radon cumulative distribution transform and its application to image classification. IEEE Trans. Image Process. 25(2), 920–934 (2016)
Kolouri, S., Park, S.R., Thorpe, M., Slepcev, D., Rohde, G.K.: Optimal mass transport: signal processing and machine-learning applications. IEEE Signal Process. Mag. 34(4), 43–59 (2017)
Villani, C.: Optimal Transport: Old and New, vol. 338. Springer, Berlin (2008)
Wang, W., Slepčev, D., Basu, S., Ozolek, J.A., Rohde, G.K.: A linear optimal transportation framework for quantifying and visualizing variations in sets of images. Int. J. Comput. Vis. 101(2), 254–269 (2013)
Kolouri, S., Zou, Y., Rohde, G.K.: Sliced Wasserstein kernels for probability distributions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5258–5267 (2016)
Park, S.R., Cattell, L., Nichols, J.M., Watnik, A., Doster, T., Rohde, G.K.: De-multiplexing vortex modes in optical communications using transport-based pattern recognition. Opt. Express 26(4), 4004–4022 (2018)
Fitzgerald, C.E., Cattell, L., Rohde, G.K.: Training classifiers with limited data using the Radon cumulative distribution transform. Med. Imaging Image Process. 10574, 105742 (2018)
Park, S.R., Kolouri, S., Kundu, S., Rohde, G.K.: The cumulative distribution transform and linear pattern classification. Appl. Comput. Harmon. Anal. 45(3), 616–641 (2018)
Bracewell, R.N.: The Fourier Transform and Its Applications, vol. 31999. McGraw-Hill, New York (1986)
Yang, I.: A convex optimization approach to distributionally robust Markov decision processes with Wasserstein distance. IEEE Control Syst. Lett. 1(1), 164–169 (2017)
Quinto, E.T.: An introduction to X-ray tomography and Radon transforms. In: Proceedings of Symposia in Applied Mathematics, vol. 63, p. 1 (2006)
Natterer, F.: The Mathematics of Computerized Tomography. SIAM, Philadelphia (2001)
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition (2014). arXiv:1409.1556
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization (2014). arXiv:1412.6980
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2005)
Lowe, D.G.: Object recognition from local scale-invariant features. In: Proceedings of the Seventh IEEE International Conference on Computer Vision, vol. 2, pp. 1150–1157 (1999)
Lee, G.R., Gommers, R., Wasilewski, F., Wohlfahrt, K., O’Leary, A.: PyWavelets: a Python package for wavelet analysis. J. Open Source Softw. 4(36), 1237 (2019)
Kaggle: Sign Language MNIST. https://www.kaggle.com/datamunge/sign-language-mnist. Accessed 10 Mar 2020
Vondrick, C., Khosla, A., Malisiewicz, T., Torralba, A.: HOGgles: visualizing object detection features. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1–8 (2013)
Marcus, D.S., Wang, T.H., Parker, J., Csernansky, J.G., Morris, J.C., Buckner, R.L.: Open access series of imaging studies (OASIS): cross-sectional MRI data in young, middle aged, nondemented, and demented older adults. J. Cogn. Neurosci. 19(9), 1498–1507 (2007)
Krizhevsky, A., Hinton, G.: Learning multiple layers of features from tiny images (2009)
Gardner, M.W., Dorling, S.R.: Artificial neural networks (the multilayer perceptron)—a review of applications in the atmospheric sciences. Atmos. Environ. 32(14–15), 2627–2636 (1998)
Pampel, F.C.: Logistic Regression: A Primer. SAGE Publications Incorporated, Thousand Oaks (2020)
Rubaiyat, A.H., Hallam, K.M., Nichols, J.M., Hutchinson, M.N., Li, S., Rohde, G.K.: Parametric signal estimation using the cumulative distribution transform. IEEE Trans. Signal Process. 68, 3312–3324 (2020)
Nichols, J.M., Emerson, T.H., Cattell, L., Park, S., Kanaev, A., Bucholtz, F., Watnik, A., Doster, T., Rohde, G.K.: Transport-based model for turbulence-corrupted imagery. Appl. Opt. 57(16), 4524–4536 (2018)
Additional information
This work was supported in part by NIH Grants GM130825, GM090033.
Cite this article
Shifat-E-Rabbi, M., Yin, X., Rubaiyat, A.H.M. et al. Radon Cumulative Distribution Transform Subspace Modeling for Image Classification. J Math Imaging Vis 63, 1185–1203 (2021). https://doi.org/10.1007/s10851-021-01052-0
DOI: https://doi.org/10.1007/s10851-021-01052-0