
Cross-Entropy Reduction of Data Matrix with Restriction on Information Capacity of the Projectors and Their Norms

Published in Mathematical Models and Computer Simulations

Abstract

We propose a method for reducing the dimension of a data matrix, based on its direct and inverse projection and on the computation of projectors that minimize a cross-entropy functional. We introduce the concept of the information capacity of a matrix, which is used as a constraint in the optimal reduction problem. The proposed method is compared with known methods on a binary classification problem.
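Only the abstract is available in this preview, so the sketch below is an illustrative reading of it rather than the authors' algorithm: a data matrix X is compressed by a direct projector Q, reconstructed by an inverse projector P, and both are fitted by minimizing a cross-entropy-type functional. Because the paper's information-capacity constraint is not spelled out here, it is replaced by a simple bound on the projectors' squared Frobenius norms; all names (reduce_cross_entropy, Q, P, norm_bound) are assumptions introduced for illustration.

# Illustrative sketch only (assumed formulation, not the paper's exact one):
# X (m x n) is projected down by Q (n x r) and back up by P (r x n); Q and P
# are chosen to minimize a cross-entropy between the normalized data matrix
# and its reconstruction X Q P, under a norm bound standing in for the
# paper's information-capacity constraint.
import numpy as np
from scipy.optimize import minimize


def reduce_cross_entropy(X, r, norm_bound=10.0, seed=0):
    n = X.shape[1]
    rng = np.random.default_rng(seed)

    def unpack(z):
        Q = z[:n * r].reshape(n, r)   # direct projector:  Y = X Q      (m x r)
        P = z[n * r:].reshape(r, n)   # inverse projector: X_hat = Y P  (m x n)
        return Q, P

    # Normalize X to a probability-like matrix so a cross-entropy is well defined.
    Xp = X - X.min() + 1e-9
    Xp = Xp / Xp.sum()

    def objective(z):
        Q, P = unpack(z)
        Xhat = np.abs(X @ Q @ P) + 1e-9
        Xhat = Xhat / Xhat.sum()
        return float(-np.sum(Xp * np.log(Xhat)))  # cross-entropy functional

    def capacity(z):
        # Placeholder for the information-capacity constraint: bound on the
        # squared Frobenius norms of both projectors.
        Q, P = unpack(z)
        return norm_bound - np.linalg.norm(Q) ** 2 - np.linalg.norm(P) ** 2

    z0 = 0.1 * rng.standard_normal(n * r + r * n)
    res = minimize(objective, z0, method="SLSQP",
                   constraints=[{"type": "ineq", "fun": capacity}],
                   options={"maxiter": 100})
    Q, P = unpack(res.x)
    return X @ Q, Q, P  # reduced data and the fitted projectors


# Example: reduce a 100 x 20 random matrix to 5 features.
X = np.random.default_rng(1).random((100, 20))
Y, Q, P = reduce_cross_entropy(X, r=5)
print(Y.shape)  # -> (100, 5)

The SLSQP solver is used here only because it handles the inequality constraint out of the box; the authors' actual functional, constraint, and optimization scheme may differ.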


Notes

  1. Strictly speaking, the nonnegativity of the matrix Q is not required for what follows, but it greatly simplifies the design problem.


Funding

This study was supported by the Russian Foundation for Basic Research (projects 17-29-03119, 20-07-00470).

Author information


Correspondence to Yu. S. Popkov, A. Yu. Popkov or Yu. A. Dubnov.


About this article


Cite this article

Popkov, Y.S., Popkov, A.Y. & Dubnov, Y.A. Cross-Entropy Reduction of Data Matrix with Restriction on Information Capacity of the Projectors and Their Norms. Math Models Comput Simul 13, 382–394 (2021). https://doi.org/10.1134/S2070048221030145
