Abstract
We propose a method for reducing the dimension of a data matrix based on its direct and inverse projection and on the calculation of projectors that minimize a cross-entropy functional. We introduce the concept of the information capacity of a matrix, which is used as a constraint in the optimal reduction problem. The proposed method is compared with known methods in a binary classification problem.
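For a reader who wants to experiment with the scheme, the following is a minimal Python sketch of the optimization setup described above. It is not the authors' exact formulation: the paper's definition of the information capacity is replaced here by a Frobenius-norm proxy, the projector shapes and the names reduce_matrix, cross_entropy, and capacity_bound are illustrative assumptions, and SciPy's SLSQP solver stands in for whatever numerical scheme the paper actually employs.

    import numpy as np
    from scipy.optimize import minimize

    def cross_entropy(X, X_hat, eps=1e-12):
        """Kullback-Leibler-type cross-entropy between nonnegative matrices."""
        return np.sum(X * np.log((X + eps) / (X_hat + eps)))

    def reduce_matrix(X, r, capacity_bound):
        """Find a direct projector Q (n x r) and an inverse projector P (r x n)
        minimizing the cross-entropy between X and its reconstruction X @ Q @ P,
        subject to an upper bound on a norm-based proxy for the 'information
        capacity' of the projectors (an assumption, not the paper's definition)."""
        m, n = X.shape

        def unpack(z):
            Q = z[: n * r].reshape(n, r)
            P = z[n * r:].reshape(r, n)
            return Q, P

        def objective(z):
            Q, P = unpack(z)
            return cross_entropy(X, X @ Q @ P)

        def capacity_constraint(z):
            # SLSQP inequality constraints require a nonnegative value:
            # capacity_bound - (proxy capacity) >= 0.
            Q, P = unpack(z)
            return capacity_bound - (np.linalg.norm(Q) ** 2 + np.linalg.norm(P) ** 2)

        z0 = np.full(2 * n * r, 0.1)  # nonnegative start keeps the reconstruction positive
        res = minimize(
            objective,
            z0,
            method="SLSQP",
            bounds=[(0.0, None)] * z0.size,  # nonnegative projectors, cf. the note on Q below
            constraints=[{"type": "ineq", "fun": capacity_constraint}],
        )
        return unpack(res.x)

    # Usage on synthetic nonnegative data:
    rng = np.random.default_rng(0)
    X = rng.random((20, 6))                  # nonnegative data matrix
    Q, P = reduce_matrix(X, r=2, capacity_bound=10.0)
    Y = X @ Q                                # reduced representation of X

The nonnegativity bounds on the projector entries mirror the simplification discussed in the Notes; lifting them is possible in principle but makes the design problem harder.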
Notes
Strictly speaking, the nonnegativity of the matrix Q is not required for what follows, but it greatly simplifies the design problem.
Funding
This study was supported by the Russian Foundation for Basic Research (projects 17-29-03119, 20-07-00470).
Cite this article
Popkov, Y.S., Popkov, A.Y. & Dubnov, Y.A. Cross-Entropy Reduction of Data Matrix with Restriction on Information Capacity of the Projectors and Their Norms. Math Models Comput Simul 13, 382–394 (2021). https://doi.org/10.1134/S2070048221030145