Deep Neural Network Structures Solving Variational Inequalities

Combettes, Patrick L.; Pesquet, Jean-Christophe

doi:10.1007/s11228-019-00526-z

Published: 13 February 2020

Volume 28, pages 491–518, (2020)
Set-Valued and Variational Analysis

Patrick L. Combettes¹ &
Jean-Christophe Pesquet²

Abstract

Motivated by structures that appear in deep neural networks, we investigate nonlinear composite models alternating proximity and affine operators defined on different spaces. We first show that a wide range of activation operators used in neural networks are actually proximity operators. We then establish conditions for the averagedness of the proposed composite constructs and investigate their asymptotic properties. It is shown that the limit of the resulting process solves a variational inequality which, in general, does not derive from a minimization problem. The analysis relies on tools from monotone operator theory and sheds some light on a class of neural network structures with so far elusive asymptotic properties.
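
As a small illustration of the claim that activation operators can be proximity operators, the sketch below checks numerically that the ReLU activation coincides with the proximity operator of the indicator function of the nonnegative half-line, that is, the projection onto [0, +∞). It is not taken from the article: the function names, the test points, and the grid used for the brute-force minimization are all assumptions chosen for illustration. It also checks firm nonexpansiveness, a standard property of proximity operators, on random pairs.

```python
import numpy as np


def relu(x):
    """Componentwise ReLU activation."""
    return np.maximum(x, 0.0)


def prox_indicator_nonneg_grid(x, grid):
    """Brute-force proximity operator of the indicator of [0, +inf).

    For each entry of x, minimize 0.5 * (y - x)^2 over the grid points y
    that satisfy y >= 0 (hypothetical helper, for illustration only).
    """
    feasible = grid[grid >= 0.0]
    costs = 0.5 * (feasible[None, :] - x[:, None]) ** 2
    return feasible[np.argmin(costs, axis=1)]


# Test points and grid are illustrative assumptions; the grid is built from
# integers so that 0, 0.75 and 3 are represented exactly.
x = np.array([-2.0, -0.5, 0.0, 0.75, 3.0])
grid = np.arange(-1600, 1601) / 400.0
p = prox_indicator_nonneg_grid(x, grid)
max_gap = float(np.max(np.abs(p - relu(x))))

# Firm nonexpansiveness: ||Tu - Tv||^2 <= <u - v, Tu - Tv> for all u, v.
rng = np.random.default_rng(0)
u = rng.normal(size=(1000, 8))
v = rng.normal(size=(1000, 8))
d_in = u - v
d_out = relu(u) - relu(v)
lhs = np.sum(d_out ** 2, axis=1)
rhs = np.sum(d_in * d_out, axis=1)
firmly_nonexpansive = bool(np.all(lhs <= rhs + 1e-12))

print("max |prox - relu| on test points:", max_gap)
print("firm nonexpansiveness holds on samples:", firmly_nonexpansive)
```

The grid search only approximates the minimizer in general; here the test points are chosen to lie on the grid so that the comparison is meaningful.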

Author information

Authors and Affiliations

Department of Mathematics, North Carolina State University, Raleigh, NC, 27695-8205, USA
Patrick L. Combettes
CentraleSupélec, Center for Visual Computing, OPIS Inria Project Team, Université Paris-Saclay, 91190, Gif sur Yvette, France
Jean-Christophe Pesquet

Corresponding author

Correspondence to Patrick L. Combettes.

Additional information

The work of P. L. Combettes was supported by the National Science Foundation under grant CCF-1715671. The work of J.-C. Pesquet was supported by Institut Universitaire de France.

About this article

Cite this article

Combettes, P.L., Pesquet, J.-C. Deep Neural Network Structures Solving Variational Inequalities. Set-Valued Var. Anal. 28, 491–518 (2020). https://doi.org/10.1007/s11228-019-00526-z

Received: 25 April 2019
Accepted: 09 December 2019
Published: 13 February 2020
Issue Date: September 2020
DOI: https://doi.org/10.1007/s11228-019-00526-z
