
A linear relation between input and first layer in neural networks

Annals of Mathematics and Artificial Intelligence

Abstract

Artificial neural networks are growing in both number of applications and complexity, which makes minimizing the number of units important for some practical implementations. A particular problem is determining the minimum number of units that a feed-forward neural network needs in its first layer. To study this problem, we define a family of classification problems following a continuity hypothesis, where inputs that are close to some set of points may share the same category. Given a set S of \(k\)-dimensional inputs, and letting \(\mathcal{N}\) be a feed-forward neural network that classifies every input in S within a fixed error, we prove that \(\mathcal{N}\) requires \(\Theta(k)\) units in its first layer if \(\mathcal{N}\) can solve any instance from the given family of classification problems. Furthermore, this asymptotic result is optimal.
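The main claim can be summarized as follows; this is a hedged restatement built only from the abstract, in which the error bound \(\varepsilon\) and the first-layer width \(n_{1}(\mathcal{N})\) are placeholder symbols rather than notation fixed by the paper:

\[
S \subseteq \mathbb{R}^{k},\quad \mathcal{N} \text{ classifies every } x \in S \text{ within error } \varepsilon
\;\Longrightarrow\; n_{1}(\mathcal{N}) = \Theta(k),
\]

provided that \(\mathcal{N}\) can solve every instance of the defined family of classification problems. The matching upper bound is what makes the \(\Theta(k)\) estimate asymptotically optimal.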



Author information

Corresponding author

Correspondence to Sebastián A. Grillo.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Grillo, S.A. A linear relation between input and first layer in neural networks. Ann Math Artif Intell 87, 361–372 (2019). https://doi.org/10.1007/s10472-019-09657-3
