Theoretical Notes on Unsupervised Learning in Deep Neural Networks

Vladimir Golovko, Aliaksandr Kroshchanka

2016

Abstract

Over the last decade, deep neural networks have become a powerful tool in the domain of machine learning. An important problem is the training of deep neural networks, because learning in such networks is considerably more complicated than in shallow neural networks. This is due to the vanishing gradient problem, poor local minima, and the unstable gradient problem. Therefore, many deep learning techniques have been developed that allow us to overcome some limitations of conventional training approaches. In this paper we investigate unsupervised learning in deep neural networks. We prove that maximizing the log-likelihood of the input data distribution of a restricted Boltzmann machine is equivalent to minimizing the cross-entropy and to a special case of minimizing the mean squared error. The main contribution of this paper is a novel view and new understanding of unsupervised learning in deep neural networks.
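
To make the abstract's equivalence claim concrete, the following is a minimal sketch of the standard restricted Boltzmann machine quantities involved. The notation (binary visible units v_i, hidden units h_j, weights w_ij, biases a_i and b_j, reconstruction \hat{v}) is ours, chosen for illustration, and is not necessarily the notation used in the paper:

% energy and marginal input distribution of a binary RBM (notation assumed for illustration)
\[
E(\mathbf{v},\mathbf{h}) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_{i,j} v_i w_{ij} h_j,
\qquad
p(\mathbf{v}) = \frac{1}{Z}\sum_{\mathbf{h}} e^{-E(\mathbf{v},\mathbf{h})},
\]
% exact log-likelihood gradient, and the cross-entropy between an input and its reconstruction
\[
\frac{\partial \log p(\mathbf{v})}{\partial w_{ij}}
= \langle v_i h_j \rangle_{\mathrm{data}} - \langle v_i h_j \rangle_{\mathrm{model}},
\qquad
\mathrm{CE}(\mathbf{v},\hat{\mathbf{v}})
= -\sum_i \bigl[\, v_i \log \hat{v}_i + (1 - v_i)\log(1 - \hat{v}_i) \,\bigr].
\]

For a sigmoid reconstruction \hat{v}_i, the cross-entropy gradient with respect to the pre-activation reduces to \hat{v}_i - v_i, which is the same factor that drives the squared-error gradient of \frac{1}{2}\sum_i(\hat{v}_i - v_i)^2 up to the sigmoid derivative \hat{v}_i(1-\hat{v}_i). This sketch only indicates the objects involved; the precise conditions under which the three criteria coincide are established in the paper itself.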



Paper Citation


in Harvard Style

Golovko V. and Kroshchanka A. (2016). Theoretical Notes on Unsupervised Learning in Deep Neural Networks. In Proceedings of the 8th International Joint Conference on Computational Intelligence - Volume 2: NCTA, (IJCCI 2016) ISBN 978-989-758-201-1, pages 91-96. DOI: 10.5220/0006084300910096


in Bibtex Style

@conference{ncta16,
author={Vladimir Golovko and Aliaksandr Kroshchanka},
title={Theoretical Notes on Unsupervised Learning in Deep Neural Networks},
booktitle={Proceedings of the 8th International Joint Conference on Computational Intelligence - Volume 2: NCTA, (IJCCI 2016)},
year={2016},
pages={91-96},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006084300910096},
isbn={978-989-758-201-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 8th International Joint Conference on Computational Intelligence - Volume 2: NCTA, (IJCCI 2016)
TI - Theoretical Notes on Unsupervised Learning in Deep Neural Networks
SN - 978-989-758-201-1
AU - Golovko V.
AU - Kroshchanka A.
PY - 2016
SP - 91
EP - 96
DO - 10.5220/0006084300910096