Restricted Boltzmann Machines: A Review
Author: Zhang Jian, Ding Shifei, Zhang Nan, Du Peng, Du Wei, Yu Wenjia
Affiliation: China University of Mining and Technology

CLC Number: TP181

Fund Project: National Natural Science Foundation of China (61672522, 61379101); National Key Basic Research Program of China (973 Program) (2013CB329502); Postgraduate Research & Practice Innovation Program of China University of Mining and Technology (KYCX19_2166); Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX19_2166)

Abstract:

Probabilistic graphical models are currently a research hotspot in machine learning, and generative models built on them have been widely used in image generation and speech processing. The restricted Boltzmann machine (RBM) is a probabilistic undirected graphical model of significant research value for modeling data distributions: on the one hand, RBMs can serve as building blocks for deep neural networks; on the other hand, they provide a statistical foundation for deep architectures. This paper surveys research on RBM-based probabilistic graphical models and their applications in image recognition. It first introduces the basic concepts and training algorithms of RBMs, then summarizes the applications of RBMs in deep learning, discusses open problems in the study of neural networks and RBMs, and finally offers a summary and outlook for research on RBMs.
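For orientation, the binary RBM referred to above can be written down explicitly. The equations below follow the standard textbook convention (visible units v, hidden units h, weight matrix W, biases b and c); this notation is ours, not taken from the survey:

    E(v, h) = -\sum_i b_i v_i - \sum_j c_j h_j - \sum_{i,j} v_i W_{ij} h_j

    P(v, h) = \frac{1}{Z} e^{-E(v, h)}, \quad P(v) = \frac{1}{Z} \sum_h e^{-E(v, h)}, \quad Z = \sum_{v, h} e^{-E(v, h)}

    P(h_j = 1 \mid v) = \sigma\left(c_j + \sum_i W_{ij} v_i\right), \quad P(v_i = 1 \mid h) = \sigma\left(b_i + \sum_j W_{ij} h_j\right)

where \sigma is the logistic sigmoid. Because the graph is bipartite (no visible-visible or hidden-hidden connections), the conditionals factorize, which is the property that training algorithms exploit.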
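Likewise, since the survey's first part covers RBM training, a minimal contrastive-divergence (CD-1) sketch may help fix ideas. It illustrates the standard algorithm with binary units and plain NumPy; it is not code from the paper, and all class and function names are our own:

    # Minimal CD-1 training sketch for a binary RBM (illustrative names throughout).
    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    class RBM:
        def __init__(self, n_visible, n_hidden):
            self.W = rng.normal(0.0, 0.01, size=(n_visible, n_hidden))
            self.b = np.zeros(n_visible)  # visible biases
            self.c = np.zeros(n_hidden)   # hidden biases

        def sample_h(self, v):
            p = sigmoid(v @ self.W + self.c)        # P(h = 1 | v)
            return p, (rng.random(p.shape) < p).astype(float)

        def sample_v(self, h):
            p = sigmoid(h @ self.W.T + self.b)      # P(v = 1 | h)
            return p, (rng.random(p.shape) < p).astype(float)

        def cd1_update(self, v0, lr=0.1):
            ph0, h0 = self.sample_h(v0)             # positive phase: data statistics
            pv1, _ = self.sample_v(h0)              # one Gibbs step (the "1" in CD-1)
            ph1, _ = self.sample_h(pv1)             # negative phase: model statistics
            n = v0.shape[0]
            # Approximate log-likelihood gradient: data term minus reconstruction term
            self.W += lr * (v0.T @ ph0 - pv1.T @ ph1) / n
            self.b += lr * (v0 - pv1).mean(axis=0)
            self.c += lr * (ph0 - ph1).mean(axis=0)

    # Usage on a stand-in batch of binary vectors (e.g., binarized image patches).
    rbm = RBM(n_visible=784, n_hidden=128)
    batch = (rng.random((32, 784)) < 0.5).astype(float)
    for _ in range(100):
        rbm.cd1_update(batch)

The training variants the survey discusses, such as persistent contrastive divergence and parallel tempering, differ mainly in replacing the single Gibbs step inside cd1_update with longer or persistent chains.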

Get Citation

Zhang J, Ding SF, Zhang N, Du P, Du W, Yu WJ. Restricted Boltzmann machines: A review. Journal of Software, 2019,30(7):2073-2090 (in Chinese).

History
  • Received: August 20, 2018
  • Revised: December 27, 2018
  • Online: April 11, 2019