Restricted Boltzmann Machines: A Review
Authors:
About the authors:

ZHANG Jian (1990-), male, born in Tai'an, Shandong, Ph.D. candidate; his research interests include deep learning and Boltzmann machines. DU Peng (1994-), male, M.S. candidate; his research interests include deep learning and data mining. DING Shifei (1963-), male, Ph.D., professor, Ph.D. supervisor, CCF distinguished member; his research interests include artificial intelligence, pattern recognition, machine learning, and data mining. DU Wei (1994-), male, M.S. candidate; his research interests include deep learning and reinforcement learning. ZHANG Nan (1991-), male, Ph.D. candidate, CCF student member; his research interests include machine learning and Boltzmann machines. YU Wenjia (1994-), male, M.S. candidate; his research interests include deep learning and generative adversarial networks.

Corresponding author:

DING Shifei, E-mail: dingsf@cumt.edu.cn

CLC number:

TP181

Fund project:

National Natural Science Foundation of China (61672522, 61379101); National Key Basic Research Program of China (973 Program) (2013CB329502); Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX19_2166); Postgraduate Research & Practice Innovation Program of China University of Mining and Technology (KYCX19_2166)



Abstract:

Probabilistic graphical models are a current focus of machine learning research, and generative models built on them have been widely applied in fields such as image and speech processing. The restricted Boltzmann machine (RBM) is a probabilistic undirected graphical model of significant research value for modeling data distributions: combined with convolution operators, RBMs can be used to construct deep discriminative models and provide statistical-mechanical theoretical support for deep networks; combined with directed graphs, they can be used to build generative models that supply priors with multimodal distributions. This paper surveys research on probabilistic graphical models based on RBMs. First, it introduces the basic concepts and training algorithms of RBM-based machine learning models, discusses the connections among the maximum-likelihood-based training algorithms, and compares their log-likelihood losses. Second, it reviews recent advances in RBM models, including introducing adversarial losses and the Wasserstein distance into the objective function, constructing variational autoencoders (VAEs) with RBM priors and RBM models based on adversarial losses, and discusses the connections and differences among the real-valued RBM models. Finally, it surveys applications of RBM-based models in deep learning and discusses open problems and future research directions for neural networks and RBM models.
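The maximum-likelihood training that the abstract refers to is conventionally stated in terms of the RBM energy function. As a sketch of the standard formulation (the notation below is the common one from the RBM literature, not taken from this page):

```latex
% Energy and marginal of a binary RBM with visible units v, hidden units h,
% biases a, b, weight matrix W, and partition function Z:
E(\mathbf{v},\mathbf{h}) = -\mathbf{a}^{\top}\mathbf{v} - \mathbf{b}^{\top}\mathbf{h} - \mathbf{v}^{\top}W\mathbf{h},
\qquad
P(\mathbf{v}) = \frac{1}{Z}\sum_{\mathbf{h}} e^{-E(\mathbf{v},\mathbf{h})}

% Log-likelihood gradient: a data expectation minus a model expectation.
% The training algorithms the survey compares (CD, PCD, parallel tempering,
% tempered transitions) differ mainly in how the model term is approximated.
\frac{\partial \log P(\mathbf{v})}{\partial W_{ij}}
  = \langle v_i h_j \rangle_{\text{data}} - \langle v_i h_j \rangle_{\text{model}}
```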

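As a concrete illustration of contrastive-divergence training, here is a minimal sketch of a Bernoulli-Bernoulli RBM trained with CD-1. It is a toy implementation under standard assumptions; the class and names (`RBM`, `n_visible`, `cd1_step`, `lr`) are illustrative and not taken from the paper.

```python
# Minimal Bernoulli-Bernoulli RBM trained with one-step contrastive
# divergence (CD-1): approximate the model expectation with a single
# Gibbs step started from the data.
import numpy as np

rng = np.random.default_rng(0)

class RBM:
    def __init__(self, n_visible, n_hidden, lr=0.1):
        self.W = rng.normal(0, 0.01, size=(n_visible, n_hidden))
        self.b = np.zeros(n_visible)   # visible biases
        self.c = np.zeros(n_hidden)    # hidden biases
        self.lr = lr

    @staticmethod
    def _sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def sample_h(self, v):
        """P(h=1|v) and a binary sample of h."""
        p = self._sigmoid(v @ self.W + self.c)
        return p, (rng.random(p.shape) < p).astype(float)

    def sample_v(self, h):
        """P(v=1|h) and a binary sample of v."""
        p = self._sigmoid(h @ self.W.T + self.b)
        return p, (rng.random(p.shape) < p).astype(float)

    def cd1_step(self, v0):
        # Positive phase: hidden activations driven by the data.
        ph0, h0 = self.sample_h(v0)
        # Negative phase: one Gibbs step v0 -> h0 -> v1 -> h1.
        pv1, _ = self.sample_v(h0)
        ph1, _ = self.sample_h(pv1)
        # Gradient approximation: <v h>_data - <v h>_model.
        batch = v0.shape[0]
        self.W += self.lr * (v0.T @ ph0 - pv1.T @ ph1) / batch
        self.b += self.lr * (v0 - pv1).mean(axis=0)
        self.c += self.lr * (ph0 - ph1).mean(axis=0)
        # Reconstruction error: a rough training monitor, not the likelihood.
        return float(np.mean((v0 - pv1) ** 2))
```

The single Gibbs step is the piece that the later algorithms the survey covers replace: persistent contrastive divergence keeps the negative-phase chain alive across updates, and parallel tempering runs chains at several temperatures to mix better.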
Cite this article:

Zhang J, Ding SF, Zhang N, Du P, Du W, Yu WJ. Restricted Boltzmann machines: A review. Journal of Software, 2019,30(7):2073-2090 (in Chinese).

History:
  • Received: 2018-08-20
  • Revised: 2018-12-27
  • Published online: 2019-04-11