Individual Convergence of NAG with Biased Gradient in Nonsmooth Cases

Authors: LIU Yuxiang, CHENG Yujia, TAO Qing

Affiliation:

About the authors: LIU Yuxiang (1992-), male, born in Yongfeng, Jiangxi, M.S., research interests: machine learning and pattern recognition. TAO Qing (1965-), male, Ph.D., professor, Ph.D. supervisor, senior member of CCF, research interests: machine learning, pattern recognition, and applied mathematics. CHENG Yujia (1996-), female, M.S., research interests: machine learning and pattern recognition.

Corresponding author: TAO Qing, E-mail: qing.tao@ia.ac.cn

CLC number: TP181

Fund project: National Natural Science Foundation of China (61673394)

Abstract:

Stochastic optimization methods have become the first choice for large-scale regularization and deep learning optimization problems. Their convergence rates are usually established under the assumption that the gradient of the objective function is an unbiased estimate; in machine learning, however, many situations give rise to biased gradients. Unlike the unbiased case, the well-known Nesterov accelerated gradient (NAG) method accumulates the gradient bias at every iteration, so the optimal convergence rate can no longer be attained and even convergence itself may fail. Recent work has shown that NAG is also an accelerated algorithm for the individual convergence of projected subgradient methods in nonsmooth cases, but the effect of biased subgradients on it has not yet been studied. For nonsmooth optimization problems, this study proves that NAG attains a stable individual convergence bound when the subgradient bias is bounded, and that the optimal individual convergence rate is still achieved when the bias decays at an appropriate rate. As an application, an inexact projected subgradient method is obtained in which the projection need not be computed exactly; the derived algorithm reaches a stable learning accuracy more quickly while retaining convergence. Experiments verify the correctness of the theoretical analysis and the performance of the inexact method.
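The inexact projected subgradient scheme described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the objective f(x) = ||Ax - b||_1, the Euclidean-ball constraint, the 1/sqrt(t) step sizes, the (t-1)/(t+2) momentum weights, and the 1/t^2 decay of the projection error are all assumptions chosen for the demo. Only the overall pattern follows the abstract: a Nesterov-style extrapolation step, a subgradient step, and a projection that is deliberately not computed exactly, with an error that decays over the iterations.

```python
import numpy as np

rng = np.random.default_rng(0)

def subgrad(A, b, x):
    """A subgradient of the nonsmooth objective f(x) = ||Ax - b||_1."""
    return A.T @ np.sign(A @ x - b)

def inexact_project(v, radius, err):
    """Projection onto the Euclidean ball of the given radius, perturbed
    by an additive error of norm `err`, modeling a projection oracle
    that need not be computed exactly."""
    nrm = np.linalg.norm(v)
    p = v if nrm <= radius else v * (radius / nrm)
    d = rng.normal(size=v.shape)
    return p + err * d / np.linalg.norm(d)

def nag_subgradient(A, b, radius=5.0, n_iter=2000):
    """Projected subgradient method with a Nesterov-style momentum step
    and a projection error decaying like 1/t^2 (a generic textbook
    schedule, not the exact one analyzed in the paper)."""
    x = np.zeros(A.shape[1])
    x_prev = x.copy()
    for t in range(1, n_iter + 1):
        y = x + (t - 1) / (t + 2) * (x - x_prev)  # extrapolation point
        g = subgrad(A, b, y)
        step = 1.0 / np.sqrt(t)
        x_prev, x = x, inexact_project(y - step * g, radius, 1.0 / t**2)
    return x  # the individual (last) iterate, as in the paper's setting
```

Note that the method returns the last iterate rather than an average, matching the paper's focus on individual convergence; the decaying projection error plays the role of the decaying subgradient bias in the theoretical results.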

Cite this article:

LIU Yuxiang, CHENG Yujia, TAO Qing. Individual convergence of NAG with biased gradient in nonsmooth cases. Journal of Software, 2020, 31(4): 1051-1062

History:
  • Received: 2019-05-31
  • Revised: 2019-08-01
  • Online: 2020-01-14
  • Published: 2020-04-06
Copyright: Institute of Software, Chinese Academy of Sciences