Normalized Adaptive Variance Reduction Method

Authors: 姜伟, 杨斯凡, 王一博, 张利军

CLC number: TP311

Fund project: National Natural Science Foundation of China (62122037)



    Abstract:

    Stochastic optimization algorithms are essential for handling large-scale data and complex models in machine learning. Among them, variance reduction methods such as the STORM algorithm have attracted wide attention because they achieve the optimal $ {\mathrm{O}}\left({T}^{-1/3}\right) $ convergence rate in stochastic non-convex optimization. However, traditional variance reduction methods typically require specific problem parameters (e.g., the smoothness constant, the noise variance, and an upper bound on the gradient) to set the learning rate and momentum, which limits their practical applicability. To overcome this limitation, this study proposes an adaptive variance reduction method based on normalization, which requires no prior knowledge of problem parameters yet still attains the optimal convergence rate. Compared with existing adaptive variance reduction methods, the proposed approach offers several advantages: (1) it relies on no additional assumptions, such as bounded gradients, bounded function values, or an excessively large initial batch size; (2) it achieves the optimal $ {\mathrm{O}}\left({T}^{-1/3}\right) $ convergence rate without an extra $ {\mathrm{O}}\left(\mathrm{log}\,T\right) $ term; (3) its proof is concise and straightforward, facilitating extensions to other stochastic optimization problems. Finally, numerical experiments comparing the proposed method with existing approaches validate its superiority.
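To make the idea concrete, the following is a minimal NumPy sketch of a STORM-style recursive-momentum estimator combined with a normalized update step, so that the step size needs no smoothness or noise constants. This is an illustration only, not the paper's exact algorithm: the function name `normalized_storm`, the $\eta/t^{1/3}$ step schedule, and the $1/t^{2/3}$ momentum weight are assumed textbook-style choices, whereas the paper's method sets these adaptively without any problem parameters.

```python
import numpy as np

def normalized_storm(stoch_grad, x0, T, eta=0.5, rng=None):
    """Sketch of a STORM-style variance-reduced estimator with a
    normalized step (illustrative schedules, not the paper's rules).
    stoch_grad(x, xi) evaluates the stochastic gradient at x for a
    given random sample xi, so the SAME sample can be reused at two
    consecutive points -- the key to the recursive variance reduction."""
    rng = rng or np.random.default_rng(0)
    x = np.asarray(x0, dtype=float)
    xi = rng.standard_normal(x.shape)
    d = stoch_grad(x, xi)                       # initial estimator d_1
    for t in range(1, T + 1):
        step = eta / t ** (1 / 3)               # O(t^{-1/3}) schedule (assumed)
        # normalized update: direction d / ||d||, so no gradient bound is needed
        x_prev, x = x, x - step * d / (np.linalg.norm(d) + 1e-12)
        a = min(1.0, 1.0 / t ** (2 / 3))        # momentum weight (assumed)
        xi = rng.standard_normal(x.shape)       # fresh sample xi_{t+1}
        # recursive momentum: d <- g(x; xi) + (1 - a) * (d - g(x_prev; xi)),
        # reusing the same xi at both points
        d = stoch_grad(x, xi) + (1 - a) * (d - stoch_grad(x_prev, xi))
    return x

# usage: noisy gradients of f(x) = 0.5 * ||x||^2, i.e. g(x; xi) = x + 0.1 * xi
g = lambda x, xi: x + 0.1 * xi
x_final = normalized_storm(g, x0=np.ones(5), T=2000)
```

Because the update direction is normalized, the iterates move a bounded distance per step regardless of how large or noisy the gradient estimate is, which is what removes the dependence on problem parameters in the step size.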

Cite this article:

姜伟, 杨斯凡, 王一博, 张利军. Normalized Adaptive Variance Reduction Method (基于归一化的自适应方差缩减方法). Journal of Software (软件学报), (): 1-13.

History:
  • Received: 2024-09-05
  • Last revised: 2024-11-13
  • Published online: 2025-04-18
Copyright © Institute of Software, Chinese Academy of Sciences (中国科学院软件研究所). 京ICP备05046678号-3
Address: No. 4, South Fourth Street, Zhongguancun, Haidian District, Beijing, 100190
Tel: 010-62562563  Fax: 010-62562533  Email: jos@iscas.ac.cn