[Keywords]
[Abstract]
Compared with gradient descent, the adaptive gradient method (AdaGrad) keeps the geometric information of historical data by using the arithmetic average of past squared gradients, and obtains tighter convergence bounds when coping with sparse data. On the other hand, by adding a momentum term to gradient descent, Nesterov's accelerated gradient (NAG) not only achieves an order-of-magnitude acceleration in convergence for smooth convex optimization problems, but also attains the optimal individual convergence rate for non-smooth convex problems. Recently, there have been studies combining adaptive strategies with NAG. However, the representative adaptive NAG method, AcceleGrad, adopts an adaptive scheme different from AdaGrad's, so its step size fails to reflect differences across dimensions; it only obtains a convergence rate in the sense of weighted averaging, and a theoretical analysis of its individual convergence rate is still lacking. This study proposes an adaptive NAG method that inherits AdaGrad's step size setting, and proves that the proposed algorithm attains the optimal individual convergence rate when solving constrained non-smooth convex optimization problems. Experiments are conducted on typical optimization problems, hinge loss for classification and L1 loss for regression under an L1-norm constraint. Experimental results verify the correctness of the theoretical analysis and show that the proposed algorithm outperforms AcceleGrad.
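Since the abstract only names the ingredients, the following minimal Python sketch may help fix ideas: a NAG-style extrapolation step driven by AdaGrad's coordinate-wise step sizes, with projection onto an L1 ball. Everything here is an illustrative assumption rather than the paper's exact update rule: the names adaptive_nag and project_l1_ball, the parameter eta, and the momentum schedule (t - 1)/(t + 2) are all hypothetical choices made for this sketch.

    import numpy as np

    def project_l1_ball(v, radius=1.0):
        # Euclidean projection onto {x : ||x||_1 <= radius},
        # via the standard sort-and-threshold procedure.
        if np.abs(v).sum() <= radius:
            return v
        u = np.sort(np.abs(v))[::-1]          # magnitudes, descending
        css = np.cumsum(u)
        k = np.nonzero(u * np.arange(1, len(v) + 1) > css - radius)[0][-1]
        theta = (css[k] - radius) / (k + 1.0)  # soft-threshold level
        return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

    def adaptive_nag(subgrad, x0, steps=1000, eta=1.0, radius=1.0, eps=1e-8):
        # Hypothetical sketch: NAG-style momentum combined with
        # AdaGrad's coordinate-wise step sizes, projected onto an L1 ball.
        x = x0.copy()
        y = x0.copy()
        g_sq = np.zeros_like(x0)               # running sum of squared subgradients
        for t in range(1, steps + 1):
            g = subgrad(y)
            g_sq += g * g
            step = eta / (np.sqrt(g_sq) + eps)  # AdaGrad diagonal step size
            x_next = project_l1_ball(y - step * g, radius)
            # Nesterov-style extrapolation (schematic momentum weights)
            y = x_next + (t - 1.0) / (t + 2.0) * (x_next - x)
            x = x_next
        return x

For the experiments described above, subgrad would return a subgradient of the hinge loss (classification) or the L1 loss (regression) on the training data; the coordinate-wise step vector is what lets different dimensions adapt independently, which is the property the abstract says AcceleGrad's scalar adaptive scheme lacks.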
[CLC number]
[Funding]
National Natural Science Foundation of China (62076252, 62106281)