Green Derivative-free Optimization Method with Dynamic Batch Evaluation

Corresponding author:

Zhou Aimin, E-mail: amzhou@cs.ecnu.edu.cn

Fund Project:

National Natural Science Foundation of China (62106076); Shanghai "Science and Technology Innovation Action Plan" Special Project for Artificial Intelligence Science and Technology Support (22511105901); CCF-Ant Research Fund (CCF-AFSG RF20220205); Natural Science Foundation of Shanghai (21ZR1420300)
    Abstract:

    Derivative-free optimization is commonly used for tasks such as black-box prompt tuning in the language-model-as-a-service setting and hyper-parameter tuning of machine learning models, where the mapping from the solution space to the performance indicator is too intricate to express as an explicit objective function. Accurate and stable evaluation of solutions is crucial for derivative-free optimization. Evaluating a single solution often requires running the model once on the entire dataset, and the optimization process may need a large number of such evaluations. As machine learning models grow more complex and training sets grow larger, the time and computational cost of accurate and stable solution evaluation keeps rising, which runs counter to the principle of green and low-carbon machine learning and optimization. To address this, this study proposes GRACE, a green derivative-free optimization framework with dynamic batch evaluation. Based on the similarity of training subsets, GRACE adaptively adjusts the number of samples used to evaluate solutions during optimization, thereby maintaining optimization performance while reducing optimization cost, in pursuit of green, low-carbon, and efficient optimization. Experiments are conducted on real-world tasks including black-box prompt tuning for language-model-as-a-service and model hyper-parameter optimization. Comparisons with a range of baseline methods and with ablated variants of GRACE verify its effectiveness, efficiency, and green, low-carbon merits. Hyper-parameter analysis further shows that GRACE is robust to its own hyper-parameters.
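
The general idea of evaluating solutions on an adaptively sized subset can be illustrated with a toy sketch. The Python below is not the authors' GRACE algorithm: the (1+1)-style random search, the rule that doubles the batch when two disjoint subsets disagree on the current solution's loss, the tolerance `tol`, and every function name are assumptions made purely for illustration. The variable `cost` counts per-sample model evaluations, so it can be compared with the `2 * iters * n` evaluations that full-dataset scoring of both the current and the candidate solution would need.

```python
import random


def make_toy_data(n=1000, slope=3.0, noise=0.1, seed=1):
    """Hypothetical dataset: y = slope * x + Gaussian noise, x uniform in [-1, 1]."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        data.append((x, slope * x + rng.gauss(0.0, noise)))
    return data


def subset_loss(theta, data, indices):
    """Mean squared error of the model y = theta * x on the selected samples."""
    return sum((theta * data[i][0] - data[i][1]) ** 2 for i in indices) / len(indices)


def optimize_with_dynamic_batch(data, iters=200, init_batch=8, tol=0.3, seed=0):
    """Toy derivative-free search whose evaluation batch grows only when needed."""
    rng = random.Random(seed)
    n = len(data)
    max_batch = n // 2                 # two disjoint halves must fit in the data
    theta, step = 0.0, 1.0             # current solution and mutation step size
    batch = min(init_batch, max_batch)
    cost = 0                           # number of per-sample model evaluations
    for _ in range(iters):
        # Score the current solution on two disjoint random subsets.
        idx = rng.sample(range(n), 2 * batch)
        half_a, half_b = idx[:batch], idx[batch:]
        loss_a = subset_loss(theta, data, half_a)
        loss_b = subset_loss(theta, data, half_b)
        current = 0.5 * (loss_a + loss_b)   # equal halves, so this is the mean over idx

        # Score a random perturbation on the same samples (no gradient is used).
        candidate = theta + rng.gauss(0.0, step)
        cand_loss = subset_loss(candidate, data, idx)
        cost += 2 * len(idx)

        # Keep the better solution and adapt the step size.
        if cand_loss < current:
            theta, step = candidate, step * 1.5
        else:
            step *= 0.9

        # Assumed similarity rule: if the two subsets disagree too much,
        # the batch is considered too small and is doubled for the next round.
        if abs(loss_a - loss_b) > tol * max(loss_a, loss_b):
            batch = min(2 * batch, max_batch)
    return theta, batch, cost


if __name__ == "__main__":
    data = make_toy_data()
    theta, batch, cost = optimize_with_dynamic_batch(data)
    print("theta:", theta, "final batch:", batch, "sample evaluations:", cost)
    print("full-dataset evaluations would be:", 2 * 200 * len(data))
```

In this sketch the batch only ever grows; a fuller scheme might also shrink it, and the actual subset-similarity criterion used by GRACE is described in the paper rather than in this abstract.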

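Cite this article

Qian Hong, Shu Xiang, Sun Tianxiang, Qiu Xipeng, Zhou Aimin. Green derivative-free optimization method with dynamic batch evaluation. Journal of Software, 2024, 35(4): 1732-1750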

History
  • Received: 2023-05-15
  • Last revised: 2023-07-07
  • Published online: 2023-09-11
  • Published: 2024-04-06