Technique for Database System Parameter Optimization Using Multi-objective Deep Reinforcement Learning

Authors: 荣垂田, 田浩辉, 杜方

CLC number: TP311

Funding: National Natural Science Foundation of China (62062058)

    Abstract:

    The parameter configuration of a database system directly affects its performance and its utilization of system resources. Mainstream relational database management systems expose hundreds of tunable parameters. Performance tuning is traditionally done by hand by experienced database administrators (DBAs), but because the parameters are numerous, heterogeneous, and correlated in complex ways, manual tuning is inefficient, costly, and hard to reuse. Automated parameter tuning has therefore become an active research topic in the database field. Reinforcement learning, which interacts with the running environment and improves step by step from feedback, has been widely applied to the optimization of complex systems. Existing studies apply reinforcement learning or its variants to database parameter tuning, but all of them use single-objective optimization. In practice, database parameter tuning is a multi-objective task that is usually carried out under resource constraints, so existing methods have several limitations: (1) converting the multiple objectives into a single objective through a simple linear combination is to some extent blind, requires repeated trial iterations, and is costly; (2) they cannot cope with dynamically changing database system requirements, which limits their applicability; (3) the reinforcement learning methods they use are inherently single-objective, so when applied to multi-objective tasks they have difficulty aligning preferences (the current weight coefficients of the objectives) with the corresponding optimal policies, and may produce suboptimal solutions; (4) their optimization objectives usually cover only throughput and latency and ignore the utilization of resources such as memory. To address these problems, this study designs a reinforcement learning algorithm based on multi-objective deep deterministic policy gradient (MODDPG). The method is natively multi-objective: it does not need to convert the multi-objective tuning task into a single-objective one, and it adapts efficiently to dynamic changes in database system requirements. An improved reward mechanism quickly aligns preferences with optimal policies, which avoids suboptimal solutions, accelerates model training, and improves the efficiency of parameter tuning. To further verify the generality of the method, the multi-objective approach is extended to jointly optimize database performance and resource utilization. Experiments on mainstream relational database systems with the TPC-C and SYSBench benchmarks verify the effectiveness and practicality of the method. The results show clear advantages in model training efficiency and in tuning effectiveness, and the method is easy to extend to more objectives as optimization needs require.
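
    The role of the preference weights mentioned in the abstract can be illustrated with a small sketch of linear scalarization, the single-objective conversion that the abstract criticizes. This is not the MODDPG algorithm, whose details are not given on this page. The candidate configurations, their measurements, the min-max normalization, and all function names are hypothetical and chosen only for illustration; treating higher memory utilization as better is also an assumption.

```python
from typing import Dict, List, Sequence, Tuple

# (throughput in txn/s, latency in ms, memory utilization in [0, 1])
Metrics = Tuple[float, float, float]

# Hypothetical, made-up measurements for three candidate configurations.
CANDIDATES: Dict[str, Metrics] = {
    "config_a": (1200.0, 35.0, 0.55),
    "config_b": (1500.0, 60.0, 0.70),
    "config_c": (900.0, 20.0, 0.90),
}


def min_max(values: Sequence[float]) -> List[float]:
    """Rescale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def reward_vectors(candidates: Dict[str, Metrics]) -> Dict[str, Metrics]:
    """Build one reward vector per configuration, where higher is better."""
    names = list(candidates)
    throughput = min_max([candidates[n][0] for n in names])
    latency = min_max([candidates[n][1] for n in names])
    memory = min_max([candidates[n][2] for n in names])
    # Latency is inverted so that lower latency yields a higher reward.
    # Assumption: higher memory utilization counts as better.
    return {
        n: (throughput[i], 1.0 - latency[i], memory[i])
        for i, n in enumerate(names)
    }


def scalarize(reward: Sequence[float], preference: Sequence[float]) -> float:
    """Collapse a reward vector into one number with fixed preference weights."""
    return sum(w * r for w, r in zip(preference, reward))


def best_config(candidates: Dict[str, Metrics], preference: Sequence[float]) -> str:
    """Return the configuration with the highest scalarized reward."""
    rewards = reward_vectors(candidates)
    return max(rewards, key=lambda n: scalarize(rewards[n], preference))


if __name__ == "__main__":
    for preference in [(0.8, 0.1, 0.1), (0.1, 0.8, 0.1), (0.5, 0.5, 0.0)]:
        print(preference, best_config(CANDIDATES, preference))
```

    In this toy setting each of the three preference vectors selects a different configuration. A tuner trained against one fixed scalarized reward would have to be rerun whenever the weights change, which is the cost and adaptability problem that limitations (1) and (2) describe.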

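Cite this article

荣垂田, 田浩辉, 杜方. Technique for Database System Parameter Optimization Using Multi-objective Deep Reinforcement Learning. Journal of Software (软件学报),,():1-25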

History
  • Received: 2024-06-07
  • Last revised: 2024-09-22
  • Published online: 2025-07-17