Optimal Design of NUMA-aware Persistent Memory Storage Engine

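Author biographies:

TU Yao-Feng (屠要峰, 1972-), male, Ph.D. candidate, researcher, CCF senior member. His main research interests are big data, databases, machine learning, and cloud computing.
YAN Zong-Shuai (闫宗帅, 1987-), male, master's degree. His main research interests are databases and distributed systems.
CHEN He-Dui (陈河堆, 1972-), male, master's degree, senior engineer, CCF professional member. His main research interests are databases, distributed systems, and data mining and analysis.
KONG Lu (孔鲁, 1989-), male, master's degree. His main research interests are databases and distributed systems.
WANG Han-Yi (王涵毅, 1982-), male, master's degree, CCF professional member. His main research interests are databases and distributed systems.
CHEN Bing (陈兵, 1970-), male, professor, doctoral supervisor, CCF distinguished member. His main research interests are big data, cloud computing, and cognitive radio networks.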

Corresponding author:

TU Yao-Feng, E-mail: 13605151819@qq.com

Funding:

National Key Research and Development Program of China (2019YFB2102002); Key Research and Development Program of Jiangsu Province (BE2019012)


Abstract:

Persistent memory (PM) is non-volatile, byte-addressable, low-latency, and high-capacity. It blurs the traditional boundary between main memory and external storage and has a disruptive impact on existing software architectures. However, current PM hardware still suffers from problems such as uneven wear and asymmetric read/write performance. In particular, I/O performance degrades severely when PM is accessed across NUMA (non-uniform memory access) nodes. This paper proposes a NUMA-aware optimization design for a PM storage engine and applies it to GoldenX, ZTE's new-generation database system, significantly reducing the overhead the database system incurs when accessing persistent memory across NUMA nodes.

The main contributions are as follows. First, a data space distribution strategy and a distributed access model across NUMA nodes are proposed for the DRAM+PM hybrid memory architecture, enabling efficient use of the PM data space. Second, to address the high overhead of cross-NUMA PM access, an I/O proxy routine access method is proposed that converts the cost of a cross-NUMA PM access into the cost of one remote DRAM memory copy plus a local PM access; in addition, a Cache Line Area (CLA) cache page mechanism is designed to alleviate I/O write amplification and improve the efficiency of local PM access. Third, the traditional tablespace concept is extended so that each tablespace has both its own table data storage and its own dedicated WAL (write-ahead logging) storage. For this distributed WAL storage architecture, a transaction processing mechanism based on global sequence numbers is proposed, which removes the single-point WAL performance bottleneck, and NUMA-aware optimization mechanisms and algorithms for transaction processing, checkpointing, and disaster recovery are implemented.

Experimental results show that the proposed method effectively improves the performance of the PM storage engine under the NUMA architecture, by 105%-317% across various YCSB test scenarios and by 90%-134% in TPC-C.

Keywords: database; storage engine; persistent memory; NUMA (non-uniform memory access); WAL (write-ahead logging)
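
The abstract names a transaction mechanism based on global sequence numbers over per-tablespace WAL storage but does not give its details. The following is only a minimal sketch of the general idea, not the GoldenX implementation: it assumes that every log record is stamped with a number drawn from one shared counter, and that recovery merges the per-tablespace logs by that number. All class, field, and function names here (`GlobalSequencer`, `TablespaceWAL`, `gsn`, `replay_order`) are hypothetical.

```python
import heapq
import itertools
import threading
from dataclasses import dataclass


@dataclass
class LogRecord:
    gsn: int          # global sequence number (hypothetical field name)
    tablespace: str   # which tablespace's WAL holds this record
    payload: str      # stand-in for the real redo information


class GlobalSequencer:
    """Hands out strictly increasing numbers shared by all tablespaces."""

    def __init__(self):
        self._counter = itertools.count(1)
        self._lock = threading.Lock()

    def next(self):
        with self._lock:
            return next(self._counter)


class TablespaceWAL:
    """Append-only log private to one tablespace.

    In this sketch each tablespace is assumed to live on one NUMA node,
    so appends never touch a log on a remote node.
    """

    def __init__(self, name, sequencer):
        self.name = name
        self._sequencer = sequencer
        self.records = []

    def append(self, payload):
        # Assumes one writer per log, so records stay sorted by gsn.
        record = LogRecord(self._sequencer.next(), self.name, payload)
        self.records.append(record)
        return record.gsn


def replay_order(wals):
    """Merge per-tablespace logs into one stream ordered by gsn."""
    streams = (wal.records for wal in wals)
    return list(heapq.merge(*streams, key=lambda r: r.gsn))


sequencer = GlobalSequencer()
wal_node0 = TablespaceWAL("ts_node0", sequencer)
wal_node1 = TablespaceWAL("ts_node1", sequencer)
wal_node0.append("insert k1")
wal_node1.append("insert k2")
wal_node0.append("update k1")
order = [(r.gsn, r.tablespace) for r in replay_order([wal_node0, wal_node1])]
```

In this sketch the only shared state is the counter, so log writes themselves are spread across tablespaces while a single total order can still be reconstructed at recovery time. A real engine would also need durable PM writes, commit records, and checkpoint handling, none of which are modeled here.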

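Cite this article

Tu YF, Chen HD, Wang HY, Yan ZS, Kong L, Chen B. Optimal design of NUMA-aware persistent memory storage engine. Journal of Software (软件学报), 2022, 33(3): 891-908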

History
  • Received: 2021-06-30
  • Last revised: 2021-07-31
  • Published online: 2021-10-21
  • Publication date: 2022-03-06