PLTree: High-performance Learning Index for Persistent Memory

doi:10.13328/j.cnki.jos.007198


Journal of Software, 2025, 36(5): 2321-2341


CSTR: 32375.14.jos.007198
Authors (names in Chinese): 张志国, 谢钟乐, 陈珂, 寿黎但
CLC number: TP311
Funding: Pioneer R&D Program of Zhejiang Province (2024C01021); Zhejiang Province Science and Technology Innovation Leading Talent Program (2023R5214)

PLTree: High-performance Learning Index for Persistent Memory

Authors:

ZHANG Zhi-Guo, XIE Zhong-Le, CHEN Ke, SHOU Li-Dan

Affiliation (all authors):

State Key Laboratory of Blockchain and Data Security (Zhejiang University), Hangzhou 310027, China; College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China


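Abstract (translated from Chinese):

Persistent memory (PM), as a supplement to and substitute for main memory, provides relatively low-cost data storage and guarantees data persistence. Traditional structural indexes designed for PM (such as the B+ tree) do not make full use of data distribution characteristics to improve index read and write performance on PM. Recent studies attempt to use the data distribution awareness of learning indexes to improve index read and write performance on PM and to achieve persistence. However, when facing real-world data, the data structure design of existing PM-based persistent learning indexes causes additional memory accesses, which hurts read and write performance. To address this degradation on real data, this paper proposes PLTree, a learning index with a DRAM/PM hybrid architecture. It improves read and write performance on PM and mitigates the impact of data distribution fluctuation on performance through the following methods: (1) it builds the index with a two-stage method that eliminates local search in internal nodes and reduces PM accesses; (2) it uses model search to optimize lookup performance on PM and accelerates lookups by storing metadata in DRAM; (3) it designs a log-structured hierarchical overflow buffer based on the characteristics of PM to optimize write performance. Experimental results show that, on different datasets and compared with existing persistent memory indexes (APEX, FPTree, uTree, NBTree, and DPTree), PLTree improves index construction performance by about 1.9–34 times on average; it improves single-threaded query and insertion performance by about 1.26–4.45 times and 2.63–6.83 times on average, respectively; and, in multi-threaded scenarios, it improves query and insertion performance by up to about 10.2 times and 23.7 times, respectively.

Keywords: learning index; persistent memory; persistent memory index; database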

Abstract:

Persistent memory (PM), a supplement to and potential replacement for main memory, offers lower-cost data storage while guaranteeing data persistence. However, traditional index structures designed for PM, such as B+ trees, do not exploit data distribution characteristics to improve read and write performance on PM. Recent work has used the distribution awareness of learning indexes to improve index read and write performance on PM while supporting persistence. Nonetheless, when confronted with real-world data, the data structure designs of existing persistent learning indexes incur additional PM accesses, which degrades read and write performance. To address this degradation, this study proposes PLTree, a learning index with a hybrid DRAM/PM architecture. PLTree improves read and write performance under real data distributions through the following approaches: (1) a two-stage index construction method that eliminates last-mile search in internal nodes and reduces PM accesses; (2) model-based search that speeds up lookups on PM, further accelerated by metadata kept in DRAM; and (3) a log-based hierarchical overflow buffer structure, tailored to PM characteristics, that optimizes write performance. Experimental results show that, across various datasets and compared with existing persistent memory indexes (APEX, FPTree, uTree, NBTree, and DPTree), PLTree improves index construction performance by 1.9× to 34× on average. In single-threaded scenarios, PLTree improves query and insertion performance by 1.26× to 4.45× and 2.63× to 6.83× on average, respectively. In multi-threaded scenarios, PLTree outperforms the baselines by up to 10.2× in query performance and up to 23.7× in insertion performance.

Keywords: learning index; persistent memory (PM); persistent memory index; database
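
The abstract refers to "model-based search" and "last-mile search" without showing what they look like. The sketch below is a generic illustration of those two ideas only, not PLTree's actual design: it assumes a single least-squares linear model over a sorted, non-empty in-memory list of numeric keys, and the class name `LinearLearnedIndex` and its methods are hypothetical. The model predicts a key's position, and a bounded binary search (the last-mile search) corrects the prediction within the model's largest observed error.

```python
from bisect import bisect_left


class LinearLearnedIndex:
    """Toy learning index over a sorted, non-empty list of numeric keys.

    Illustrative assumption only: one linear model for the whole key set.
    """

    def __init__(self, keys):
        self.keys = sorted(keys)
        n = len(self.keys)
        # Fit position ~= slope * key + intercept by ordinary least squares.
        mean_k = sum(self.keys) / n
        mean_p = (n - 1) / 2
        var = sum((k - mean_k) ** 2 for k in self.keys)
        cov = sum((k - mean_k) * (i - mean_p) for i, k in enumerate(self.keys))
        self.slope = cov / var if var else 0.0
        self.intercept = mean_p - self.slope * mean_k
        # The largest prediction error bounds the last-mile search window.
        self.max_err = max(
            abs(self._predict(k) - i) for i, k in enumerate(self.keys)
        )

    def _predict(self, key):
        """Model-based search step: predicted position, clamped to the array."""
        pos = round(self.slope * key + self.intercept)
        return min(max(pos, 0), len(self.keys) - 1)

    def lookup(self, key):
        """Return the position of key, or None if the key is absent."""
        guess = self._predict(key)
        lo = max(guess - self.max_err, 0)
        hi = min(guess + self.max_err + 1, len(self.keys))
        # Last-mile search: binary search restricted to the error window.
        i = bisect_left(self.keys, key, lo, hi)
        if i < hi and self.keys[i] == key:
            return i
        return None


index = LinearLearnedIndex([3, 8, 15, 16, 23, 42, 100, 101, 250])
position = index.lookup(23)   # position of key 23 in the sorted key list
missing = index.lookup(4)     # None, because 4 is not stored
```

In this toy model, the closer the keys are to a linear distribution, the smaller `max_err` becomes and the fewer elements the last-mile search has to touch. Skewed real-world key sets widen the window, which is the kind of extra memory access the abstract identifies as costly on PM.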

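Cite this article

ZHANG Zhi-Guo, XIE Zhong-Le, CHEN Ke, SHOU Li-Dan. PLTree: High-performance Learning Index for Persistent Memory. Journal of Software, 2025, 36(5): 2321-2341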


History

Received: 2023-12-11
Last revised: 2024-01-21
Published online: 2024-06-14
