学习型索引通过学习数据分布可以准确地预测数据存取的位置,可在保持高效稳定的查询下,显著降低索引的内存占用.现有的学习型索引主要针对只读查询进行优化,而对插入和更新支持不足.本文针对上述挑战,设计了一种基于Radix Tree的工作负载自适应学习型索引ALERT.ALERT使用Radix Tree来管理不定长的分段,段内采用具有最大误差界的线性插值模型进行预测.同时,ALERT使用一种高效的插入缓冲来降低数据插入更新的代价.本文针对点查询和范围查询提出两种自适应重组优化方法,通过对工作负载进行感知,动态地调整插入缓冲的组织结构.实验验证,ALERT与业界流行的学习型索引相比,构建时间平均降低了81%,内存占用平均降低了75%,在保持了优秀读性能的同时,使插入延迟平均降低了50%;此外,ALERT使用自适应重组优化能有效感知查询工作负载特征,与不使用自适应重组优化相比,查询延迟平均降低了15%.
Learned indexes are capable of predicting the accurate location of data in storage by learning the data distribution. These indexes can significantly reduce storage consumption while providing efficient query processing. Existing learned indexes are mostly optimized for read-only queries, but inadequate in supporting insertions and updates. In an attempt to address the challenges faced by learned index, this paper proposes a workload-adaptive learned index named ALERT. Generally, ALERT employs a Radix Tree to manage variable-length segments, where each segment contains a linear interpolation model with a maximum error-bound. Meanwhile, ALERT utilizes an insertion memory buffer to reduce the cost of updates. Following the database-cracking approach, the paper proposes adaptive index maintenance during the run-time processing of point queries and range queries. The maintenance technique is implemented by performing workload-aware dynamic re-organization on the insertion buffer. Experimental results confirm that, when compared to state-of-the-art learned index, ALERT achieves competitive results as it reduces the index the average construction time by 81%, the average memory utilization by 75%, the average latency of insert by 50%, while maintaining competitive read performances. The average query latency of ALERT is also reduced by 15%, owing to its effective workload-aware optimization.
陈井爽,陈珂,寿黎但,江大伟,陈刚. ALERT:基于Radix Tree的工作负载自适应学习型索引.软件学报,,():0复制