Middle Layer Based Scalable Learned Index Scheme

doi:10.13328/j.cnki.jos.005910


Volume 31, Issue 3, 2020, pp. 620-633. DOI: 10.13328/j.cnki.jos.005910


Author: GAO Yuan-Ning, YE Jin-Biao, YANG Nian-Zu, GAO Xiao-Feng, CHEN Gui-Hai
Affiliation: Shanghai Key Laboratory of Scalable Computing and Systems, Shanghai 200240, China; Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
Fund Project: National Key Research and Development Program of China (2018YFB1004700); National Natural Science Foundation of China (61872238, 61972254, 61832005); Shanghai Science and Technology Fund (17510740200); CCF-Huawei Database System Innovation Research Plan (CCF-Huawei DBIR2019002A)


Abstract:

In the era of big data and cloud computing, efficient data access is an important measure of the performance of a large-scale storage system. Designing a lightweight and efficient index structure that meets the system's demand for high throughput and low memory footprint is therefore one of the research hotspots in the database field. Recently, Kraska et al. proposed replacing traditional B-tree indexes with machine learning models and achieved remarkable results on real data sets. However, their model assumes that the workload is static and read-only, and so fails to handle the index update problem. This study proposes Dabble, a middle layer based scalable learned index model that mitigates the index update problem. Dabble first uses the K-means algorithm to divide the data set into K regions and trains K neural networks to learn the data distribution of each region. During the training phase, it integrates data access patterns into the neural networks, which improves the prediction accuracy of the model for hotspot data. For data insertion, it borrows the delayed update mechanism of the LSM tree, which greatly improves the data writing speed. In the index update phase, a middle layer based mechanism is proposed to decouple the models, thus reducing the cost of index updates. Dabble is evaluated on two datasets, a Lognormal distribution dataset and the real-world Weblogs dataset. The experimental results demonstrate the effectiveness and efficiency of the proposed model compared with state-of-the-art methods.
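As an illustration only, the flow outlined in the abstract (partition the keys, fit one model per region, buffer inserts and retrain only the affected region) might be sketched in Python as follows. This is not the authors' implementation: a least-squares line stands in for each neural network, the access-pattern weighting and the middle layer itself are not modeled, keys are assumed to be unique numbers, and every name (`kmeans_1d`, `Region`, `ClusteredLearnedIndex`, `buffer_limit`) is hypothetical.

```python
import bisect


def nearest(cents, key):
    """Index of the centroid closest to key (ties go to the lower index)."""
    return min(range(len(cents)), key=lambda c: abs(key - cents[c]))


def kmeans_1d(keys, k, iters=20):
    """Plain Lloyd's algorithm on sorted 1-D keys; returns k sorted centroids."""
    lo, hi = keys[0], keys[-1]
    cents = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in keys:
            groups[nearest(cents, x)].append(x)
        # An empty cluster keeps its previous centroid.
        cents = [sum(g) / len(g) if g else cents[i] for i, g in enumerate(groups)]
    return sorted(cents)


class Region:
    """One region: sorted keys, a linear position model, and an insert buffer."""

    def __init__(self, keys):
        self.keys = sorted(keys)
        self.buffer = []  # delayed inserts, kept sorted
        self.fit()

    def fit(self):
        """Fit position ~ a * key + b and record the worst prediction error."""
        n = len(self.keys)
        if n < 2 or self.keys[0] == self.keys[-1]:
            self.a, self.b, self.err = 0.0, 0.0, n  # fall back to a full scan
            return
        mx, my = sum(self.keys) / n, (n - 1) / 2
        sxx = sum((x - mx) ** 2 for x in self.keys)
        sxy = sum((x - mx) * (i - my) for i, x in enumerate(self.keys))
        self.a = sxy / sxx
        self.b = my - self.a * mx
        self.err = max(abs(self.predict(x) - i) for i, x in enumerate(self.keys))

    def predict(self, key):
        return int(round(self.a * key + self.b))

    def contains(self, key):
        # Search only the window [prediction - err, prediction + err].
        p, n = self.predict(key), len(self.keys)
        lo, hi = max(0, min(n, p - self.err)), max(0, min(n, p + self.err + 1))
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < n and self.keys[i] == key:
            return True
        j = bisect.bisect_left(self.buffer, key)
        return j < len(self.buffer) and self.buffer[j] == key

    def insert(self, key, buffer_limit):
        if self.contains(key):
            return
        bisect.insort(self.buffer, key)
        if len(self.buffer) >= buffer_limit:
            # Delayed update: merge the buffer and retrain this region only.
            self.keys = sorted(self.keys + self.buffer)
            self.buffer = []
            self.fit()


class ClusteredLearnedIndex:
    def __init__(self, keys, k=4, buffer_limit=8):
        keys = sorted(set(keys))
        self.cents = kmeans_1d(keys, k)
        self.buffer_limit = buffer_limit
        groups = [[] for _ in self.cents]
        for x in keys:
            groups[nearest(self.cents, x)].append(x)
        self.regions = [Region(g) for g in groups]

    def contains(self, key):
        return self.regions[nearest(self.cents, key)].contains(key)

    def insert(self, key):
        self.regions[nearest(self.cents, key)].insert(key, self.buffer_limit)
```

In this sketch the centroids are fixed after construction, so a key always maps to the same region and an insert never forces any other region's model to be refit.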

Key words: learned index; clustering; neural network; dynamic update

Citation:

GAO Yuan-Ning, YE Jin-Biao, YANG Nian-Zu, GAO Xiao-Feng, CHEN Gui-Hai. Middle Layer Based Scalable Learned Index Scheme. Journal of Software, 2020, 31(3): 620-633


History

Received: July 20, 2019
Revised: November 25, 2019
Online: January 10, 2020
Published: March 06, 2020
