Abstract: Log-structured merge-tree (LSM-tree)-based key-value stores are widely used because they offer strong read and write performance. Most existing LSM-trees store data in a multi-level structure. Although this structure serves moderately write-intensive applications well, it is poorly suited to highly write-intensive ones. Storing data in multiple levels causes write amplification: inserting new data triggers the reorganization of a large portion of the data already stored across the levels. This large (and sometimes frequent) reorganization is expensive and degrades write performance in highly write-intensive applications. In addition, the multi-level structure does not provide consistently good read performance for hot data, because it cannot merge overlapping key ranges promptly enough to optimize reads of that data. To address these two challenges, this study proposes LazyStore, a novel single-level LSM-tree built on a hybrid storage architecture. LazyStore addresses write amplification by storing data in a single logical level instead of multiple logical levels, which largely eliminates expensive multi-level data reorganization. To further improve write performance, LazyStore distributes the data in this logical level across multiple storage devices, such as DRAM, NVM, and SSD, according to the capacity and read/write performance of each device. Furthermore, LazyStore introduces real-time merge operations to improve the read performance of hot data ranges. Experiments show that LazyStore improves write performance by a factor of 3 and reduces write amplification by a factor of nearly 4 compared with existing multi-level LSM-trees. For hot range reads, LazyStore’s real-time merge optimization reduces range query latency by a factor of two.
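The two ideas named above, a single logical level and merging of hot key ranges, can be illustrated with a toy model. The following Python sketch is not LazyStore's implementation: the names `Run`, `SingleLevelStore`, `flush`, `range_query`, and `merge_range` are hypothetical, device placement (DRAM, NVM, SSD) is omitted, and write amplification is approximated as entries written to storage divided by entries written by the application, counting only the merged entries as rewrites.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """A sorted run; `seq` orders runs by age, larger means newer."""
    seq: int
    items: dict  # key -> value, never empty

    def overlaps(self, lo, hi):
        return min(self.items) <= hi and lo <= max(self.items)


class SingleLevelStore:
    """Toy single-level store: every flushed run lives in one logical level."""

    def __init__(self):
        self.runs = []
        self.next_seq = 0
        self.user_entries = 0    # entries written by the application
        self.device_entries = 0  # entries written to storage, including rewrites

    def flush(self, items):
        """Append a batch as a new run without rewriting existing runs."""
        self.runs.append(Run(self.next_seq, dict(items)))
        self.next_seq += 1
        self.user_entries += len(items)
        self.device_entries += len(items)

    def range_query(self, lo, hi):
        """Return (sorted pairs with keys in [lo, hi], number of runs consulted)."""
        touched = sorted((r for r in self.runs if r.overlaps(lo, hi)),
                         key=lambda r: r.seq)
        result = {}
        for run in touched:  # oldest first, so newer values overwrite older ones
            result.update({k: v for k, v in run.items.items() if lo <= k <= hi})
        return sorted(result.items()), len(touched)

    def merge_range(self, lo, hi):
        """Consolidate all entries with keys in [lo, hi] into one run."""
        touched = [r for r in self.runs if r.overlaps(lo, hi)]
        if len(touched) < 2:
            return
        merged, _ = self.range_query(lo, hi)
        kept = [r for r in self.runs if not r.overlaps(lo, hi)]
        for run in touched:
            # Parts outside [lo, hi] keep their original sequence number.
            # Assumption: this split is metadata-only and costs no device writes.
            left = {k: v for k, v in run.items.items() if k < lo}
            right = {k: v for k, v in run.items.items() if k > hi}
            kept += [Run(run.seq, part) for part in (left, right) if part]
        if merged:
            kept.append(Run(max(r.seq for r in touched), dict(merged)))
            self.device_entries += len(merged)  # only merged entries are rewritten
        self.runs = kept

    def write_amplification(self):
        return self.device_entries / self.user_entries


store = SingleLevelStore()
store.flush({1: "a", 5: "b", 9: "c"})
store.flush({4: "d", 5: "e"})
store.flush({5: "f", 6: "g", 20: "h"})

before, runs_before = store.range_query(4, 6)  # consults 3 overlapping runs
store.merge_range(4, 6)                        # treat [4, 6] as a hot range
after, runs_after = store.range_query(4, 6)    # consults 1 run, same result
```

In this toy example the query over the hot range returns the same pairs before and after the merge, but consults one run instead of three, at the cost of rewriting only the three entries inside the range.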