TANG Hai-Bo, ZHANG Huan, ZHANG Zhao, JIN Che-Qing, ZHOU Ao-Ying
2025, 36(3):1-25. DOI: 10.13328/j.cnki.jos.007276
Abstract: Cloud-native databases leverage cloud infrastructure to provide highly available and elastically scalable data management, and they have developed rapidly in recent years. Blockchain is a transparent, tamper-proof, and traceable database system, and sharding is the most direct and promising approach to scaling it. By taking advantage of the elastic scalability of cloud infrastructure, sharded blockchains can scale more flexibly. This study first summarizes three key technical challenges addressed by current blockchain sharding: secure node partitioning, efficient on-chain data sharding, and cross-shard transaction processing. It reviews the state of research on each of these three issues, introduces and compares the corresponding solutions, and discusses the new challenges these solutions face in cloud-native environments. Then, along these three dimensions, it comprehensively analyzes and compares all solutions in terms of their overall impact on the blockchain system. Finally, the paper analyzes development trends in blockchain sharding technology and points out several research directions that deserve further exploration.
LIU Xing-Yu, SONG Shao-Xu, HUANG Xiang-Dong, WANG Jian-Min
2025, 36(3):1-21. DOI: 10.13328/j.cnki.jos.007277
Abstract: Time-series data are widely used in fields such as industrial manufacturing, meteorology, electric power, and vehicles, which has spurred the development of time-series database management systems. As more database systems migrate to the cloud and end-cloud collaborative architectures become more common, the volume of data to be processed keeps growing. In scenarios such as end-cloud collaboration and massive time series, short synchronization cycles, frequent data flushing, and other factors generate a large number of short time series, which presents new challenges to database systems. Efficient data management and compression methods can significantly improve storage performance, enabling database systems to store massive time series. Apache TsFile is a columnar storage file format designed specifically for time-series scenarios, and it plays an important role in database management systems such as Apache IoTDB. This study elaborates on the group compression and merging methods used in Apache TsFile to handle large numbers of short time series, especially in applications with a vast number of time series such as the Industrial Internet of Things. The group compression method takes full account of the data characteristics of short time series. By grouping devices, it improves metadata utilization, reduces file index size, reduces the number of short time series, and significantly improves compression effectiveness. Validation on real-world datasets shows that the proposed grouping method significantly improves compression, reading, writing, file merging, and other aspects, enabling better management of TsFiles in short time series scenarios.
YAN Yu, DAI Zhi-Yu, LYU Ze-Kai, WANG Hong-Zhi
2025, 36(3):1-20. DOI: 10.13328/j.cnki.jos.007278
Abstract: In recent years, with the development of software and hardware, migrating databases to the cloud has become an emerging trend that can reduce database operation and maintenance costs for small and medium-sized enterprises and individual users. The development of cloud databases has in turn created massive market demand for database operation and maintenance. Researchers have proposed many database self-tuning technologies to support automatic optimization of database knobs. To improve tuning efficiency, existing technologies have shifted from focusing solely on the tuning problem itself to reusing historical experience to find the optimal parameter configuration for the current database instance. However, as cloud databases have developed, users have placed growing demands on privacy protection, hoping to avoid privacy leakage while retaining efficient data access. Existing methods do not protect the privacy of users’ historical tuning experience, which may expose users’ workload characteristics and cause economic losses. This study analyzes the characteristics of cloud database tuning tasks in detail, combines the server side and the user side, and proposes a cloud database knob tuning technology based on federated learning. First, to address data heterogeneity in federated learning, this study proposes an experience screening method based on meta-feature matching, which eliminates in advance historical experiences whose data distributions differ greatly, improving the efficiency of federated learning. Second, to protect user privacy, this study draws on the characteristics of cloud database services and proposes a federated Bayesian optimization algorithm with the node side as the training center. Through random Fourier features, it protects user privacy without distorting the tuning experience. Results on extensive public benchmarks show that the proposed method achieves tuning performance competitive with existing tuning methods. Moreover, because it reuses historical experience, it can greatly improve tuning efficiency.
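Editorial illustration (not from the paper): the abstract mentions random Fourier features without describing them. The sketch below shows only the generic technique, which approximates an RBF kernel by an inner product of randomized cosine features. The function names, the choice of an RBF kernel, and the sample knob vectors are assumptions made for illustration. One plausible reason such features suit a federated setting is that parties can exchange model weights in the feature space instead of raw observations; whether the paper does exactly this is not stated in the abstract.

```python
import math
import random


def make_rff_map(dim, num_features, length_scale, seed=0):
    """Return a function that maps a vector to random Fourier features.

    Hypothetical helper: approximates the RBF kernel
    k(x, y) = exp(-||x - y||^2 / (2 * length_scale^2)).
    """
    rng = random.Random(seed)
    # Frequencies follow the RBF kernel's spectral density: N(0, 1 / length_scale^2).
    freqs = [[rng.gauss(0.0, 1.0 / length_scale) for _ in range(dim)]
             for _ in range(num_features)]
    # Phases are uniform on [0, 2*pi).
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def features(x):
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(freqs, phases)]

    return features


def rbf_kernel(x, y, length_scale):
    """Exact RBF kernel value, used here only to check the approximation."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * length_scale ** 2))


def dot(u, v):
    """Plain inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


if __name__ == "__main__":
    # Two made-up, normalized knob configurations.
    x = [0.2, 0.5, 0.1]
    y = [0.4, 0.1, 0.3]
    features = make_rff_map(dim=3, num_features=4000, length_scale=1.0, seed=1)
    print("approximate kernel:", dot(features(x), features(y)))
    print("exact kernel:      ", rbf_kernel(x, y, 1.0))
```

The approximation error shrinks as the number of features grows, so the feature count trades accuracy against the size of what each party must compute and send.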
YIN Yu-Jie, SHI Hao-Yang, FAN Zi-Hao, ZHOU Hua-Hui, LIU Sheng-Chi, HU Hui-Qi, WEI Xing, CHEN He-Dui, TU Yao-Feng, CAI Peng, ZHOU Xuan
2025, 36(3):1-19. DOI: 10.13328/j.cnki.jos.007279
Abstract: Single-master multi-slave is the mainstream architecture of cloud-native databases. In such a cluster, slave nodes take over read-only requests from the master node, while write requests are handled by the master node alone. To further meet the demands of scaling large-scale transaction workloads, some cloud databases attempt to support multi-write transactions. One possible approach to multi-write scaling is to introduce a shared cache among computing nodes to support cross-node data access. In shared-cache database systems, the overhead of cross-node remote access is significantly higher than that of local access. Therefore, the design of the cache coherence protocol is a crucial factor affecting system performance and scalability. This study proposes two innovative improvements to the coherence protocol and implements PG-RAC, a shared-cache database based on PostgreSQL that supports multi-write transactions. On the one hand, PG-RAC proposes a new distributed chained routing strategy, which disperses routing information among computing nodes. Compared with a routing strategy that relies on single-node directory management, it reduces average transaction latency by approximately 20%. On the other hand, this study enhances the duplicate page invalidation mechanism by separating invalidation operations from the transaction path, reducing the latency of the transaction’s critical path. Building on this, PG-RAC takes advantage of the characteristics of multi-version concurrency control (MVCC) and further proposes delaying the invalidation point of duplicate pages, which effectively improves cache utilization. TPC-C experimental results show that, for a cluster with 4 compute nodes, the throughput is nearly 2 times that of PostgreSQL and 1.5 times that of the distributed database Citus.
XU Hai-Yang, LIU Hai-Long, CHEN Xian, WANG Lei, JIN Ke, HOU Shu-Feng, LI Zhan-Huai
2025, 36(3):1-14. DOI: 10.13328/j.cnki.jos.007280
Abstract: One of the most important features of multi-tenant databases in cloud environments is scalability. However, most elastic scaling techniques struggle to make effective scaling decisions for dynamically changing loads. If load changes can be predicted in advance, resource supply can be adjusted accurately. Given this, this study proposes a load-prediction-based elastic scaling method for multi-tenant databases, which consists of a combined load prediction model and an elastic scaling strategy. The load prediction model combines the advantages of convolutional neural networks, long short-term memory networks, and gated recurrent units, and it can accurately forecast the memory requirements of database clusters. Based on the prediction results, the elastic scaling strategy adjusts the number of virtual machines to keep resource supply within a reasonable range. Compared with existing methods, the combined load prediction model reduces prediction errors by 8.7% to 21.8% and improves goodness of fit by 4.6%. Furthermore, this study improves the Bayesian optimization algorithm used for hyperparameter tuning of the combined prediction model. The improved hyperparameter tuning model reduces errors by more than 20% and improves goodness of fit by 1.04%, which shows that it effectively addresses the poor performance of Bayesian optimization in search spaces that mix discrete and continuous variables. Compared with the most widely used scaling strategy in Kubernetes, the proposed elastic scaling method reduces response time by 8.12% and latency by 9.56%. It can largely avoid latency and wasted resources.
HONG Yin-Hao, ZHAO Hong-Yao, WANG Yi-Lin, SHI Xin-Yue, LU Wei, YANG Shang, DU Sheng
2025, 36(3):1-27. DOI: 10.13328/j.cnki.jos.007281
Abstract: Cloud-native databases, with advantages such as out-of-the-box functionality, elastic scalability, and pay-as-you-go billing, are a research hotspot in academia and industry. At present, cloud-native databases support only “single writer and multiple readers”; that is, read-write transactions are concentrated on a single read-write node, and read-only transactions are distributed to multiple read-only nodes. This limitation restricts the system’s ability to process read-write transactions, making it difficult to meet the demands of write-intensive businesses. To this end, this study proposes the D3C (deterministic concurrency control cloud-native database) architecture. By designing a cloud-native database transaction processing mechanism based on deterministic concurrency control, it breaks through the “single writer and multiple readers” limitation and supports concurrent execution of read-write transactions on multiple read-write nodes. D3C splits transactions into sub-transactions and executes them independently on each node according to a predefined global order, ensuring serializability of transaction execution across multiple read-write nodes. Additionally, this study introduces mechanisms such as multi-version-based asynchronous batch data persistence to ensure transaction processing performance, and it proposes a consistency point-based fault recovery mechanism to achieve high availability. Experimental results show that D3C achieves 5.1 times the performance of the “single writer and multiple readers” architecture in write-intensive scenarios while meeting the key requirements of cloud-native databases.
XIANG Qing-Feng, SHAO Ying-Xia, XU Quan-Qing, YANG Chuan-Hui
2025, 36(3):1-18. DOI: 10.13328/j.cnki.jos.007282
Abstract: Databases are important foundational components in computer services. However, performance anomalies may occur during their operation, affecting business service quality. How to diagnose performance anomalies in databases has become a hot issue in industry and academia. Recently, a series of automated database anomaly diagnosis methods have been proposed. They analyze the runtime status of the database and determine the overall type of database anomaly. However, with the continuous expansion of data scale, distributed databases are becoming an increasingly popular solution in industry. In a distributed database, which is composed of multiple nodes, existing anomaly diagnosis methods struggle to locate node anomalies, fail to identify compound anomalies across multiple nodes, and cannot perceive the complex performance influence relationships between nodes, so they lack effective diagnostic capabilities. To address these challenges, this study proposes a distributed database diagnosis method for compound anomalies, named DistDiagnosis. It models the anomalous state of distributed databases using a Compound Anomaly Graph, which not only represents anomalies at each node but also captures the correlations between nodes. DistDiagnosis introduces a node correlation-aware root cause anomaly ranking method, which locates root cause anomalies according to the influence of nodes on the database. In this study, anomaly test cases for various scenarios are constructed on OceanBase, a domestically developed distributed database. Experimental results show that DistDiagnosis outperforms other advanced baselines, achieving AC@1, AC@3, and AC@5 values of 0.97, 0.98, and 0.98, respectively. Compared with the second-best method, DistDiagnosis improves accuracy by up to 5.20%, 5.45%, and 4.46% in each diagnostic scenario.
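Editorial illustration (not from the paper): the abstract reports AC@1, AC@3, and AC@5 without defining them. The sketch below uses a definition common in root cause localization work, where each case scores the fraction of true root causes found in the top k ranked candidates, normalized by min(k, number of true causes), and the scores are averaged over cases. The paper may define the metric differently; the function name and the sample anomaly labels are hypothetical.

```python
def ac_at_k(cases, k):
    """Average top-k accuracy over diagnosis cases.

    Each case is a pair (ranked_candidates, true_root_causes), where
    ranked_candidates is a list ordered from most to least suspicious
    (assumed to contain no duplicates) and true_root_causes is a
    non-empty set.
    """
    if not cases:
        raise ValueError("at least one case is required")
    if k < 1:
        raise ValueError("k must be at least 1")
    total = 0.0
    for ranked, truth in cases:
        # Count the true root causes that appear among the top k candidates.
        hits = sum(1 for candidate in ranked[:k] if candidate in truth)
        # Normalize so that a perfect ranking scores 1.0 even when k < len(truth).
        total += hits / min(k, len(truth))
    return total / len(cases)


if __name__ == "__main__":
    # Made-up cases: the second one is a compound anomaly with two root causes.
    cases = [
        (["cpu", "io", "net"], {"io"}),
        (["net", "cpu", "io"], {"net", "io"}),
    ]
    for k in (1, 2, 3):
        print(f"AC@{k} = {ac_at_k(cases, k)}")
```

Under this definition a value such as 0.97 for AC@1 would mean that the top-ranked candidate is a true root cause in nearly all cases.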
MA Xu-Yang, ZHOU Xiao-Kai, ZHENG Hao-Yu, CUI Bin, XU Quan-Qing, YANG Chuan-Hui, YAN Xiao, JIANG Jia-Wei
2025, 36(3):1-23. DOI: 10.13328/j.cnki.jos.007283
Abstract: Secure computation over federated multi-party databases can perform federated querying or federated modeling on private data from multiple databases while preserving data privacy. Such a federation is typically a loosely organized group in which participating databases may drop out unexpectedly. However, existing multi-party secure computation systems usually employ privacy-preserving computation schemes such as secret sharing, which require participants to remain online, resulting in poor system availability. Moreover, these systems cannot predict the number of users or the request rate when providing services externally. If the system is deployed on a private cluster or on virtual machines rented from a cloud computing platform, it experiences increased latency during sudden bursts of requests and wastes resources when the request workload is low, leading to poor overall scalability. With the advancement of cloud computing technology, serverless computing has emerged as a new cloud-native deployment paradigm that offers excellent elastic resource scaling. This study designs a system architecture and an indirect communication scheme within the serverless computing framework to build a highly scalable and highly available multi-party database secure computation system. The system can tolerate database node disconnections and automatically scale system resources in response to changes in user request traffic. A system prototype based on Alibaba Cloud and the OceanBase database is implemented, and comprehensive experimental comparisons are conducted. The results show that the proposed system outperforms existing systems in terms of computational cost, system performance, and scalability for tasks such as low-frequency queries and horizontal modeling. It can save up to 78% in computational costs and improve system performance by over 1.6 times. The shortcomings of the proposed system for tasks such as complex queries and vertical modeling are also analyzed.