Abstract: Data replication is an important way to improve the availability of distributed databases. Placing database replicas in different regions speeds up local read and write operations, and increasing the number of replicas improves the linear scalability of read throughput. In view of these advantages, a number of multi-replica distributed database systems have emerged in recent years, including mainstream industrial systems such as Google Spanner, CockroachDB, TiDB, and OceanBase, as well as notable academic systems such as Calvin, Aria, and Berkeley Anna. However, alongside these benefits, multi-replica databases introduce a series of challenges, such as consistency maintenance, cross-node transactions, and transaction isolation. This study summarizes existing replication architectures, consistency maintenance strategies, cross-node transaction concurrency control, and related technologies. It also analyzes the similarities and differences among several representative multi-replica database systems in terms of distributed transaction processing. Finally, the study builds a cross-region distributed cluster environment on Alibaba Cloud and conducts multiple experiments to evaluate the distributed transaction processing performance of these representative systems.