Data Synchronization Tool for Distributed Heterogeneous Database
Author:
Affiliation:

Fund Project:

National Key Research and Development Program of China (2018YFB1004401); National Natural Science Foundation of China (61732014); Beijing Municipal Science and Technology Project (Z171100005117002)

    Abstract:

    In general, read-write separation can alleviate part of the mismatch between read and write workloads in current big data environments, but most existing read-write separation techniques are designed for homogeneous databases. Because their storage structures are inconsistent, heterogeneous distributed database systems composed of a row-store database and a column-store database face difficulties in data synchronization that homogeneous distributed database systems do not, such as format conversion and mismatched synchronization speeds. This study proposes using the MySQL binary log (Binlog) to perform the TD-Reduction of SQL. It designs and implements a Binlog parser, BinParser, and a Binlog restorer, BinReducer, which are based on the mixed Binlog format. Each type of event is parsed and restored according to its corresponding rules to generate executable SQL statements. Based on these techniques, this study implements Cynomys, a data synchronization tool for distributed databases. In the experimental environment, Cynomys shows sound performance. The method is also applicable to data synchronization between other heterogeneous databases that provide a logging mechanism similar to Binlog.
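
The idea of restoring executable SQL from Binlog events can be illustrated with a minimal sketch. It is not the BinParser or BinReducer implementation described in the paper. It assumes events have already been decoded into a hypothetical `BinlogEvent` structure, and the names `BinlogEvent`, `quote`, `where_clause`, and `restore_sql` are invented for illustration. In the mixed format, statement-based events already carry SQL text, while row-based events carry row images that have to be turned back into statements.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class BinlogEvent:
    """Hypothetical, already-decoded Binlog event (not the on-disk format)."""
    kind: str                      # "query", "write_rows", "update_rows", "delete_rows"
    table: str = ""
    sql: str = ""                  # SQL text, set only for statement-based events
    before: Optional[dict] = None  # row image before the change
    after: Optional[dict] = None   # row image after the change


def quote(value: Any) -> str:
    """Render a Python value as a SQL literal (simplified, illustration only)."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):
        return "1" if value else "0"
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def where_clause(row: dict) -> str:
    """Build a WHERE clause that matches every column of a row image."""
    parts = []
    for col, val in row.items():
        parts.append(f"{col} IS NULL" if val is None else f"{col} = {quote(val)}")
    return " AND ".join(parts)


def restore_sql(event: BinlogEvent) -> str:
    """Apply a per-event-type rule to produce one executable SQL statement."""
    if event.kind == "query":
        # Statement-based event: the SQL text is already present.
        return event.sql
    if event.kind == "write_rows":
        cols = ", ".join(event.after)
        vals = ", ".join(quote(v) for v in event.after.values())
        return f"INSERT INTO {event.table} ({cols}) VALUES ({vals})"
    if event.kind == "update_rows":
        sets = ", ".join(f"{c} = {quote(v)}" for c, v in event.after.items())
        return f"UPDATE {event.table} SET {sets} WHERE {where_clause(event.before)}"
    if event.kind == "delete_rows":
        return f"DELETE FROM {event.table} WHERE {where_clause(event.before)}"
    raise ValueError(f"unsupported event kind: {event.kind}")


if __name__ == "__main__":
    events = [
        BinlogEvent(kind="query", sql="CREATE TABLE t (id INT, name VARCHAR(20))"),
        BinlogEvent(kind="write_rows", table="t", after={"id": 1, "name": "O'Neil"}),
        BinlogEvent(kind="update_rows", table="t",
                    before={"id": 1, "name": "O'Neil"}, after={"id": 1, "name": "Ann"}),
        BinlogEvent(kind="delete_rows", table="t", before={"id": 1, "name": "Ann"}),
    ]
    for ev in events:
        print(restore_sql(ev))
```

A real tool would also need to decode the binary event layout, map column ordinals to names using table metadata, and convert data types for the target column-store database. The sketch leaves all of that out.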

Get Citation

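Xu Zijian, Ye Sheng, Zhang Xiao. Data synchronization tool for distributed heterogeneous database. Journal of Software (Ruan Jian Xue Bao), 2019, 30(3): 684-699.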

History
  • Received: July 20, 2018
  • Revised: September 20, 2018
  • Online: March 06, 2019