CHEN Zi-Hao
School of Data Science and Engineering, East China Normal University, Shanghai 200062, China; Shanghai Engineering Research Center of Big Data Management (East China Normal University), Shanghai 200062, China
XU Chen
School of Data Science and Engineering, East China Normal University, Shanghai 200062, China; Shanghai Engineering Research Center of Big Data Management (East China Normal University), Shanghai 200062, China; Guangxi Key Laboratory of Trusted Software (Guilin University of Electronic Technology), Guilin 541004, China
QIAN Wei-Ning
School of Data Science and Engineering, East China Normal University, Shanghai 200062, China; Shanghai Engineering Research Center of Big Data Management (East China Normal University), Shanghai 200062, China
ZHOU Ao-Ying
School of Data Science and Engineering, East China Normal University, Shanghai 200062, China; Shanghai Engineering Research Center of Big Data Management (East China Normal University), Shanghai 200062, China
As an essential part of big data governance applications, data analysis is time-consuming and hardware-intensive, so optimizing its execution efficiency is essential. In the past, data analysts could run analysis algorithms with traditional matrix computation tools. With the explosive growth of data volume, however, these tools can no longer meet the performance requirements of applications, and distributed matrix computation systems for big data analysis have emerged in response. This study reviews the progress of distributed matrix computation systems from both technical and system perspectives. First, drawing on the mature field of data management, it analyzes the challenges these systems face in four dimensions: programming interface, compilation optimization, execution engine, and data storage. Second, it discusses and summarizes the technologies in each of these four dimensions. Finally, it outlines future research and development directions for distributed matrix computation systems.
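CHEN Zi-Hao, XU Chen, QIAN Wei-Ning, ZHOU Ao-Ying. Research progress on distributed matrix computation systems for big data analysis. Journal of Software, 2023, 34(3): 1236-1258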