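[Keywords]
[Abstract]
Graphs are an important data structure for describing relationships between entities and are widely used in information science, physics, biology, environmental ecology, and other major scientific fields. As graph data keeps growing in scale, processing large graphs on distributed systems has become mainstream, giving rise to classic distributed large-graph processing systems such as Pregel, GraphX, PowerGraph, and Gemini. However, compared with current state-of-the-art single-machine graph processing systems, these classic distributed systems do not show sufficient or stable performance advantages when processing real-world graph data. This paper analyzes several representative distributed graph processing systems and summarizes the main challenges that affect their performance. Based on an in-depth study of these challenges, it proposes RGraph, an efficient RDMA-based distributed system for processing large graph data. RGraph aims to improve several aspects of graph processing performance by fully exploiting the advantages of RDMA. For graph partitioning, RGraph adopts chunk-based partitioning to avoid destroying the locality of the original graph data, thereby keeping vertex accesses efficient. For load balancing, RGraph proposes a task migration mechanism based on RDMA one-sided READ and a fine-grained task-stealing scheme among threads, which provide dynamic load balancing across compute nodes and among the threads within a compute node, respectively, so that all computing resources in the cluster are fully utilized. For communication, RGraph wraps IB verbs to implement a multithreaded RDMA communication model that matches graph-computing semantics. Compared with traditional MPI, RGraph's communication mechanism reduces inter-node communication latency by a factor of more than 2.1. Finally, RGraph is evaluated with five real-world large graph datasets and one synthetic dataset on a high-performance cluster with eight compute nodes. The experimental results show that RGraph has a clear performance advantage: it achieves a speedup of 10.1-16.8 times over PowerGraph, and still reaches a speedup of 2.89-5.12 times over the current state-of-the-art distributed graph processing system. RGraph also maintains a stable performance advantage on extremely skewed power-law graphs.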
[Keywords]
[Abstract]
A graph is an important data structure that describes the relationships between entities, and it is widely used in information science, physics, biology, environmental ecology, and other scientific fields. With the growing scale of graph data, processing large-scale graphs on distributed systems has become popular, and many specialized distributed systems, including Pregel, GraphX, PowerGraph, and Gemini, have been proposed. However, compared with the current state-of-the-art shared-memory graph processing systems, these specialized distributed graph processing systems do not deliver satisfactory or stable performance advantages in processing real-world graph datasets. This study analyzes several representative distributed graph processing systems and summarizes the major challenges that affect their performance. Based on these challenges, it proposes RGraph, an efficient distributed graph processing system based on RDMA. The key idea of RGraph is to improve performance by making full use of the advantages of RDMA. For graph partitioning, RGraph adopts chunk-based partitioning to avoid destroying the native locality of real-world graphs, thereby preserving locality in vertex accesses. For load balancing, RGraph proposes a task migration mechanism based on RDMA one-sided READ and a fine-grained task-stealing method among threads to ensure dynamic load balance across nodes and within each node, so that all computing resources can be fully utilized. For communication, RGraph encapsulates IB verbs and implements a concurrent RDMA communication stack that satisfies graph computing semantics. Compared with traditional MPI, RGraph's communication stack can reduce inter-server communication latency by up to 2.1 times. Finally, five real-world large-scale graph datasets and one synthetic dataset are used to evaluate RGraph on an HPC cluster with eight servers, and the experiments show that RGraph has clear performance advantages. Compared with PowerGraph, RGraph achieves a 10.1-16.8 times performance improvement, and compared with the existing state-of-the-art CPU-based distributed graph processing system, it still achieves a 2.89-5.12 times performance improvement. Meanwhile, RGraph maintains a stable performance advantage on extremely skewed power-law graphs.
[CLC number]
[Funding]
CCF-Huawei Database Innovation Research Plan (DBIR2019007B)