Abstract:Well recognized as a primitive data structure, graph is an abstraction of objects and their pairwise connections. There exists a wide spectrum of applications, including semantic web analysis, social network analysis, biological genetic analysis and information retrieval, which can be modeled as graphs. Therefore, it is of great importance to conduct data analysis over these applications. With the development of information technology such as mobile Internet and Internet of things, the scale of graph data is increasing continuously and rapidly. To provide fast analysis over large-scale graph data, Pregel was first proposed as a distributed graph processing framework by Google. Since then, based on Pregel framework, a variety of optimization techniques and systems have been proposed by academic and industrial communities. Through extensive investigation and analysis, this paper first establishes three optimization objectives for the state-of-the-arts solutions to build distributed graph processing systems. Subsequently, it reviews mainstream optimizing techniques for the state-of-the-arts solutions from the perspective of computation granularity, task scheduling, communication mode and load balance. Finally, the paper discusses some open research problems and possible future research directions in this field.