Abstract:With the development of Internet information technology, large-scale graphs have widely emerged in social networks, computer networks, and biological information networks. In view of the storage and performance limitations of traditional graph data management technology when dealing with large-scale graphs, distributed management technology has become a hotspot in industry and academia fields. The core decomposition is adopted to get core numbers of vertices in a graph and plays a key role in many applications, including community search, protein structure analysis, and network structure visualization. The existing distributed core decomposition algorithm applied a broadcast message delivery mechanism based on the vertex-centric mode, which may generate a large amount of redundant communication and computation overhead and lead to memory overflow when processing large-scale graphs. To address these issues, this study proposes novel distributed core decomposition algorithms based on global activation and peeling calculation frameworks, respectively. In addition, there are several strategies designed to improve algorithm performance. Based on the locality of the vertex core number, the study proposes a new message-pruning strategy and a new worker-centric computing mode, thereby improving the efficiency of our algorithms. To verify those strategies, this study deploys the proposed models and algorithms on the distributed cluster of the National Supercomputing Center in Changsha, and the effectiveness and efficiency of the proposed methods are evaluated through a large number of experiments on real and synthetic data sets. The total time performance of the algorithm is improved by 37% to 98%.