Graph data, such as citation networks, social networks, and transportation networks, are ubiquitous in real life. Graph neural networks have attracted wide attention for their strong expressive power and have achieved outstanding performance in a wide range of graph analysis applications. However, the excellent performance of graph neural networks relies on labeled data and complex network models, while labeled data are difficult to obtain and the models are computationally expensive. To address the scarcity of labeled data and the high computational complexity of the models, knowledge distillation has been introduced into graph neural networks. Knowledge distillation trains a constructed small model (student model) under the soft-label supervision of a better-performing large model (teacher model), in the hope of achieving better performance and accuracy. Therefore, how to apply knowledge distillation techniques to graph data has become a major research challenge, yet a survey of graph-based knowledge distillation research is still lacking. This study aims to provide a comprehensive survey of graph-based knowledge distillation and is the first to systematically review the existing work, filling the survey gap in this field. Specifically, it first introduces the background knowledge of graphs and knowledge distillation. It then comprehensively reviews three categories of graph-based knowledge distillation methods, namely graph-based knowledge distillation for deep neural networks, graph-based knowledge distillation for graph neural networks, and self-distillation based on graph knowledge, and further divides each category into methods based on the output layer, the intermediate layer, and constructed graph knowledge. Next, it analyzes and compares the design ideas of the various graph-based knowledge distillation algorithms and summarizes their advantages and disadvantages in light of experimental results. In addition, it surveys applications of graph-based knowledge distillation in computer vision, natural language processing, recommendation systems, and other fields. Finally, it summarizes the development of graph-based knowledge distillation and discusses future directions. The collected literature on graph-based knowledge distillation is also publicly available on GitHub; please refer to https://github.com/liujing1023/Graph-based-Knowledge-Distillation.
Graph data, such as citation networks, social networks, and transportation networks, are ubiquitous in the real world. Graph neural networks (GNNs) have attracted extensive attention for their strong expressive power and excellent performance in a wide variety of graph analysis applications. However, this excellent performance relies on labeled data, which are difficult to obtain, and on complex network models with high computational costs. Knowledge distillation (KD) has therefore been introduced into GNNs to address the scarcity of labeled data and the high complexity of the models. KD trains a constructed small model (student model) under the soft-label supervision of a better-performing large model (teacher model), so that the student achieves better performance and accuracy. How to apply KD techniques to graph data has thus become a major research challenge, yet a survey of graph-based KD research is still lacking. Aiming to provide a comprehensive overview of graph-based KD, this study is the first to systematically review the existing work and fill the survey gap in this field. Specifically, the study first introduces the background knowledge of graphs and KD. Then, three types of graph-based KD methods are comprehensively reviewed: graph-based KD for deep neural networks (DNNs), graph-based KD for GNNs, and self-distillation based on graph knowledge. Each type is further divided into methods based on the output layer, the intermediate layer, and constructed graph knowledge. Subsequently, the design ideas of the various graph-based KD algorithms are analyzed and compared, and their advantages and disadvantages are summarized in light of experimental results. In addition, applications of graph-based KD in computer vision, natural language processing, recommendation systems, and other fields are surveyed.
Finally, the development of graph-based KD is summarized, and future directions are discussed. The collected references on graph-based KD are also publicly available on GitHub; please refer to https://github.com/liujing1023/Graph-based-Knowledge-Distillation.
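The soft-label supervision described in the abstract is commonly implemented as a weighted sum of a hard-label cross-entropy term and a temperature-softened divergence term, following Hinton et al.'s original KD formulation. The following is a minimal sketch in plain Python, not the specific loss of any surveyed method; the temperature `T`, weight `alpha`, and all function names are illustrative choices:

```python
import math

def softmax(logits, T=1.0):
    # Temperature-scaled softmax: a higher T yields a softer distribution.
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    # KL(p || q) between two discrete probability distributions.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def distillation_loss(student_logits, teacher_logits, true_label, T=2.0, alpha=0.5):
    # Hard-label term: cross-entropy of the student with the ground-truth class.
    hard = -math.log(softmax(student_logits)[true_label])
    # Soft-label term: match the teacher's temperature-softened output distribution.
    soft = kl_divergence(softmax(teacher_logits, T), softmax(student_logits, T))
    # The T^2 factor rescales the soft term's gradients (Hinton et al.'s convention).
    return alpha * hard + (1 - alpha) * T * T * soft
```

When the student already matches the teacher, the soft term vanishes, and the remaining loss is just the weighted cross-entropy with the true label; graph-based KD methods differ mainly in what plays the role of the "teacher output" (node logits, intermediate embeddings, or constructed graph structures).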