Abstract: With the continued development of computer vision, semantic segmentation and shape completion of 3D scenes have attracted growing attention from academia and industry. Semantic scene completion is an emerging research topic in this area that aims to predict the spatial layout and the semantic labels of a 3D scene simultaneously. This study classifies and summarizes the methods based on RGB-D images that have been proposed for this task in recent years. The methods are first divided into traditional methods and deep learning-based methods, according to whether deep learning is used. The deep learning-based methods are further divided by input data type into methods based on a single depth image and methods based on RGB-D images. Building on this classification and overview, the study collates the datasets used for the semantic scene completion task and analyzes the experimental results. Finally, it summarizes the challenges and development prospects of the field.