Abstract: As a critical task in computer vision and animation, facial reconstruction provides 3D model structures and rich semantic information for multi-modal facial applications. However, monocular 2D facial images lack depth information, and the predicted facial model parameters are unreliable, which leads to poor reconstruction results. This study proposes using facial action units (AUs) and facial keypoints, which are highly correlated with the model parameters, as a bridge to guide the regression of model-related parameters and thus address the ill-posed problem of monocular facial reconstruction. Based on existing facial reconstruction datasets, this study provides a complete semi-automatic labeling scheme for facial AUs and constructs the 300W-LP-AU dataset. Furthermore, an AU-aware 3D facial reconstruction algorithm is proposed that realizes end-to-end multi-task learning and reduces the overall training difficulty. Experimental results show that the proposed method improves facial reconstruction performance and yields high-fidelity reconstructed facial models.