Abstract: Given an AMR (Abstract Meaning Representation) graph, AMR-to-text generation aims to generate text with the same meaning. Related studies show that the performance of AMR-to-text generation severely suffers from the limited size of manually annotated datasets. To alleviate the dependence on manual annotation, this study proposes a novel multi-task pre-training approach for AMR-to-text generation. In particular, based on a large-scale automatically constructed AMR dataset, three relevant pre-training tasks are defined: an AMR denoising auto-encoder, a sentence denoising auto-encoder, and AMR-to-text generation itself. In addition, to fine-tune the pre-trained models, the vanilla fine-tuning method is extended to multi-task learning fine-tuning, which enables the final model to maintain performance on both AMR-to-text generation and the pre-training tasks. With an automatic dataset of 0.39M sentences, detailed experimentation on two AMR benchmarks shows that the proposed pre-training approach significantly improves the performance of AMR-to-text generation, with gains of 12.27 BLEU on AMR 2.0 and 7.57 BLEU on AMR 3.0. This greatly advances the state of the art, reaching 40.30 BLEU on AMR 2.0 and 38.97 BLEU on AMR 3.0. To the best of our knowledge, this is the best result achieved so far on AMR 2.0, and the first reported AMR-to-text generation result on AMR 3.0.