[Keywords]
[Abstract]
Coverage models can alleviate the over-translation and under-translation problems in neural machine translation. Existing methods usually store coverage information in a single form, such as a coverage vector or a coverage score, without considering the relationship among different kinds of coverage information, and therefore make insufficient use of it. To address this problem, this study proposes a multi-coverage fusion model based on the consistency of the translation history and the complementarity between models. The concept of a word-level coverage score is defined first; the information stored in the coverage vector and the coverage score is then used jointly to guide the attention mechanism, reducing the effect of information-storage loss on the computation of attention weights. Two multi-coverage fusion methods are proposed, differing in how the two kinds of coverage information are fused. Experiments on a Chinese-to-English translation task with a sequence-to-sequence model show that the proposed methods significantly improve translation performance and the alignment quality between source and target words. Compared with a model that uses only a coverage vector, the numbers of over-translation and under-translation cases are further reduced.
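For context, a minimal sketch of the coverage-vector attention that the abstract builds on, assuming the standard formulation of Tu et al. (2016); the symbols below (decoder state $s_{t-1}$, source annotation $h_j$, attention weight $\alpha_{t,j}$, coverage $C_{t,j}$, and the accumulation-style update) are illustrative conventions from that line of work, not this paper's definitions. The word-level coverage score and the two fusion methods proposed here are specified in the paper body.

% Coverage feeds back into the attention energy, discouraging
% repeated attention to source words that are already translated:
\begin{align}
e_{t,j} &= v_a^{\top}\tanh\!\big(W_a s_{t-1} + U_a h_j + V_a C_{t-1,j}\big), &
\alpha_{t,j} &= \frac{\exp(e_{t,j})}{\sum_{k}\exp(e_{t,k})},\\
% One common accumulation-style coverage update; the accumulated
% attention mass per source word is one natural basis for a
% word-level coverage score:
C_{t,j} &= C_{t-1,j} + \alpha_{t,j}.
\end{align}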
[CLC number]
[Foundation items]
National Key Research and Development Program of China (2020AAA0108004); National Natural Science Foundation of China (61672127, U1936109)