Low-resource Neural Machine Translation with Multi-strategy Prototype Generation
About the authors:

YU Zhi-Qiang (1983-), male, PhD; his main research interests include natural language processing and neural machine translation. YU Zheng-Tao (1970-), male, PhD, professor, doctoral supervisor, CCF senior member; his main research interests include natural language processing, neural machine translation, and information retrieval. HUANG Yu-Xin (1983-), male, PhD, CCF professional member; his main research interests include natural language processing, neural machine translation, and text summarization. GUO Jun-Jun (1987-), male, PhD, associate professor, CCF professional member; his main research interests include natural language processing, neural machine translation, and multimodal machine translation. XIAN Yan-Tuan (1982-), male, associate professor, CCF professional member; his main research interests include natural language processing and neural machine translation.

Corresponding author:

余正涛,ztyu@hotmail.com

CLC number:

TP18

Funding:

National Key Research and Development Program of China (2019QY1800); National Natural Science Foundation of China (61732005, 61672271, 61761026, 61762056, 61866020); Major Science and Technology Project of Yunnan Province (202002AD080001); High-tech Industry Special Project of Yunnan Province (201606); Natural Science Foundation of Yunnan Province (2018FB104)


Authors and affiliations:
  • YU Zhi-Qiang: Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China; School of Mathematics and Computer Science, Yunnan Minzu University, Kunming 650500, China; Key Laboratory of Artificial Intelligence in Yunnan Province (Kunming University of Science and Technology), Kunming 650500, China
  • YU Zheng-Tao: Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China; Key Laboratory of Artificial Intelligence in Yunnan Province (Kunming University of Science and Technology), Kunming 650500, China
  • HUANG Yu-Xin: Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China; Key Laboratory of Artificial Intelligence in Yunnan Province (Kunming University of Science and Technology), Kunming 650500, China
  • GUO Jun-Jun: Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China; Key Laboratory of Artificial Intelligence in Yunnan Province (Kunming University of Science and Technology), Kunming 650500, China
  • XIAN Yan-Tuan: Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China; Key Laboratory of Artificial Intelligence in Yunnan Province (Kunming University of Science and Technology), Kunming 650500, China
Abstract:

In rich-resource scenarios, using similar translations as target-side prototype sequences can effectively improve the performance of neural machine translation. In low-resource scenarios, however, parallel corpora are scarce, so prototype sequences often cannot be matched at all or are of poor quality. To address this problem, this study proposes a low-resource neural machine translation approach with multi-strategy prototype generation, which comprises two phases. (1) Keyword matching and distributed representation matching are combined to retrieve prototype sequences, and a pseudo-prototype generation method produces usable prototype sequences when retrieval fails. (2) The conventional encoder-decoder framework is improved to exploit prototype sequences effectively: the encoder side uses an additional encoder to receive the prototype sequence, while the decoder side employs a gating mechanism to control information flow and adopts an improved loss function to reduce the negative impact of low-quality prototype sequences on the model. Experimental results on multiple datasets show that, compared with baseline models, the proposed method effectively improves translation performance in low-resource scenarios.
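
As a concrete illustration of phase (1), the sketch below chains the two retrieval strategies and falls back to pseudo-prototype generation when both fail. It is a minimal sketch, not the authors' implementation: the function names (keyword_overlap, retrieve_prototype), the thresholds, the sentence embedder embed, and the token-dropping fallback are all assumptions standing in for the paper's actual matching and generation procedures.

```python
# Minimal sketch of phase (1): two retrieval strategies plus a
# pseudo-prototype fallback. `bitext` is a list of (source, target) pairs
# and `embed` maps a sentence to a vector; both are assumed to be given.
import random
from math import sqrt

def keyword_overlap(src: str, cand: str) -> float:
    """Fraction of source tokens shared with a candidate (keyword matching)."""
    s, c = set(src.split()), set(cand.split())
    return len(s & c) / max(len(s), 1)

def cosine(u, v) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    nu, nv = sqrt(sum(a * a for a in u)), sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def retrieve_prototype(src, bitext, embed, kw_thresh=0.5, emb_thresh=0.7):
    # Strategy 1: keyword matching over the source side of the corpus.
    kw_score, kw_best = max(
        ((keyword_overlap(src, s), t) for s, t in bitext), key=lambda x: x[0])
    if kw_score >= kw_thresh:
        return kw_best

    # Strategy 2: distributed-representation (embedding) matching.
    src_vec = embed(src)
    emb_score, emb_best = max(
        ((cosine(src_vec, embed(s)), t) for s, t in bitext), key=lambda x: x[0])
    if emb_score >= emb_thresh:
        return emb_best

    # Fallback: pseudo-prototype generation, sketched here as random token
    # dropping on the nearest target sentence, so the model still receives
    # a (noisy) target-side prototype.
    tokens = emb_best.split()
    return " ".join([t for t in tokens if random.random() > 0.1] or tokens)
```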

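Phase (2) couples a gating mechanism with a quality-aware loss. The following PyTorch sketch shows one plausible reading, assuming the gate is a sigmoid over the concatenated source and prototype encoder states (of equal length) and the improved loss is token-level cross-entropy scaled by a per-sentence prototype quality score; the paper's exact formulation may differ.

```python
# Hedged sketch of the decoder-side gating and quality-weighted loss hinted
# at in the abstract; names and shapes here are assumptions, not the paper's.
import torch
import torch.nn as nn

class GatedPrototypeFusion(nn.Module):
    """Fuse source-encoder and prototype-encoder states with a learned gate."""
    def __init__(self, d_model: int):
        super().__init__()
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, h_src: torch.Tensor, h_proto: torch.Tensor) -> torch.Tensor:
        # Both inputs: [batch, seq_len, d_model]. g in (0, 1) decides, per
        # position and feature, how much prototype information flows through
        # relative to the source representation.
        g = torch.sigmoid(self.gate(torch.cat([h_src, h_proto], dim=-1)))
        return g * h_src + (1.0 - g) * h_proto

def quality_weighted_nll(logits: torch.Tensor,         # [batch, seq_len, vocab]
                         targets: torch.Tensor,        # [batch, seq_len]
                         proto_quality: torch.Tensor,  # [batch], in [0, 1]
                         pad_id: int = 0) -> torch.Tensor:
    """Token-level cross-entropy scaled by a per-sentence prototype quality
    score, so low-quality (e.g. pseudo) prototypes pull on the model less."""
    loss = nn.functional.cross_entropy(
        logits.transpose(1, 2), targets, ignore_index=pad_id, reduction="none")
    return (proto_quality.unsqueeze(1) * loss).mean()
```

Under this reading, a pseudo-prototype produced by the fallback would arrive with a low proto_quality score: it still shapes the fused representation through the gate, but it contributes only weakly to the training gradient.
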
Cite this article:

YU Zhi-Qiang, YU Zheng-Tao, HUANG Yu-Xin, GUO Jun-Jun, XIAN Yan-Tuan. Low-resource neural machine translation with multi-strategy prototype generation. Journal of Software, 2023, 34(11): 5113-5125 (in Chinese).
Article metrics:
  • Clicks: 505
  • Downloads: 2150
  • HTML views: 1232
  • Citations: 0
History:
  • Received: 2021-04-14
  • Revised: 2021-06-28
  • Published online: 2023-04-27
  • Published: 2023-11-06