This paper investigates domain adaptation for feature weights and proposes a co-training framework to handle it: the translation results produced by another, heterogeneous decoder serve as pseudo references and are added to the development set for minimum error rate training (MERT), biasing the feature weights toward the domain of the test set. Furthermore, minimum Bayes-risk (MBR) combination is used for pseudo-reference selection, picking suitable translations from the candidate pools of both decoders to smooth the training process. Experimental results show that this co-training method with MBR combination yields significant improvements in the target domain.
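As a minimal sketch, the MBR-based pseudo-reference selection described above can be pictured as follows, assuming smoothed sentence-level BLEU as the gain function and normalized model posteriors over the pooled n-best lists of both decoders; decode_a, decode_b, and all function names here are illustrative placeholders, not the paper's actual implementation.

import math
from collections import Counter

def ngram_counts(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def sentence_bleu(hyp, ref, max_n=4):
    # Smoothed sentence-level BLEU (add-1 smoothing), used as the MBR gain.
    hyp_t, ref_t = hyp.split(), ref.split()
    if not hyp_t:
        return 0.0
    log_prec = 0.0
    for n in range(1, max_n + 1):
        h = ngram_counts(hyp_t, n)
        r = ngram_counts(ref_t, n)
        match = sum(min(c, r[g]) for g, c in h.items())
        total = max(sum(h.values()), 1)
        log_prec += math.log((match + 1) / (total + 1))
    bp = min(1.0, math.exp(1 - len(ref_t) / len(hyp_t)))  # brevity penalty
    return bp * math.exp(log_prec / max_n)

def mbr_select(candidates):
    # candidates: list of (translation, posterior) pairs pooled from both
    # decoders. Returns the hypothesis with minimum expected Bayes risk,
    # i.e. maximum expected BLEU gain against the pooled candidate set.
    z = sum(p for _, p in candidates) or 1.0
    best, best_gain = None, -1.0
    for hyp, _ in candidates:
        gain = sum((p / z) * sentence_bleu(hyp, ref) for ref, p in candidates)
        if gain > best_gain:
            best, best_gain = hyp, gain
    return best

def make_pseudo_dev(test_sources, decode_a, decode_b):
    # Co-training step (sketch): for each test sentence, pool the n-best
    # candidates from two heterogeneous decoders, pick an MBR consensus
    # translation as the pseudo reference, and emit a (source, pseudo_ref)
    # pair to append to the development set for MERT.
    dev = []
    for src in test_sources:
        pool = decode_a(src) + decode_b(src)  # [(translation, posterior), ...]
        dev.append((src, mbr_select(pool)))
    return dev

With a loss of 1 - BLEU, minimizing expected risk is equivalent to maximizing expected BLEU gain, which is why the sketch maximizes the posterior-weighted BLEU of each hypothesis against all pooled candidates.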