Clause restructuring for statistical machine translation
M. Collins, P. Koehn, and I. Kučerová. Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, pages 531--540. Stroudsburg, PA, USA, Association for Computational Linguistics, (2005)
DOI: 10.3115/1219840.1219906
Abstract
We describe a method for incorporating syntactic information in statistical machine translation systems. The first step of the method is to parse the source language string that is being translated. The second step is to apply a series of transformations to the parse tree, effectively reordering the surface string on the source language side of the translation system. The goal of this step is to recover an underlying word order that is closer to the target language word-order than the original string. The reordering approach is applied as a pre-processing step in both the training and decoding phases of a phrase-based statistical MT system. We describe experiments on translation from German to English, showing an improvement from 25.2% Bleu score for a baseline system to 26.8% Bleu score for the system with reordering, a statistically significant improvement.
%0 Conference Paper
%1 CollinsEtAl2005
%A Collins, Michael
%A Koehn, Philipp
%A Kučerová, Ivona
%B Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
%C Stroudsburg, PA, USA
%D 2005
%I Association for Computational Linguistics
%K compling machinetranslation syntax
%P 531--540
%R 10.3115/1219840.1219906
%T Clause restructuring for statistical machine translation
%U http://dx.doi.org/10.3115/1219840.1219906
%X We describe a method for incorporating syntactic information in statistical machine translation systems. The first step of the method is to parse the source language string that is being translated. The second step is to apply a series of transformations to the parse tree, effectively reordering the surface string on the source language side of the translation system. The goal of this step is to recover an underlying word order that is closer to the target language word-order than the original string. The reordering approach is applied as a pre-processing step in both the training and decoding phases of a phrase-based statistical MT system. We describe experiments on translation from German to English, showing an improvement from 25.2% Bleu score for a baseline system to 26.8% Bleu score for the system with reordering, a statistically significant improvement.
@inproceedings{CollinsEtAl2005,
abstract = {We describe a method for incorporating syntactic information in statistical machine translation systems. The first step of the method is to parse the source language string that is being translated. The second step is to apply a series of transformations to the parse tree, effectively reordering the surface string on the source language side of the translation system. The goal of this step is to recover an underlying word order that is closer to the target language word-order than the original string. The reordering approach is applied as a pre-processing step in both the training and decoding phases of a phrase-based statistical MT system. We describe experiments on translation from German to English, showing an improvement from 25.2\% Bleu score for a baseline system to 26.8\% Bleu score for the system with reordering, a statistically significant improvement.},
acmid = {1219906},
added-at = {2011-02-28T10:27:29.000+0100},
address = {Stroudsburg, PA, USA},
author = {Collins, Michael and Koehn, Philipp and Ku\v{c}erov\'{a}, Ivona},
biburl = {https://www.bibsonomy.org/bibtex/22a8a2c1028aaca89ea21562477b90e7d/tmalsburg},
booktitle = {Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics},
description = {Clause restructuring for statistical machine translation},
doi = {10.3115/1219840.1219906},
interhash = {514829c41838054eff4e529badf322ce},
intrahash = {2a8a2c1028aaca89ea21562477b90e7d},
keywords = {compling machinetranslation syntax},
location = {Ann Arbor, Michigan},
numpages = {10},
pages = {531--540},
publisher = {Association for Computational Linguistics},
series = {ACL '05},
timestamp = {2011-02-28T10:27:29.000+0100},
title = {Clause restructuring for statistical machine translation},
url = {http://dx.doi.org/10.3115/1219840.1219906},
year = 2005
}