Neural Machine Translation by Jointly Learning to Align and Translate
D. Bahdanau, K. Cho, and Y. Bengio. (2014). arXiv:1409.0473. Accepted at ICLR 2015 as an oral presentation.
Abstract
Neural machine translation is a recently proposed approach to machine
translation. Unlike traditional statistical machine translation, neural
machine translation aims at building a single neural network that can be
jointly tuned to maximize translation performance. The models recently
proposed for neural machine translation often belong to a family of
encoder-decoders and consist of an encoder that encodes a source sentence into
a fixed-length vector from which a decoder generates a translation. In this
paper, we conjecture that the use of a fixed-length vector is a bottleneck in
improving the performance of this basic encoder-decoder architecture, and
propose to extend it by allowing a model to automatically (soft-)search for
parts of a source sentence that are relevant to predicting a target word,
without having to form these parts as a hard segment explicitly. With this new
approach, we achieve a translation performance comparable to the existing
state-of-the-art phrase-based system on the task of English-to-French
translation. Furthermore, qualitative analysis reveals that the
(soft-)alignments found by the model agree well with our intuition.
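The (soft-)search described in the abstract is an additive attention (alignment) model. As a compact sketch, following the paper's notation: s_{i-1} is the previous decoder hidden state, h_j is the bidirectional encoder's annotation of source word j, T_x is the source sentence length, and v_a, W_a, U_a are learned parameters of the alignment model:

\[
e_{ij} = v_a^\top \tanh\!\left(W_a s_{i-1} + U_a h_j\right), \qquad
\alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{k=1}^{T_x} \exp(e_{ik})}, \qquad
c_i = \sum_{j=1}^{T_x} \alpha_{ij} h_j
\]

Each target word y_i is then predicted from the previous decoder state, the previously emitted word, and the context vector c_i, so the decoder draws on a different weighted combination of source annotations at every step rather than a single fixed-length summary vector.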
Description
[1409.0473] Neural Machine Translation by Jointly Learning to Align and Translate
%0 Generic
%1 bahdanau2014neural
%A Bahdanau, Dzmitry
%A Cho, Kyunghyun
%A Bengio, Yoshua
%D 2014
%K attention mlnlp neuralnet rnn
%T Neural Machine Translation by Jointly Learning to Align and Translate
%U http://arxiv.org/abs/1409.0473
@misc{bahdanau2014neural,
author = {Bahdanau, Dzmitry and Cho, Kyunghyun and Bengio, Yoshua},
biburl = {https://www.bibsonomy.org/bibtex/2713375898fd7d2477f6ab6dc3dd66c2c/albinzehe},
description = {[1409.0473] Neural Machine Translation by Jointly Learning to Align and Translate},
keywords = {attention mlnlp neuralnet rnn},
note = {arXiv:1409.0473; accepted at ICLR 2015 as an oral presentation},
title = {Neural Machine Translation by Jointly Learning to Align and Translate},
url = {http://arxiv.org/abs/1409.0473},
year = 2014
}