Abstract
In this paper, we propose a variety of Long Short-Term Memory (LSTM) based
models for sequence tagging. These models include LSTM networks, bidirectional
LSTM (BI-LSTM) networks, LSTM with a Conditional Random Field (CRF) layer
(LSTM-CRF) and bidirectional LSTM with a CRF layer (BI-LSTM-CRF). Our work is
the first to apply a BI-LSTM-CRF model to NLP benchmark sequence tagging data
sets. We show that the BI-LSTM-CRF model can efficiently use both past and
future input features thanks to its bidirectional LSTM component. It can also
use sentence-level tag information thanks to its CRF layer. The BI-LSTM-CRF
model produces state-of-the-art (or close to state-of-the-art) accuracy on
part-of-speech (POS) tagging, chunking, and named entity recognition (NER)
data sets. In addition, it is robust and depends less on word embeddings than
previously observed.
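
To illustrate how a CRF layer can bring sentence-level tag information into the prediction, the following is a minimal sketch of Viterbi decoding over per-token scores. It is an assumption for illustration, not the implementation used in this work: the function name `viterbi_decode`, the toy tag set (0 = O, 1 = B, 2 = I), and all score values are hypothetical, and the `emissions` matrix merely stands in for the outputs a bidirectional LSTM would produce.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring tag sequence.

    emissions[t][j]   : score of tag j at position t (stand-in for BI-LSTM output)
    transitions[i][j] : score of moving from tag i to tag j (CRF-style parameters)
    """
    n_tags = len(transitions)
    # best[j] = best score of any path that ends in tag j at the current position
    best = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_best, pointers = [], []
        for j in range(n_tags):
            # choose the previous tag that maximizes the path score into tag j
            prev = max(range(n_tags), key=lambda i: best[i] + transitions[i][j])
            pointers.append(prev)
            new_best.append(best[prev] + transitions[prev][j] + emissions[t][j])
        best = new_best
        backpointers.append(pointers)
    # trace back from the best final tag
    path = [max(range(n_tags), key=lambda j: best[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical toy scores: tags are 0 = O, 1 = B, 2 = I.
emissions = [[2.0, 0.5, 0.0],
             [0.0, 0.9, 1.0],
             [0.0, 0.0, 1.5]]
# The O -> I transition is heavily penalized, as it would be for chunk tags.
transitions = [[0.0, 0.0, -5.0],
               [0.0, 0.0, 0.5],
               [0.0, -1.0, 0.5]]

# Tagging each token independently ignores the transition scores.
greedy = [max(range(3), key=lambda j: row[j]) for row in emissions]
decoded = viterbi_decode(emissions, transitions)
print(greedy, decoded)
```

With these toy scores, independent per-token tagging picks I directly after O, while the joint decoding avoids that transition and starts the chunk with B instead.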