Inproceedings

Named entity recognition for Amharic using deep learning

2017 IST-Africa Week Conference (IST-Africa), pages 1-8. (May 2017)
DOI: 10.23919/ISTAFRICA.2017.8102402

Abstract

The paper describes a named entity recognition system for Amharic, an under-resourced language. The system uses a recurrent neural network, specifically a bidirectional long short-term memory (BiLSTM) model, to identify and classify tokens into six predefined classes: Person, Location, Organization, Time, Title, and Other (non-named entity tokens). Word vectors based on semantic information are built for all tokens using an unsupervised learning algorithm, word2vec. The word vectors are merged with a set of specifically developed language-independent features and fed together to the neural network model to predict the classes of the words. When evaluated by 10-fold cross-validation, the resulting Amharic named entity recogniser achieved an average precision of 77.2% and a lower average recall of 63.4%, for an F1-score of 69.7%.
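
As a rough check on how the three reported scores relate, the sketch below computes F1 as the harmonic mean of precision and recall. The function name `f1_score` and the variable names are illustrative and do not come from the paper; only the two input figures are taken from the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Average precision and recall reported in the abstract, as fractions.
avg_precision = 0.772
avg_recall = 0.634

# F1 computed from the two averaged figures (hypothetical check, not the paper's procedure).
f1_from_averages = f1_score(avg_precision, avg_recall)
print(f"F1 from averaged precision and recall: {f1_from_averages:.3f}")
```

Computed this way the value comes out near 0.696, slightly below the reported 69.7%. The abstract does not say how the F1-score was aggregated; the gap could come from rounding of the inputs or from averaging per-fold F1 scores rather than combining the averaged precision and recall.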
