Abstract

In this paper, we propose a novel method for Automatic Text Recognition (ATR) on early printed books. Our approach significantly reduces the Character Error Rates (CERs) for book-specific training when only a few lines of Ground Truth (GT) are available and considerably outperforms previous methods. An ensemble of models is trained simultaneously by optimising each one independently but also with respect to a fused output obtained by averaging the individual confidence matrices. Various experiments on five early printed books show that this approach already outperforms the current state of the art by up to 20% and by 10% on average. Replacing the averaging of the confidence matrices during prediction with confidence-based voting boosts our results by an additional 8%, leading to a total average improvement of about 17%.
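
The fused output described above can be illustrated with a minimal sketch. It assumes each model emits a confidence matrix with one row per time step and one column per character class, and that the fused matrix is read out with a greedy CTC-style decoding (take the best class per step, collapse repeats, drop blanks). The decoding scheme, the function names `fuse_confidences` and `greedy_decode`, and the toy alphabet are illustrative assumptions, not details taken from the paper. The confidence-based voting variant is not shown.

```python
from typing import List, Sequence

Matrix = List[List[float]]  # rows: time steps, columns: character classes


def fuse_confidences(confidences: Sequence[Matrix]) -> Matrix:
    """Average the confidence matrices of all ensemble members element-wise."""
    n_models = len(confidences)
    n_steps = len(confidences[0])
    n_classes = len(confidences[0][0])
    return [
        [sum(m[t][c] for m in confidences) / n_models for c in range(n_classes)]
        for t in range(n_steps)
    ]


def greedy_decode(confidence: Matrix, alphabet: Sequence[str], blank: int = 0) -> str:
    """Greedy CTC-style readout (an assumption of this sketch):
    best class per time step, collapse repeats, then drop blanks."""
    chars = []
    prev = blank
    for row in confidence:
        idx = max(range(len(row)), key=row.__getitem__)
        if idx != blank and idx != prev:
            chars.append(alphabet[idx])
        prev = idx
    return "".join(chars)


# Hypothetical toy data: two models, three time steps, classes (blank, "a", "b").
alphabet = ["", "a", "b"]
model_1 = [[0.1, 0.8, 0.1], [0.6, 0.2, 0.2], [0.2, 0.3, 0.5]]
model_2 = [[0.2, 0.6, 0.2], [0.5, 0.1, 0.4], [0.1, 0.2, 0.7]]

fused = fuse_confidences([model_1, model_2])
text = greedy_decode(fused, alphabet)
```

In this toy case the averaged matrix favours "a", then blank, then "b", so the fused readout is "ab". In the training scheme described in the abstract, a loss on such a fused output would be combined with the individual model losses.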
