
Orthographic Case Restoration Using Supervised Learning without Manual Annotation

by: Cheng Niu, Wei Li, Jihong, and Rohini Shrihari
In: International Journal on Artificial Intelligence Tools, Vol. 13, No. 1 (2004), pp. 141-156.

Abstract

One challenge in text processing is the treatment of case-insensitive documents such as speech recognition results. The traditional approach is to re-train a language model excluding case-related features. This paper presents an alternative two-step approach whereby a preprocessing module (Step 1) is designed to restore the case-sensitive form, which is subsequently processed by the original system (Step 2). Step 1 is mainly implemented as a Hidden Markov Model trained on a large raw corpus of case-sensitive documents. It is demonstrated that this approach (i) outperforms the feature exclusion approach for named entity tagging, (ii) leads to limited degradation for parsing, relationship extraction and case-insensitive question answering, (iii) reduces system complexity, and (iv) has wide applicability: the restored text can be used in both statistical models and rule-based systems.
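
To make the idea of Step 1 concrete, the following is a minimal sketch of HMM-style case restoration, not the authors' implementation. It assumes hidden states are the cased word forms seen in training, each state emits its lowercased form with probability 1, and transitions are add-alpha smoothed bigram estimates. The names `train`, `log_trans` and `restore_case`, the `alpha` value, and the choice of bigram transitions are all illustrative assumptions; the abstract does not specify any of them.

```python
import math
from collections import Counter, defaultdict

START = "<s>"  # hypothetical sentence-start symbol

def train(sentences):
    """Collect counts from case-sensitive sentences (lists of tokens)."""
    unigrams, bigrams = Counter(), Counter()
    variants = defaultdict(set)  # lowercased token -> cased forms seen in training
    for sent in sentences:
        prev = START
        unigrams[prev] += 1
        for tok in sent:
            unigrams[tok] += 1
            bigrams[(prev, tok)] += 1
            variants[tok.lower()].add(tok)
            prev = tok
    return unigrams, bigrams, variants

def log_trans(prev, cur, unigrams, bigrams, alpha=0.1):
    """Add-alpha smoothed estimate of log P(cur | prev) (assumed smoothing)."""
    vocab = len(unigrams)
    return math.log((bigrams[(prev, cur)] + alpha) / (unigrams[prev] + alpha * vocab))

def restore_case(tokens, model):
    """Viterbi search over cased candidates for each case-insensitive token."""
    unigrams, bigrams, variants = model
    best = {START: (0.0, [])}  # state -> (log score, best path ending in state)
    for tok in tokens:
        low = tok.lower()
        candidates = variants.get(low, {low})  # unseen words stay lowercase
        new_best = {}
        for cand in candidates:
            score, path = max(
                ((s + log_trans(prev, cand, unigrams, bigrams), p)
                 for prev, (s, p) in best.items()),
                key=lambda item: item[0],
            )
            new_best[cand] = (score, path + [cand])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]
```

In use, the model would be trained once on raw case-sensitive text with `train(...)`, and `restore_case(tokens, model)` would then be applied to each case-insensitive input before passing the result to the unchanged downstream system (Step 2). No manual annotation is needed because the training labels are simply the original casing of the corpus.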
