
LT TTT --- A Flexible Tokenisation Tool

by: Claire Grover, Colin Matheson, Andrei Mikheev, and Marc Moens
In: Proc. LREC 2000 (2000).

Abstract

We describe LT TTT, a recently developed software system which provides tools to perform text tokenisation and mark-up. The system includes ready-made components to segment text into paragraphs, sentences, words and other kinds of token but, crucially, it also allows users to tailor rule-sets to produce mark-up appropriate for particular applications. We present three case studies of our use of LT TTT: named-entity recognition (MUC-7), citation recognition and mark-up, and the preparation of a corpus in the medical domain. We conclude with a discussion of the use of browsers to visualise marked-up text.
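
To make the idea of rule-driven tokenisation and mark-up concrete, here is a minimal Python sketch. It is not LT TTT itself and does not use its rule syntax. The rule table, the element names (P, S, W, NUM, PUNCT) and all function names are assumptions chosen for illustration, and the sentence splitter is deliberately naive.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical rule set: (element name, regular expression).
# Editing this table changes the mark-up produced; it is NOT LT TTT's format.
TOKEN_RULES = [
    ("NUM", r"\d+(?:\.\d+)?"),
    ("W", r"[A-Za-z]+(?:[-'][A-Za-z]+)*"),
    ("PUNCT", r"[^\w\s]"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_RULES))


def split_paragraphs(text):
    """Split on blank lines and drop empty paragraphs."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def split_sentences(paragraph):
    """Naive split: after . ! or ? when followed by whitespace and a capital."""
    return re.split(r"(?<=[.!?])\s+(?=[A-Z])", paragraph)


def mark_up(text):
    """Wrap paragraphs, sentences and tokens in XML-style elements."""
    lines = []
    for para in split_paragraphs(text):
        lines.append("<P>")
        for sent in split_sentences(para):
            tokens = "".join(
                f"<{m.lastgroup}>{escape(m.group())}</{m.lastgroup}>"
                for m in TOKEN_RE.finditer(sent)
            )
            lines.append(f"<S>{tokens}</S>")
        lines.append("</P>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(mark_up("It works. Cost: 3.5 units.\n\nNew para"))
```

Adding or reordering entries in the hypothetical TOKEN_RULES table is the stand-in here for tailoring a rule-set to a particular application; a real system would need far more careful handling of abbreviations, hyphenation and sentence boundaries.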
