Seeing more than whitespace—Tokenisation and disambiguation in a North Sámi grammar checker

L. Wiechetek, K. Unhammer, and S. Moshagen. Workshop on the Use of Computational Methods in the Study of Endangered Languages, 1, page 46. ComputEL, (2019)

Abstract

Communities of lesser-resourced languages like North Sámi benefit from language tools such as spell checkers and grammar checkers to improve literacy. Accurate error feedback depends on well-tokenised input, but traditional tokenisation as shallow preprocessing is inadequate for the challenges of real-world language usage. We present an alternative where tokenisation remains ambiguous until linguistic context information is available. This lets us accurately detect sentence boundaries, multiwords and compound errors. We describe a North Sámi grammar checker with such a tokenisation system, and show the results of its evaluation.
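
The idea of keeping tokenisation ambiguous until context is available can be illustrated with a small sketch. This is a toy example only, not the system described in the paper: the abbreviation list, the function names `ambiguous_tokens` and `disambiguate`, and the single context rule are all hypothetical. A word that might be an abbreviation keeps two readings (abbreviation, or word plus sentence-final period), and the choice is made only once the following word can be inspected.

```python
from typing import List, Optional

# Hypothetical toy abbreviation list; not taken from the paper.
ABBREVIATIONS = {"nr.", "etc."}


def ambiguous_tokens(text: str) -> List[List[List[str]]]:
    """Split on whitespace, keeping every candidate reading for each word."""
    lattice = []
    for word in text.split():
        if word in ABBREVIATIONS:
            # Two readings: one abbreviation token, or word + final period.
            lattice.append([[word], [word[:-1], "."]])
        elif word.endswith(".") and len(word) > 1:
            lattice.append([[word[:-1], "."]])
        else:
            lattice.append([[word]])
    return lattice


def disambiguate(lattice: List[List[List[str]]]) -> List[str]:
    """Resolve ambiguous readings using the following word as context."""
    tokens: List[str] = []
    for i, readings in enumerate(lattice):
        if len(readings) == 1:
            tokens.extend(readings[0])
            continue
        next_word: Optional[str] = (
            lattice[i + 1][0][0] if i + 1 < len(lattice) else None
        )
        # Toy rule (an assumption): before a digit the period belongs to
        # the abbreviation; otherwise it is read as ending the sentence.
        keep_abbreviation = next_word is not None and next_word[0].isdigit()
        tokens.extend(readings[0] if keep_abbreviation else readings[1])
    return tokens


result = disambiguate(ambiguous_tokens("See nr. 5 etc. Then stop."))
```

With this input, `nr.` stays one token because a digit follows, while the period after `etc` is split off as a sentence boundary. A real grammar checker would rely on much richer linguistic context than one look-ahead rule.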

Links and resources

BibTeX key: wiechetek2019seeing
entry type: inproceedings
booktitle: Workshop on the Use of Computational Methods in the Study of Endangered Languages
year: 2019
organization: ComputEL
pages: 46
volume: 1
venue: Honolulu, Hawai’i
url: https://computel-workshop.org/wp-content/uploads/2019/02/CEL3_book_papers_draft.pdf#page=58
