
Discovering Compound and Proper Nouns

G. Protaziuk, M. Kryszkiewicz, H. Rybinski, and A. Delteil. Proceedings of the RSEISP'07 International Conference on Rough Sets and Emerging Intelligent Systems Paradigms, pp. 505--515. (2007)

Abstract

The identification of appropriate text tokens (words or sequences of words representing concepts) is one of the most important tasks of text preprocessing and may have great influence on the final results of text analysis. In our paper, we introduce a new approach to discovering compound nouns, including proper compound nouns. Our approach combines the data mining methods with shallow lexical analysis. We propose a simple pattern language for specifying grammatical patterns to be satisfied by extracted compound nouns. Our method requires annotating the words with part of speech tags, thus to this extent, it is language-dependent. Based on the data mining GSP algorithm, we propose T-GSP as its modification for extracting frequent text patterns, and in particular, frequent word sequences that satisfy given grammatical rules. The obtained sequences are regarded as candidates for compound nouns. The experiments have proven very high quality of the method.
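
The core idea in the abstract (keep word sequences that are both frequent and consistent with a grammatical pattern over part-of-speech tags) can be illustrated with a small sketch. This is not T-GSP and not the paper's pattern language: it is a brute-force count of contiguous word sequences, with the grammatical constraint written as an ordinary regular expression over one-letter tag codes. The tag names (NOUN, PROPN, ADJ), the function name frequent_candidates, the default pattern, and the thresholds are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical one-letter codes for POS tags (an assumption, not the paper's notation).
TAG_CODES = {"NOUN": "N", "PROPN": "P", "ADJ": "A"}


def frequent_candidates(sentences, pattern=r"[AN]*N|P{2,}", min_support=2, max_len=4):
    """Return contiguous word sequences (2..max_len words) whose tag string
    fully matches `pattern` and that occur in at least `min_support` sentences.

    `sentences` is a list of sentences, each a list of (word, tag) pairs.
    """
    compiled = re.compile(pattern)
    support = Counter()
    for sentence in sentences:
        # Encode the sentence's tags as a string; unknown tags become "x".
        codes = "".join(TAG_CODES.get(tag, "x") for _, tag in sentence)
        seen = set()  # count each sequence at most once per sentence
        for start in range(len(sentence)):
            last_end = min(start + max_len, len(sentence))
            for end in range(start + 2, last_end + 1):
                if compiled.fullmatch(codes[start:end]):
                    words = tuple(word.lower() for word, _ in sentence[start:end])
                    seen.add(words)
        support.update(seen)
    return {seq: count for seq, count in support.items() if count >= min_support}


# Tiny hand-tagged corpus, invented for this example.
corpus = [
    [("We", "PRON"), ("use", "VERB"), ("data", "NOUN"), ("mining", "NOUN"), ("methods", "NOUN")],
    [("Data", "NOUN"), ("mining", "NOUN"), ("is", "VERB"), ("useful", "ADJ")],
    [("New", "PROPN"), ("York", "PROPN"), ("is", "VERB"), ("big", "ADJ")],
    [("She", "PRON"), ("visited", "VERB"), ("New", "PROPN"), ("York", "PROPN")],
]

candidates = frequent_candidates(corpus)
print(candidates)
```

On this toy corpus only "data mining" and "new york" reach the support threshold of two sentences; "data mining methods" matches the pattern but occurs once, so it is dropped. A GSP-style algorithm would instead grow candidates level by level and prune infrequent prefixes, which matters on real corpora where enumerating every span is too costly.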

Links and resources

BibTeX key: Protaziuk:EtAl:07
entry type: inproceedings
booktitle: Proceedings of the RSEISP'07 International Conference on Rough Sets and Emerging Intelligent Systems Paradigms
year: 2007
pages: 505--515
url: http://dx.doi.org/10.1007/978-3-540-73451-2_53


