
Snowball: extracting relations from large plain-text collections

E. Agichtein and L. Gravano. Proceedings of the fifth ACM conference on Digital libraries, pp. 85--94. New York, NY, USA, ACM (2000)
DOI: 10.1145/336597.336644

Abstract

Text documents often contain valuable structured data that is hidden in regular English sentences. This data is best exploited if available as a relational table that we could use for answering precise queries or running data mining tasks. We explore a technique for extracting such tables from document collections that requires only a handful of training examples from users. These examples are used to generate extraction patterns, that in turn result in new tuples being extracted from the document collection. We build on this idea and present our Snowball system. Snowball introduces novel strategies for generating patterns and extracting tuples from plain-text documents. At each iteration of the extraction process, Snowball evaluates the quality of these patterns and tuples without human intervention, and keeps only the most reliable ones for the next iteration. In this paper we also develop a scalable evaluation methodology and metrics for our task, and present a thorough experimental evaluation of Snowball and comparable techniques over a collection of more than 300,000 newspaper documents.
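
The loop the abstract describes (seed tuples, pattern generation, tuple extraction, unsupervised quality check, next iteration) can be illustrated with a toy sketch. This is not the paper's implementation: the function name `bootstrap`, the pre-parsed `(organization, middle text, location)` triples, the exact-string patterns, the scoring formulas, and the `min_conf` threshold are all assumptions made for illustration.

```python
from collections import defaultdict


def bootstrap(occurrences, seeds, iterations=2, min_conf=0.6):
    """Toy bootstrapping loop (illustrative only, hypothetical names).

    occurrences: list of (org, middle_text, loc) triples found in text.
    seeds: dict mapping org -> loc for the user-supplied example tuples.
    """
    known = dict(seeds)
    for _ in range(iterations):
        # 1. Generate patterns: contexts that connect an already known pair.
        patterns = {mid for org, mid, loc in occurrences
                    if known.get(org) == loc}

        # 2. Score patterns without human input: a match that agrees with
        #    a known tuple is positive, one that contradicts it is negative.
        pos, neg = defaultdict(int), defaultdict(int)
        for org, mid, loc in occurrences:
            if mid in patterns and org in known:
                if known[org] == loc:
                    pos[mid] += 1
                else:
                    neg[mid] += 1
        conf = {p: pos[p] / (pos[p] + neg[p]) for p in patterns}

        # 3. Extract candidate tuples and combine the evidence of every
        #    pattern that produced them: 1 - prod(1 - conf(pattern)).
        miss = defaultdict(lambda: 1.0)
        for org, mid, loc in occurrences:
            if mid in conf:
                miss[(org, loc)] *= 1.0 - conf[mid]

        # 4. Keep only the most reliable new tuples for the next iteration.
        for (org, loc), m in miss.items():
            if org not in known and 1.0 - m >= min_conf:
                known[org] = loc
    return known


# Made-up example data, not taken from the paper's corpus.
occurrences = [
    ("Microsoft", "is based in", "Redmond"),
    ("Exxon", "is based in", "Irving"),
    ("Intel", "is based in", "Santa Clara"),
    ("Boeing", ", headquartered in", "Seattle"),
    ("Microsoft", "opened an office in", "Paris"),
    ("Apple", "opened an office in", "Austin"),
    ("Exxon", "opened an office in", "Irving"),
]
seeds = {"Microsoft": "Redmond", "Exxon": "Irving"}
found = bootstrap(occurrences, seeds)
print(found)
```

In this toy run the context "is based in" never contradicts a seed, so the Intel tuple it yields is accepted. The context "opened an office in" contradicts the Microsoft seed once, which lowers its score enough that the Apple tuple is rejected at the assumed threshold.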

Description

Snowball

Links and resources

BibTeX key: snowball
entry type: inproceedings
address: New York, NY, USA
book title: Proceedings of the fifth ACM conference on Digital libraries
year: 2000
pages: 85--94
publisher: ACM
series: DL '00
location: San Antonio, Texas, United States
acmid: 336644
isbn: 1-58113-231-X
numpages: 10
DOI: 10.1145/336597.336644
Document: http://www.mathcs.emory.edu/~eugene/papers/dl00.pdf


Cite this publication

@inproceedings{snowball,
  abstract = {Text documents often contain valuable structured data that is hidden in regular English sentences. This data is best exploited if available as a relational table that we could use for answering precise queries or running data mining tasks. We explore a technique for extracting such tables from document collections that requires only a handful of training examples from users. These examples are used to generate extraction patterns, that in turn result in new tuples being extracted from the document collection. We build on this idea and present our Snowball system. Snowball introduces novel strategies for generating patterns and extracting tuples from plain-text documents. At each iteration of the extraction process, Snowball evaluates the quality of these patterns and tuples without human intervention, and keeps only the most reliable ones for the next iteration. In this paper we also develop a scalable evaluation methodology and metrics for our task, and present a thorough experimental evaluation of Snowball and comparable techniques over a collection of more than 300,000 newspaper documents.},
  acmid = {336644},
  added-at = {2012-11-04T18:25:26.000+0100},
  address = {New York, NY, USA},
  author = {Agichtein, Eugene and Gravano, Luis},
  biburl = {https://www.bibsonomy.org/bibtex/2ded86c7c83a0efc16dcfc34d5644298a/jil},
  booktitle = {Proceedings of the fifth ACM conference on Digital libraries},
  description = {Snowball},
  doi = {10.1145/336597.336644},
  interhash = {42f11ae1df7b57d5e4333a47367ad2b4},
  intrahash = {ded86c7c83a0efc16dcfc34d5644298a},
  isbn = {1-58113-231-X},
  keywords = {based extraction relation seed semi snowball supervised},
  location = {San Antonio, Texas, United States},
  numpages = {10},
  pages = {85--94},
  publisher = {ACM},
  series = {DL '00},
  timestamp = {2013-11-23T20:11:51.000+0100},
  title = {Snowball: extracting relations from large plain-text collections},
  url = {http://www.mathcs.emory.edu/~eugene/papers/dl00.pdf},
  year = 2000
}


Metadata

Last modified 11 years ago
Created 12 years ago
