Entity Extraction and Consolidation for Social Web Content Preservation.

S. Dietze, D. Maynard, E. Demidova, T. Risse, W. Peters, K. Doka, and Y. Stavrakas. 2nd International Workshop on Semantic Digital Archives, volume 912 of CEUR Workshop Proceedings, pages 18-29. CEUR-WS.org, 2012.

Abstract

With the rapidly increasing pace at which Web content is evolving, particularly social media, preserving the Web and its evolution over time becomes an important challenge. Meaningful analysis of Web content lends itself to an entity-centric view to organise Web resources according to the information objects related to them. Therefore, the crucial challenge is to extract, detect and correlate entities from a vast number of heterogeneous Web resources where the nature and quality of the content may vary heavily. While a wealth of information extraction tools aid this process, we believe that the consolidation of automatically extracted data has to be treated as an equally important step in order to ensure high quality and non-ambiguity of generated data. In this paper we present an approach which is based on an iterative cycle exploiting Web data for (1) targeted archiving/crawling of Web objects, (2) entity extraction and detection, and (3) entity correlation. The long-term goal is to preserve Web content over time and allow its navigation and analysis based on well-formed structured RDF data about entities.

Links and resources

BibTeX key: conf/ercimdl/DietzeMDRPDS12
entry type: inproceedings
booktitle: 2nd International Workshop on Semantic Digital Archives
year: 2012
pages: 18-29
publisher: CEUR-WS.org
series: CEUR Workshop Proceedings
volume: 912
crossref: conf/ercimdl/2012sda
ee: http://ceur-ws.org/Vol-912/paper1.pdf
url: http://dblp.uni-trier.de/db/conf/ercimdl/sda2012.html#DietzeMDRPDS12

Cite this publication

@inproceedings{conf/ercimdl/DietzeMDRPDS12,
  abstract = {With the rapidly increasing pace at which Web content is evolving, particularly social media, preserving the Web and its evolution over time becomes an important challenge. Meaningful analysis of Web content lends itself to an entity-centric view to organise Web resources according to the information objects related to them. Therefore, the crucial challenge is to extract, detect and correlate entities from a vast number of heterogeneous Web resources where the nature and quality of the content may vary heavily. While a wealth of information extraction tools aid this process, we believe that the consolidation of automatically extracted data has to be treated as an equally important step in order to ensure high quality and non-ambiguity of generated data. In this paper we present an approach which is based on an iterative cycle exploiting Web data for (1) targeted archiving/crawling of Web objects, (2) entity extraction and detection, and (3) entity correlation. The long-term goal is to preserve Web content over time and allow its navigation and analysis based on well-formed structured RDF data about entities.},
  added-at = {2012-12-05T23:30:38.000+0100},
  author = {Dietze, Stefan and Maynard, Diana and Demidova, Elena and Risse, Thomas and Peters, Wim and Doka, Katerina and Stavrakas, Yannis},
  biburl = {https://www.bibsonomy.org/bibtex/2120e281fc31faf6b8a2d5a2ac6ed3ef7/demidova},
  booktitle = {2nd International Workshop on Semantic Digital Archives},
  crossref = {conf/ercimdl/2012sda},
  editor = {Mitschick, Annett and Loizides, Fernando and Predoiu, Livia and Nürnberger, Andreas and Ross, Seamus},
  ee = {http://ceur-ws.org/Vol-912/paper1.pdf},
  interhash = {444586959c487f283fd657dbfa433f43},
  intrahash = {120e281fc31faf6b8a2d5a2ac6ed3ef7},
  keywords = {arcomem dbpedia enrichment freebase myown terence},
  pages = {18-29},
  publisher = {CEUR-WS.org},
  series = {CEUR Workshop Proceedings},
  timestamp = {2013-11-29T22:28:30.000+0100},
  title = {Entity Extraction and Consolidation for Social Web Content Preservation.},
  url = {http://dblp.uni-trier.de/db/conf/ercimdl/sda2012.html#DietzeMDRPDS12},
  volume = 912,
  year = 2012
}
