Image Analytics in Web Archives

E. Müller-Budack, K. Pustu-Iren, S. Diering, M. Springstein, and R. Ewerth. In: The Past Web: Exploring Web Archives, pages 141--151. Springer International Publishing, Cham, (2021)
DOI: 10.1007/978-3-030-63291-5_11

Abstract

The multimedia content published on the World Wide Web is constantly growing and contains valuable information in various domains. The Internet Archive initiative has gathered billions of time-versioned web pages since the mid-nineties, but unfortunately, they are rarely provided with appropriate metadata. This lack of structured data limits the exploration of the archives, and automated solutions are required to enable semantic search. While many approaches exploit the textual content of news in the Internet Archive to detect named entities and their relations, visual information is generally disregarded. In this chapter, we present an approach that leverages deep learning techniques for the identification of public personalities in the images of news articles stored in the Internet Archive. In addition, we elaborate on how this approach can be extended to enable detection of other entity types such as locations or events. The approach complements named entity recognition and linking tools for text and allows researchers and analysts to track the media coverage and relations of persons more precisely. We have analysed more than one million images from news articles in the Internet Archive and demonstrated the feasibility of the approach with two use cases in different domains: politics and entertainment.

Description

Image Analytics in Web Archives | SpringerLink

Cite this publication

@inbook{Müller-Budack2021,
  abstract = {The multimedia content published on the World Wide Web is constantly growing and contains valuable information in various domains. The Internet Archive initiative has gathered billions of time-versioned web pages since the mid-nineties, but unfortunately, they are rarely provided with appropriate metadata. This lack of structured data limits the exploration of the archives, and automated solutions are required to enable semantic search. While many approaches exploit the textual content of news in the Internet Archive to detect named entities and their relations, visual information is generally disregarded. In this chapter, we present an approach that leverages deep learning techniques for the identification of public personalities in the images of news articles stored in the Internet Archive. In addition, we elaborate on how this approach can be extended to enable detection of other entity types such as locations or events. The approach complements named entity recognition and linking tools for text and allows researchers and analysts to track the media coverage and relations of persons more precisely. We have analysed more than one million images from news articles in the Internet Archive and demonstrated the feasibility of the approach with two use cases in different domains: politics and entertainment.},
  added-at = {2024-03-04T15:47:23.000+0100},
  address = {Cham},
  author = {M{\"u}ller-Budack, Eric and Pustu-Iren, Kader and Diering, Sebastian and Springstein, Matthias and Ewerth, Ralph},
  biburl = {https://www.bibsonomy.org/bibtex/205a19edc124ac9672a219656f9d2f0ab/ericmb},
  booktitle = {The Past Web: Exploring Web Archives},
  description = {Image Analytics in Web Archives | SpringerLink},
  doi = {10.1007/978-3-030-63291-5_11},
  editor = {Gomes, Daniel and Demidova, Elena and Winters, Jane and Risse, Thomas},
  interhash = {a25c1ea7aeeedec089dd67e81b756185},
  intrahash = {05a19edc124ac9672a219656f9d2f0ab},
  isbn = {978-3-030-63291-5},
  keywords = {myown},
  pages = {141--151},
  publisher = {Springer International Publishing},
  timestamp = {2024-03-04T15:47:23.000+0100},
  title = {Image Analytics in Web Archives},
  url = {https://doi.org/10.1007/978-3-030-63291-5_11},
  year = 2021
}
