
Improving Visual Relationship Detection Using Semantic Modeling of Scene Descriptions

Stephan Baier, Yunpu Ma, and Volker Tresp. Pages 53--68. Springer International Publishing, Cham, 2017.
DOI: 10.1007/978-3-319-68288-4_4

Abstract

Structured scene descriptions of images are useful for the automatic processing and querying of large image databases. We show how the combination of a statistical semantic model and a visual model can improve on the task of mapping images to their associated scene descriptions. In this paper we consider scene descriptions which are represented as a set of triples (subject, predicate, object), where each triple consists of a pair of visual objects appearing in the image and the relationship between them (e.g., man-riding-elephant, man-wearing-hat). We combine a standard visual model for object detection, based on convolutional neural networks, with a latent variable model for link prediction. We apply multiple state-of-the-art link prediction methods and compare their capability for visual relationship detection. One of the main advantages of link prediction methods is that they can also generalize to triples which have never been observed in the training data. Our experimental results on the recently published Stanford Visual Relationship dataset, a challenging real-world dataset, show that integrating a statistical semantic model using link prediction methods can significantly improve visual relationship detection. Our combined approach achieves superior performance compared to the state-of-the-art method from the Stanford computer vision group.
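To make the described combination concrete, the following is a minimal sketch assuming a DistMult-style scoring function, one of the latent variable link-prediction models commonly applied to (subject, predicate, object) triples. The embeddings, the logistic squashing, and the product-fusion rule below are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch: fusing a visual detector's confidence with a
# DistMult-style link-prediction score for (subject, predicate, object)
# triples. Embeddings are randomly initialized here for illustration;
# in practice they would be learned from observed training triples.
import numpy as np

rng = np.random.default_rng(0)

ENTITIES = ["man", "elephant", "hat"]   # visual object classes
PREDICATES = ["riding", "wearing"]      # relationship types
DIM = 16                                # latent embedding size (assumed)

E = {e: rng.normal(size=DIM) for e in ENTITIES}    # entity embeddings
R = {p: rng.normal(size=DIM) for p in PREDICATES}  # predicate embeddings

def distmult_score(subj: str, pred: str, obj: str) -> float:
    """Trilinear DistMult score <e_s, r_p, e_o>; higher = more plausible."""
    return float(np.sum(E[subj] * R[pred] * E[obj]))

def semantic_prob(subj: str, pred: str, obj: str) -> float:
    """Map the raw score to (0, 1) with the logistic function."""
    return 1.0 / (1.0 + np.exp(-distmult_score(subj, pred, obj)))

def combined_score(p_visual: float, subj: str, pred: str, obj: str) -> float:
    """Fuse detector confidence for the object pair with the semantic
    plausibility of the triple (simple product fusion, an assumption)."""
    return p_visual * semantic_prob(subj, pred, obj)

# Usage: re-rank two candidate predicates for the same detected object pair.
for triple in [("man", "riding", "elephant"), ("man", "wearing", "elephant")]:
    print(triple, round(combined_score(0.9, *triple), 4))
```

Because entity and predicate embeddings are shared across all triples, such a model can assign a plausibility score to triples never observed during training, which is the generalization property the abstract highlights.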

