Article

Multimodal Image Retrieval

S. Romberg, R. Lienhart, and E. Hörster.
International Journal of Multimedia Information Retrieval, 1 (1): 31-44 (2012)
DOI: 10.1007/s13735-012-0006-4

Abstract

In this work, we extend the standard single-layer probabilistic Latent Semantic Analysis (pLSA) (Hofmann in Mach Learn 42(1-2):177-196, 2001) to multiple layers. As multiple layers should naturally handle multiple modalities and a hierarchy of abstractions, we denote this new approach multilayer multimodal probabilistic Latent Semantic Analysis (mm-pLSA). We derive the training and inference rules for the smallest possible non-degenerated mm-pLSA model: a model with two leaf-pLSAs and a single top-level pLSA node merging the two leaf-pLSAs. We evaluate this approach on two pairs of different modalities: SIFT features and image annotations (tags) as well as the combination of SIFT and HOG features. We also propose a fast and strictly stepwise forward procedure to initialize the bottom-up mm-pLSA model, which in turn can then be post-optimized by the general mm-pLSA learning algorithm. The proposed approach is evaluated in a query-by-example retrieval task where various variants of our mm-pLSA system are compared to systems relying on a single modality and other ad-hoc combinations of feature histograms. We further describe possible pitfalls of the mm-pLSA training and analyze the resulting model yielding an intuitive explanation of its behaviour.

BibTeX key: RombergLienhartHoerster12ijmir
entry type: article
year: 2012
journal: International Journal of Multimedia Information Retrieval
number: 1
pages: 31-44
volume: 1
file: SpringerLink:2012/.pdf:PDF
issn: 2192-6611
groups: public
intrahash: e4b1626be7b48b9a587de9b90cca3375
DOI: 10.1007/s13735-012-0006-4
timestamp: 2012.05.18
username: flint63

Cite this publication

@article{RombergLienhartHoerster12ijmir,
  abstract = {In this work, we extend the standard single-layer probabilistic Latent Semantic Analysis (pLSA) (Hofmann in Mach Learn 42(1-2):177-196, 2001) to multiple layers. As multiple layers should naturally handle multiple modalities and a hierarchy of abstractions, we denote this new approach multilayer multimodal probabilistic Latent Semantic Analysis (mm-pLSA). We derive the training and inference rules for the smallest possible non-degenerated mm-pLSA model: a model with two leaf-pLSAs and a single top-level pLSA node merging the two leaf-pLSAs. We evaluate this approach on two pairs of different modalities: SIFT features and image annotations (tags) as well as the combination of SIFT and HOG features. We also propose a fast and strictly stepwise forward procedure to initialize the bottom-up mm-pLSA model, which in turn can then be post-optimized by the general mm-pLSA learning algorithm. The proposed approach is evaluated in a query-by-example retrieval task where various variants of our mm-pLSA system are compared to systems relying on a single modality and other ad-hoc combinations of feature histograms. We further describe possible pitfalls of the mm-pLSA training and analyze the resulting model yielding an intuitive explanation of its behaviour.},
  added-at = {2012-05-30T10:53:07.000+0200},
  author = {Romberg, Stefan and Lienhart, Rainer and H\"{o}rster, Eva},
  biburl = {https://www.bibsonomy.org/bibtex/2e4b1626be7b48b9a587de9b90cca3375/flint63},
  doi = {10.1007/s13735-012-0006-4},
  file = {SpringerLink:2012/.pdf:PDF},
  groups = {public},
  interhash = {a6e0cf7b70cb389a5f62c44f96b73064},
  intrahash = {e4b1626be7b48b9a587de9b90cca3375},
  issn = {2192-6611},
  journal = {International Journal of Multimedia Information Retrieval},
  keywords = {v1205 springer paper ai multimedia information retrieval image video pattern recognition tagging learn algorithm test},
  number = 1,
  pages = {31-44},
  timestamp = {2018-04-16T12:25:16.000+0200},
  title = {Multimodal Image Retrieval},
  username = {flint63},
  volume = 1,
  year = 2012
}
