Supervised models for multimodal image retrieval based on visual, semantic and geographic information

Duc Tien Dang-Nguyen, Giulia Boato, Alessandro Moschitti, Francesco G.B. De Natale

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Nowadays, large-scale networked social media need better search technologies to achieve suitable performance. Multimodal approaches are promising technologies to improve image ranking. This is particularly true when metadata are not completely reliable, which is a rather common case as far as user annotation, time and location are concerned. In this paper, we propose to properly combine visual information with additional multi-faceted information, to define a novel multimodal similarity measure. More specifically, we combine visual features, which strongly relate to the image content, with semantic information represented by manually annotated concepts, and geo-tagging, very often available in the form of object/subject location. Furthermore, we propose a supervised machine learning approach, based on Support Vector Machines (SVMs), to automatically learn optimized weights to combine the above features. The resulting model is used as a ranking function to sort the results of a multimodal query.
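
The abstract does not specify the features, kernel, or training data, so the following is only a minimal sketch of the general idea: compute one similarity per modality, learn combination weights with a linear SVM, and sort candidates by the weighted score. Every concrete choice here is an assumption made for illustration and is not taken from the paper: cosine similarity for visual vectors, Jaccard overlap for concept sets, an exponential decay with a 100 km scale for location, a hand-rolled subgradient-descent SVM, and all function names and toy data.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(min(1.0, math.sqrt(h)))

def similarity_features(query, image):
    """Return [visual, semantic, geographic] similarities (assumed measures)."""
    # Visual: cosine similarity between feature vectors.
    dot = sum(x * y for x, y in zip(query["visual"], image["visual"]))
    norm = (math.sqrt(sum(x * x for x in query["visual"]))
            * math.sqrt(sum(y * y for y in image["visual"])))
    visual = dot / norm if norm else 0.0
    # Semantic: Jaccard overlap of the annotated concept sets.
    union = query["concepts"] | image["concepts"]
    semantic = len(query["concepts"] & image["concepts"]) / len(union) if union else 0.0
    # Geographic: exponential decay of distance (100 km scale is hypothetical).
    geo = math.exp(-haversine_km(query["geo"], image["geo"]) / 100.0)
    return [visual, semantic, geo]

def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Linear SVM via subgradient descent on the L2-regularized hinge loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            w = [wj * (1.0 - lr * lam) for wj in w]  # regularization step
            if margin < 1:  # hinge loss is active for this example
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b

def rank(query, images, w):
    """Sort image ids by the learned weighted similarity, best first."""
    scored = [(sum(wj * fj for wj, fj in zip(w, similarity_features(query, img))),
               img["id"]) for img in images]
    return [image_id for _, image_id in sorted(scored, reverse=True)]

# Toy training rows: similarity triples labeled relevant (+1) or not (-1).
X = [[0.9, 0.8, 0.9], [0.8, 1.0, 0.7], [0.1, 0.0, 0.2], [0.2, 0.1, 0.0]]
y = [1, 1, -1, -1]
w, b = train_linear_svm(X, y)

# Toy query and candidates (all values invented for the example).
query = {"visual": [1.0, 0.0, 0.5], "concepts": {"lake", "mountain"}, "geo": (45.90, 6.13)}
images = [
    {"id": "far", "visual": [0.0, 1.0, 0.0], "concepts": {"city"}, "geo": (35.68, 139.69)},
    {"id": "mid", "visual": [0.5, 0.5, 0.5], "concepts": {"lake"}, "geo": (46.20, 6.15)},
    {"id": "near", "visual": [1.0, 0.0, 0.5], "concepts": {"lake", "mountain"}, "geo": (45.90, 6.13)},
]
ranking = rank(query, images, w)
```

The bias term `b` is learned but left out of `rank`, since adding the same constant to every score does not change the ordering. A library SVM could replace the hand-rolled trainer; it is written out here only to keep the sketch dependency-free.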

Original language: English
Title of host publication: 2012 10th International Workshop on Content-Based Multimedia Indexing, CBMI 2012
Pages: 206-210
Number of pages: 5
DOIs: https://doi.org/10.1109/CBMI.2012.6269806
Publication status: Published - 1 Oct 2012
Event: 2012 10th International Workshop on Content-Based Multimedia Indexing, CBMI 2012 - Annecy, Haute-Savoie, France
Duration: 27 Jun 2012 to 29 Jun 2012

Publication series

Name: Proceedings - International Workshop on Content-Based Multimedia Indexing
ISSN (Print): 1949-3991

Other

Other: 2012 10th International Workshop on Content-Based Multimedia Indexing, CBMI 2012
Country: France
City: Annecy, Haute-Savoie
Period: 27/6/12 to 29/6/12

ASJC Scopus subject areas

  • Computer Graphics and Computer-Aided Design
  • Information Systems

Cite this

Dang-Nguyen, D. T., Boato, G., Moschitti, A., & De Natale, F. G. B. (2012). Supervised models for multimodal image retrieval based on visual, semantic and geographic information. In 2012 10th International Workshop on Content-Based Multimedia Indexing, CBMI 2012 (pp. 206-210). [6269806] (Proceedings - International Workshop on Content-Based Multimedia Indexing). https://doi.org/10.1109/CBMI.2012.6269806