In this paper we present an efficient and accurate method to aggregate a set of Deep Convolutional Neural Network (CNN) responses, extracted from a set of image windows. CNN features are usually computed on the whole frame or with a dense multi scale approach. There is evidence that using multiple windows yields a better image representation nonetheless it is still not clear how windows should be sam- pled and how CNN responses should be aggregated. Instead of sampling the image densely in scale and space we show that selecting a few hundred windows is enough to obtain an effective image signature. We show how to use Fisher Vectors and PCA to obtain a short and highly descriptive signature that can be used effectively for image retrieval. We test our method on two relevant computer vision tasks: image retrieval and image tagging. We report state-of-the art results for both tasks on three standard datasets.

Fisher Encoded Convolutional Bag-of-Windows for Efficient Image Retrieval and Social Image Tagging

URICCHIO, TIBERIO;
2015-01-01

Abstract

In this paper we present an efficient and accurate method to aggregate a set of Deep Convolutional Neural Network (CNN) responses, extracted from a set of image windows. CNN features are usually computed on the whole frame or with a dense multi scale approach. There is evidence that using multiple windows yields a better image representation nonetheless it is still not clear how windows should be sam- pled and how CNN responses should be aggregated. Instead of sampling the image densely in scale and space we show that selecting a few hundred windows is enough to obtain an effective image signature. We show how to use Fisher Vectors and PCA to obtain a short and highly descriptive signature that can be used effectively for image retrieval. We test our method on two relevant computer vision tasks: image retrieval and image tagging. We report state-of-the art results for both tasks on three standard datasets.
2015
978-1-4673-9711-7
File in questo prodotto:
File Dimensione Formato  
vsm_cnn_fv.pdf

solo utenti autorizzati

Licenza: Copyright dell'editore
Dimensione 2.82 MB
Formato Adobe PDF
2.82 MB Adobe PDF   Visualizza/Apri   Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11393/313317
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 19
  • ???jsp.display-item.citation.isi??? 16
social impact