String representations and distances in deep Convolutional Neural Networks for image classification

Abstract: Recent advances in image classification mostly rely on the use of powerful local features combined with an adapted image representation. Although Convolutional Neural Network (CNN) features learned from ImageNet were shown to be generic and very efficient, they still lack flexibility to take into account variations in the spatial layout of visual elements. In this paper, we investigate the use of structural representations on top of pre-trained CNN features to improve image classification. Images are represented as strings of CNN features. Similarities between such representations are computed using two new edit distance variants adapted to the image classification domain. Our algorithms have been implemented and tested on several challenging datasets: 15Scenes, Caltech101, Pascal VOC 2007 and MIT Indoor. The results show that our idea of using structural string representations and distances clearly improves the classification performance over standard approaches based on CNN features and SVM with a linear kernel, as well as over other recognized methods from the literature.
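To illustrate the core idea of comparing images represented as strings of feature vectors, here is a minimal sketch of an edit distance between two sequences of CNN feature vectors. This is a plain Levenshtein-style dynamic program with a cosine substitution cost; it is an illustrative assumption, not the paper's two specific edit distance variants, and the function name and costs are hypothetical.

```python
import numpy as np

def feature_edit_distance(A, B, ins_del_cost=1.0):
    """Edit distance between two sequences of feature vectors.

    Substitution cost is the cosine distance between vectors;
    insertion/deletion has a fixed cost. This is a generic sketch,
    not the paper's adapted variants.
    """
    def sub_cost(a, b):
        # Cosine distance: 0 for identical directions, up to 2 for opposite.
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        return 1.0 - float(np.dot(a, b)) / denom if denom else 1.0

    m, n = len(A), len(B)
    D = np.zeros((m + 1, n + 1))
    D[:, 0] = np.arange(m + 1) * ins_del_cost  # delete all of A's prefix
    D[0, :] = np.arange(n + 1) * ins_del_cost  # insert all of B's prefix
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            D[i, j] = min(
                D[i - 1, j] + ins_del_cost,                       # deletion
                D[i, j - 1] + ins_del_cost,                       # insertion
                D[i - 1, j - 1] + sub_cost(A[i - 1], B[j - 1]),   # substitution
            )
    return D[m, n]
```

In this setting, each element of the sequence would be a local CNN feature extracted from an image region, so the distance accounts for changes in the spatial order of visual elements rather than only their global statistics.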

Cited literature: 32 references

https://hal-ujm.archives-ouvertes.fr/ujm-01274675
Contributor: Christophe Ducottet
Submitted on: Tuesday, February 16, 2016 - 10:10:14 AM
Last modification on: Thursday, July 26, 2018 - 1:10:44 AM
Long-term archiving on: Tuesday, May 17, 2016 - 10:05:21 AM

File

Barat2016-String-preprint.pdf
Files produced by the author(s)

Citation

Cécile Barat, Christophe Ducottet. String representations and distances in deep Convolutional Neural Networks for image classification. Pattern Recognition, Elsevier, 2016, 54, pp.104-115. ⟨10.1016/j.patcog.2016.01.007⟩. ⟨ujm-01274675⟩

Metrics

Record views: 182
File downloads: 1406