A comprehensive representation model for handwriting dedicated to word spotting

Abstract: In this paper, we propose an original representation model for handwritten document images. Most state-of-the-art handwriting representation models rely on a single kind of property: textural properties, selected dominant features (such as stroke orientation or gradient orientation), or structural properties. To avoid the drawbacks of relying on a single aspect, we design a comprehensive model that contains both morphological and topological information about the handwriting. After interest points (starting/ending points, branch points and high-curvature points) are selected, an adapted version of the Shape Context (SC) descriptor built on these interest points is used to describe the contour of the text. To model the structural characteristics of the handwritten text, a graph is constructed from the interest points and the skeleton of the text. With this graph, loops and specific strokes in the handwriting are detected and analyzed. Based on this model, a coarse-to-fine approach to word spotting is introduced. Without segmenting the text into words, a set of regions of interest is selected by comparing textural features (orientation, projection profile, upper and lower border projection) using dynamic time warping (DTW). Afterwards, the regions of interest and the queries are represented by the proposed model. The final similarity measure is a weighted combination of the SC cost, loop difference, stroke analysis and texture comparison. The validation of the model shows the importance of combining the various properties of handwriting considered under its different aspects.
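The coarse selection step mentioned in the abstract compares sequences of textural features with DTW. As a minimal sketch only (not the authors' implementation), the following assumes each word image or candidate region has already been reduced to a sequence of per-column feature vectors. The function name `dtw_distance` and the choice of a Euclidean local cost are illustrative assumptions that do not come from the paper.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors.

    `a` and `b` are sequences of equal-dimension tuples, e.g. one tuple
    of profile values per image column (hypothetical feature layout).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Euclidean local cost between two columns (an assumption)
            d = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments
            cost[i][j] = d + min(cost[i - 1][j],      # skip a column of b
                                 cost[i][j - 1],      # skip a column of a
                                 cost[i - 1][j - 1])  # match both columns
    return cost[n][m]


# A stretched copy of the same profile aligns at zero cost.
query = [(0.0,), (1.0,), (2.0,)]
candidate = [(0.0,), (1.0,), (1.0,), (2.0,)]
print(dtw_distance(query, candidate))
```

Because DTW tolerates local stretching and compression along the writing direction, a candidate region written wider or narrower than the query can still obtain a low cost, which is what makes it suitable for a segmentation-free pre-selection of regions.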
Document type:
Conference paper
International Conference on Document Analysis and Recognition (ICDAR), Aug 2013, Washington, United States
https://hal-ujm.archives-ouvertes.fr/ujm-00870693
Contributor: Christine Largeron
Submitted on: Monday, October 7, 2013 - 19:41:06
Last modified on: Friday, November 10, 2017 - 01:20:10

Identifiers

  • HAL Id: ujm-00870693, version 1

Citation

Christine Largeron, Peng Wang, Véronique Eglin, Antony Mckenna, Christophe Garcia. A comprehensive representation model for handwriting dedicated to word spotting. International Conference on Document Analysis and Recognition (ICDAR), Aug 2013, Washington, United States. 〈ujm-00870693〉
