Journal articles

Learning Balls of Strings from Edit Corrections

Abstract : When facing the question of learning languages in realistic settings, one has to tackle several problems that do not admit simple solutions. On the one hand, languages are usually defined by complex grammatical mechanisms for which the learning results are predominantly negative, as the few existing algorithms cannot really cope with noise. On the other hand, the learning settings themselves rely either on information that is too simple (text) or on information that is unattainable (query systems that do not exist in practice and cannot be simulated). We consider simple but sound classes of languages defined via the widely used edit distance: the balls of strings. We propose to learn them with the help of a new sort of query, the correction query: when a string is submitted to the Oracle, she either accepts it if it belongs to the target language or proposes a correction, that is, a string of the language close to the query with respect to the edit distance. We show that even though the good balls are not learnable in Angluin's MAT model, they can be learned from a polynomial number of correction queries. Moreover, experimental evidence simulating a human Expert shows that this algorithm is resistant to approximate answers.
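To make the notions of a ball of strings and a correction query concrete, here is a minimal sketch in Python. It is not the algorithm from the article: the function names, the brute-force breadth-first search, and the tie-breaking rule are all assumptions made for illustration. The sketch also assumes that a correction is a ball member at minimum edit distance from the query, whereas the abstract only requires it to be "close", and that the alphabet contains every letter of the centre.

```python
from collections import deque


def edit_distance(u: str, v: str) -> int:
    """Levenshtein distance with unit-cost insertion, deletion, substitution."""
    prev = list(range(len(v) + 1))
    for i, a in enumerate(u, start=1):
        curr = [i]
        for j, b in enumerate(v, start=1):
            curr.append(min(prev[j] + 1,               # delete a
                            curr[j - 1] + 1,           # insert b
                            prev[j - 1] + (a != b)))   # substitute or match
        prev = curr
    return prev[-1]


def neighbours(w: str, alphabet: str) -> set:
    """All strings exactly one edit operation away from w."""
    out = set()
    for i in range(len(w) + 1):
        for c in alphabet:
            out.add(w[:i] + c + w[i:])                 # insertion
        if i < len(w):
            out.add(w[:i] + w[i + 1:])                 # deletion
            for c in alphabet:
                out.add(w[:i] + c + w[i + 1:])         # substitution
    out.discard(w)
    return out


def correction_query(query: str, centre: str, radius: int, alphabet: str):
    """Hypothetical Oracle for the ball {w : edit_distance(centre, w) <= radius}.

    Returns None when the query is accepted, otherwise a ball member at
    minimum edit distance from the query (assumed tie-break: first one found).
    Brute force; only suitable for short strings and small alphabets.
    """
    def in_ball(w: str) -> bool:
        return edit_distance(centre, w) <= radius

    if in_ball(query):
        return None
    seen = {query}
    frontier = deque([query])
    while frontier:                                    # breadth-first search
        w = frontier.popleft()
        for n in sorted(neighbours(w, alphabet)):
            if n in seen:
                continue
            if in_ball(n):
                return n
            seen.add(n)
            frontier.append(n)
    return None                                        # unreachable: centre is in the ball
```

For example, `correction_query("abb", "abba", 1, "ab")` accepts the query by returning `None`, while a query outside the ball, such as `"bbbb"`, yields a string of the ball that is as close to the query as the triangle inequality allows.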
Document type : Journal articles
Contributor : Frédéric Tantini
Submitted on : Wednesday, August 27, 2008 - 4:06:42 PM
Last modification on : Saturday, June 25, 2022 - 7:25:19 PM


  • HAL Id : ujm-00314590, version 1



Leonor Becerra Bonache, Colin de La Higuera, Jean-Christophe Janodet, Frédéric Tantini. Learning Balls of Strings from Edit Corrections. Journal of Machine Learning Research, Microtome Publishing, 2008, 9, pp.1823-1852. ⟨ujm-00314590⟩


