Learning Balls of Strings from Edit Corrections - Université Jean-Monnet-Saint-Étienne Accéder directement au contenu
Article Dans Une Revue Journal of Machine Learning Research Année : 2008

Learning Balls of Strings from Edit Corrections

Résumé

When facing the question of learning languages in realistic settings, one has to tackle several problems that do not admit simple solutions. On the one hand, languages are usually defined by complex grammatical mechanisms for which the learning results are predominantly negative, as the few algorithms are not really able to cope with noise. On the other hand, the learning settings themselves rely either on too simple information (text) or on unattainable one (query systems that do not exist in practice, nor can be simulated). We consider simple but sound classes of languages defined via the widely used edit distance: the balls of strings. We propose to learn them with the help of a new sort of queries, called the correction queries: when a string is submitted to the Oracle, either she accepts it if it belongs to the target language, or she proposes a correction, that is, a string of the language close to the query with respect to the edit distance. We show that even if the good balls are not learnable in Angluin's MAT model, they can be learned from a polynomial number of correction queries. Moreover, experimental evidence simulating a human Expert shows that this algorithm is resistant to approximate answers.
Fichier non déposé

Dates et versions

ujm-00314590 , version 1 (27-08-2008)

Identifiants

  • HAL Id : ujm-00314590 , version 1

Citer

Leonor Becerra Bonache, Colin de La Higuera, Jean-Christophe Janodet, Frédéric Tantini. Learning Balls of Strings from Edit Corrections. Journal of Machine Learning Research, 2008, 9, pp.1823-1852. ⟨ujm-00314590⟩
301 Consultations
0 Téléchargements

Partager

Gmail Facebook X LinkedIn More