Learning Balls of Strings from Edit Corrections - Archive ouverte HAL Access content directly
Journal Articles Journal of Machine Learning Research Year : 2008

Learning Balls of Strings from Edit Corrections

(1) , (2) , (2) , (2)
1
2

Abstract

When facing the question of learning languages in realistic settings, one has to tackle several problems that do not admit simple solutions. On the one hand, languages are usually defined by complex grammatical mechanisms for which the learning results are predominantly negative, as the few algorithms are not really able to cope with noise. On the other hand, the learning settings themselves rely either on too simple information (text) or on unattainable one (query systems that do not exist in practice, nor can be simulated). We consider simple but sound classes of languages defined via the widely used edit distance: the balls of strings. We propose to learn them with the help of a new sort of queries, called the correction queries: when a string is submitted to the Oracle, either she accepts it if it belongs to the target language, or she proposes a correction, that is, a string of the language close to the query with respect to the edit distance. We show that even if the good balls are not learnable in Angluin's MAT model, they can be learned from a polynomial number of correction queries. Moreover, experimental evidence simulating a human Expert shows that this algorithm is resistant to approximate answers.
Not file

Dates and versions

ujm-00314590 , version 1 (27-08-2008)

Identifiers

  • HAL Id : ujm-00314590 , version 1

Cite

Leonor Becerra Bonache, Colin de La Higuera, Jean-Christophe Janodet, Frédéric Tantini. Learning Balls of Strings from Edit Corrections. Journal of Machine Learning Research, 2008, 9, pp.1823-1852. ⟨ujm-00314590⟩
296 View
0 Download

Share

Gmail Facebook Twitter LinkedIn More