Correction of Uniformly Noisy Distributions to Improve Probabilistic Grammatical Inference Algorithms

Abstract : In this paper, we aim to correct the distributions of noisy samples in order to improve the inference of probabilistic automata. Rather than permanently removing corrupted examples before the learning process, we propose a technique, based on statistical estimates and linear regression, for correcting the probabilistic prefix tree automaton (PPTA). It requires human expertise to correct only a small sample of the data, selected in order to estimate the noise level. This statistical information allows us to automatically correct the whole PPTA and then to infer models that generalize better. After a theoretical analysis of the impact of noise, we present a large experimental study on several datasets.
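
For readers unfamiliar with the structure named in the abstract, the following is a minimal sketch of how a probabilistic prefix tree automaton can be built from a sample of strings, assuming the standard frequency-based definition. It does not reproduce the correction procedure of the paper, which the abstract does not detail; the function names (`build_ppta`, `string_prob`) and the toy sample are hypothetical and chosen only for illustration.

```python
def build_ppta(sample):
    """Build a frequency prefix tree from an iterable of strings.

    Each state is identified by a prefix (the empty string is the root).
    Returns two dicts: how many strings pass through each prefix, and
    how many strings end exactly at each prefix.
    """
    passes = {}
    ends = {}
    for word in sample:
        # Every prefix of the word, including "" and the word itself.
        for i in range(len(word) + 1):
            prefix = word[:i]
            passes[prefix] = passes.get(prefix, 0) + 1
        ends[word] = ends.get(word, 0) + 1
    return passes, ends


def string_prob(passes, ends, word):
    """Probability that the PPTA assigns to `word`.

    Product of the transition probabilities along the path, times the
    probability of stopping in the final state. Unseen strings get 0.
    """
    if passes.get("", 0) == 0:
        return 0.0
    p = 1.0
    for i, symbol in enumerate(word):
        prefix = word[:i]
        nxt = passes.get(prefix + symbol, 0)
        if nxt == 0:
            return 0.0
        p *= nxt / passes[prefix]  # transition probability
    return p * ends.get(word, 0) / passes[word]  # stopping probability


# Hypothetical toy sample, for illustration only.
sample = ["ab", "ab", "a", "b"]
passes, ends = build_ppta(sample)
```

Under this definition the PPTA assigns each string its relative frequency in the sample, so noise in the sample translates directly into distorted transition estimates, which is the quantity the abstract proposes to correct.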
Document type : Conference paper

https://hal-ujm.archives-ouvertes.fr/ujm-00378062
Contributor : Marc Bernard
Submitted on : Thursday, April 23, 2009 - 2:54:45 PM
Last modification on : Wednesday, July 25, 2018 - 2:05:31 PM
Long-term archiving on : Thursday, June 10, 2010 - 9:41:01 PM

File

hbs_flairs05_draft.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : ujm-00378062, version 1

Citation

Amaury Habrard, Marc Bernard, Marc Sebban. Correction of Uniformly Noisy Distributions to Improve Probabilistic Grammatical Inference Algorithms. 18th International Florida Artificial Intelligence Research Society conference, May 2005, United States. pp.493-498. ⟨ujm-00378062⟩