Efficient Pruning of Probabilistic Automata

Abstract : Applications of probabilistic grammatical inference are limited due to time and space consuming constraints. In statistical language modeling, for example, large corpora are now available and lead to managing automata with millions of states. We propose in this article a method for pruning automata (when restricted to tree based structures) which is not only efficient (sub-quadratic) but that allows to dramatically reduce the size of the automaton with a small impact on the underlying distribution. Results are evaluated on a language modeling task.
Document type :
Conference papers
Complete list of metadatas

https://hal-ujm.archives-ouvertes.fr/ujm-00322818
Contributor : Franck Thollard <>
Submitted on : Monday, March 9, 2009 - 3:42:18 PM
Last modification on : Wednesday, July 25, 2018 - 2:05:30 PM
Long-term archiving on : Friday, June 4, 2010 - 11:35:06 AM

File

sspr08_pruning.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : ujm-00322818, version 1

Collections

Citation

Franck Thollard, Baptiste Jeudy. Efficient Pruning of Probabilistic Automata. Structural and Statistical Pattern Recognition, Dec 2008, Orlando, United States. pp.65-75. ⟨ujm-00322818⟩

Share

Metrics

Record views

230

Files downloads

126