Abstract: In statistical language modelling, the classic model is the $n$-gram. However, this model cannot capture long-term dependencies, \emph{i.e.} dependencies spanning more than $n$ words. An alternative is the probabilistic automaton. Unfortunately, preliminary experiments show that this model is not yet competitive for language modelling, partly because it tries to model dependencies that are too long. We propose to improve the use of this model by restricting the dependency length to a more reasonable value. Experiments show a 45\% reduction in perplexity on the Wall Street Journal language modelling task.
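The paper itself gives no code; as a rough illustration of the $n$-gram baseline the abstract compares against, the sketch below trains an $n$-gram model with add-one smoothing and scores held-out text with perplexity, the measure quoted above. The function names and the smoothing choice are our assumptions for the sketch, not the authors' method.

```python
import math
from collections import defaultdict

def train_ngram(corpus, n=2):
    """Count n-grams and their (n-1)-gram contexts over a tokenised corpus."""
    counts, context_counts = defaultdict(int), defaultdict(int)
    for sentence in corpus:
        tokens = ["<s>"] * (n - 1) + sentence + ["</s>"]
        for i in range(len(tokens) - n + 1):
            ngram = tuple(tokens[i:i + n])
            counts[ngram] += 1
            context_counts[ngram[:-1]] += 1
    return counts, context_counts

def perplexity(corpus, counts, context_counts, vocab_size, n=2):
    """Per-token perplexity under add-one (Laplace) smoothing."""
    log_prob, n_tokens = 0.0, 0
    for sentence in corpus:
        tokens = ["<s>"] * (n - 1) + sentence + ["</s>"]
        for i in range(len(tokens) - n + 1):
            ngram = tuple(tokens[i:i + n])
            # Smoothed conditional probability P(w_i | w_{i-n+1}..w_{i-1}).
            p = (counts[ngram] + 1) / (context_counts[ngram[:-1]] + vocab_size)
            log_prob += math.log(p)
            n_tokens += 1
    return math.exp(-log_prob / n_tokens)

# Toy usage: a model of order n can only condition on the previous n-1
# tokens, which is exactly the dependency limit the abstract refers to.
train = [["the", "cat", "sat"], ["the", "dog", "sat"]]
counts, contexts = train_ngram(train, n=2)
vocab = {w for s in train for w in s} | {"<s>", "</s>"}
print(perplexity([["the", "cat", "sat"]], counts, contexts, len(vocab), n=2))
```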
https://hal-ujm.archives-ouvertes.fr/ujm-00322820
Contributor: Franck Thollard
Arnaud Zdziobeck, Franck Thollard. Position models and language modeling. Structural and Syntactic Pattern Recognition and Statistical Techniques in Pattern Recognition, Dec 2008, Orlando, United States. pp. 76-85. ⟨ujm-00322820⟩