A. Paz, Introduction to Probabilistic Automata, 1971.

L. Rabiner, A tutorial on hidden Markov models and selected applications in speech recognition, Proceedings of the IEEE, pp.257-286, 1989.

F. Jelinek, Statistical Methods for Speech Recognition, 1998.

R. Carrasco and J. Oncina, Learning stochastic regular grammars by means of a state merging method, Lecture Notes in Computer Science, issue.862, pp.139-150, 1994.

L. Saul and F. Pereira, Aggregate and mixed-order Markov models for statistical language processing, Proceedings of the Second Conference on Empirical Methods in Natural Language Processing

H. Ney, S. Martin, F. Wessel, S. Young, and G. Bloothooft, Corpus-Based Statistical Methods in Speech and Language Processing, ch. Statistical Language Modeling Using Leaving-One-Out, pp.174-207, 1997.

D. Ron, Y. Singer, and N. Tishby, Learning probabilistic automata with variable memory length, Proceedings of the seventh annual conference on Computational learning theory, COLT '94, pp.35-46, 1994.
DOI : 10.1145/180139.181006

M. Mohri, Finite-state transducers in language and speech processing, Computational Linguistics, vol.23, issue.3, pp.269-311, 1997.

K. S. Fu, Syntactic pattern recognition and applications, 1982.
DOI : 10.1007/978-3-642-66438-0

L. Miclet, Structural Methods in Pattern Recognition, 1987.

S. Lucas, E. Vidal, A. Amari, S. Hanlon, and J. C., A comparison of syntactic and statistical techniques for off-line OCR, Lecture Notes in Computer Science, issue.862, pp.168-179, 1994.

D. Ron, Y. Singer, and N. Tishby, On the learnability and usage of acyclic probabilistic finite automata, Proceedings of COLT 1995, pp.31-40, 1995.

H. Ney, Stochastic Grammars and Pattern Recognition, Proceedings of the NATO Advanced Study Institute, pp.313-344, 1992.
DOI : 10.1007/978-3-642-76626-8_34

N. Abe and H. Mamitsuka, Predicting protein secondary structure using stochastic tree grammars, Machine Learning, pp.275-301, 1997.

Y. Sakakibara, M. Brown, R. Hughey, I. Mian, K. Sjolander et al., Stochastic context-free grammars for tRNA modeling, Nucleic Acids Research, vol.22, issue.23, pp.5112-5120, 1994.
DOI : 10.1093/nar/22.23.5112

R. B. Lyngsø, C. N. Pedersen, and H. Nielsen, Metrics and similarity measures for hidden Markov models, Proceedings of ISMB'99, 1999.

R. B. Lyngsø and C. N. Pedersen, Complexity of Comparing Hidden Markov Models, Proceedings of ISAAC '01, 2001.

P. Cruz and E. Vidal, Learning regular grammars to model musical style: Comparing different coding schemes, Lecture Notes in Computer Science, issue.1433, pp.211-222, 1998.

M. G. Thomason, Regular stochastic syntax-directed translations, 1976.

M. Mohri, F. Pereira, and M. Riley, The design principles of a weighted finite-state transducer library, Theoretical Computer Science, vol.231, issue.1, pp.17-32, 2000.
DOI : 10.1016/S0304-3975(99)00014-6

H. Alshawi, S. Bangalore, and S. Douglas, Learning Dependency Translation Models as Collections of Finite-State Head Transducers, Computational Linguistics, vol.26, issue.1, 2000.

J. C. Amengual, J. M. Benedí, F. Casacuberta, A. Castaño, A. Castellanos et al., The EUTRANS-I speech translation system, Machine Translation, vol.15, issue.1/2, pp.75-103, 2000.
DOI : 10.1023/A:1011116115948

S. Bangalore and G. Riccardi, Stochastic finite-state models for spoken language machine translation, Proceedings of the Workshop on Embedded Machine Translation Systems, pp.52-59, 2000.
DOI : 10.3115/1117586.1117594

F. Casacuberta, H. Ney, F. J. Och, E. Vidal, J. M. Vilar et al., Some approaches to statistical and finite-state speech-to-speech translation, Computer Speech and Language, 2003.
DOI : 10.1016/S0885-2308(03)00028-7

L. Bréhélin, O. Gascuel, and G. Caraux, Hidden Markov models with patterns to learn Boolean vector sequences and applications to the built-in self-test for integrated circuits, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.23, issue.9, pp.997-1008, 2001.
DOI : 10.1109/34.955112

Y. Bengio, V. Lauzon, and R. Ducharme, Experiments on the application of IOHMMs to model financial returns series, IEEE Transactions on Neural Networks, vol.12, issue.1, pp.113-123, 2001.
DOI : 10.1109/72.896800

K. S. Fu, Syntactic Methods in Pattern Recognition, 1974.

J. J. Paradaens, Eine allgemeine Definition von stochastischen Automaten, Computing, vol.1, issue.2, pp.93-105, 1974.
DOI : 10.1007/BF02246610

K. S. Fu and T. L. Booth, Grammatical inference: Introduction and survey. Parts I and II, IEEE Transactions on Systems, Man, and Cybernetics, vol.5, pp.59-72, 1975.

C. S. Wetherell, Probabilistic languages: A review and some open questions, Computing Surveys.

F. Casacuberta, Some relations among stochastic finite state networks used in automatic speech recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.12, issue.7, pp.691-695, 1990.

D. Angluin, Identifying languages from stochastic examples, 1988.

M. Kearns and L. Valiant, Cryptographic limitations on learning boolean formulae and finite automata, 21st ACM Symposium on Theory of Computing, pp.433-444, 1989.
DOI : 10.1007/3-540-56483-7_21

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.53.5438

M. Kearns, Y. Mansour, D. Ron, R. Rubinfeld, R. E. Schapire et al., On the learnability of discrete distributions, Proceedings of the twenty-sixth annual ACM symposium on Theory of computing, STOC '94, pp.273-282, 1994.
DOI : 10.1145/195058.195155

M. Kearns and U. Vazirani, An Introduction to Computational Learning Theory, 1994.

N. Abe and M. Warmuth, On the computational complexity of approximating distributions by probabilistic automata, Proceedings of the Third Workshop on Computational Learning Theory, pp.52-66, 1998.
DOI : 10.1007/BF00992677

P. Dupont, F. Denis, and Y. Esposito, Links between probabilistic automata and hidden Markov models: probability distributions, learning models and induction algorithms, Pattern Recognition, vol.38, issue.9, 2004.
DOI : 10.1016/j.patcog.2004.03.020

E. Vidal, F. Thollard, C. de la Higuera, F. Casacuberta, and R. C. Carrasco, Probabilistic finite state automata - part II, Special Issue-Syntactic and Structural Pattern Recognition, 2004.
DOI : 10.1109/tpami.2005.148

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.1.4400

M. O. Rabin, Probabilistic automata, Information and Control, vol.6, issue.3, pp.230-245, 1963.
DOI : 10.1016/S0019-9958(63)90290-0

URL : http://doi.org/10.1016/s0019-9958(63)90290-0

G. D. Forney, The Viterbi algorithm, Proceedings of the IEEE, pp.268-278, 1973.
DOI : 10.1109/PROC.1973.9030

F. Casacuberta and C. de la Higuera, Computational Complexity of Problems on Probabilistic Grammars and Transducers, Lecture Notes in Computer Science, vol.1891, pp.15-24, 2000.
DOI : 10.1007/978-3-540-45257-7_2

R. C. Carrasco, Accurate computation of the relative entropy between stochastic regular grammars, RAIRO (Theoretical Informatics and Applications), pp.437-444, 1997.
DOI : 10.1051/ita/1997310504371

W. Tzeng, A Polynomial-Time Algorithm for the Equivalence of Probabilistic Automata, SIAM Journal on Computing, vol.21, issue.2, pp.216-227, 1992.
DOI : 10.1137/0221017

A. Fred, Computation of Substring Probabilities in Stochastic Grammars, Lecture Notes in Computer Science, vol.1891, pp.103-114, 2000.
DOI : 10.1007/978-3-540-45257-7_9

M. Young-Lai and F. W. Tompa, Stochastic grammatical inference of text database structure, Machine Learning, pp.111-137, 2000.

D. Ron and R. Rubinfeld, Learning fallible Deterministic Finite Automata, Machine Learning, pp.149-185, 1995.
DOI : 10.1007/BF00993409

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.25.6791

C. Cook and A. Rosenfeld, Some Experiments in Grammatical Inference, NATO ASI on Computer Oriented Learning Processes, pp.157-171, 1974.
DOI : 10.1007/978-94-010-1545-5_6

K. Knill and S. Young, Corpus-Based Statistical Methods in Speech and Language Processing, ch. Hidden Markov Models in Speech and Language Processing, pp.27-68, 1997.

N. Merhav and Y. Ephraim, Hidden Markov modeling using a dominant state sequence with application to speech recognition, Computer Speech & Language, vol.5, issue.4, pp.327-339, 1991.
DOI : 10.1016/0885-2308(91)90002-8

R. G. Gallager, Discrete Stochastic Processes, Journal of the Operational Research Society, vol.48, issue.1, 1996.
DOI : 10.1057/palgrave.jors.2600329

V. C. Blondel, Undecidable Problems for Probabilistic Automata of Fixed Dimension, Theory of Computing Systems, pp.231-245, 2003.
DOI : 10.1007/s00224-003-1061-2

M. A. Harrison, Introduction to Formal Language Theory, 1978.

C. de la Higuera, Characteristic sets for polynomial grammatical inference, Machine Learning, pp.125-138, 1997.

R. Carrasco and J. Oncina, Learning deterministic regular grammars from stochastic samples in polynomial time, RAIRO (Theoretical Informatics and Applications), pp.1-20, 1999.
DOI : 10.1051/ita:1999102

C. de la Higuera, Why ε-transitions are not necessary in probabilistic finite automata, EURISE, 2003.

T. Cover and J. Thomas, Elements of Information Theory, 1991.

J. Goodman, A bit of progress in language modeling, Computer Speech & Language, vol.15, issue.4, 2001.
DOI : 10.1006/csla.2001.0174

R. Kneser and H. Ney, Improved clustering techniques for class-based language modelling, European Conference on Speech Communication and Technology, pp.973-976, 1993.

P. Brown, V. Della Pietra, P. deSouza, J. Lai, and R. Mercer, Class-based N-gram models of natural language, Computational Linguistics, vol.18, issue.4, pp.467-479, 1992.

A. de Oliveira, Grammatical Inference: Algorithms and Applications, ICGI '00, Lecture Notes in Computer Science, vol.1891, 2000.