F. Casacuberta, Statistical estimation of stochastic context-free grammars, Pattern Recognition Letters, vol.16, issue.6, pp.565-573, 1995.
DOI : 10.1016/0167-8655(95)80002-B

G. J. Mclachlan and T. Krishnan, The EM Algorithm and Extensions, 1997.

D. Picó and F. Casacuberta, Some statistical-estimation methods for stochastic finite-state transducers, Machine Learning, vol.44, issue.1/2, pp.121-141, 2001.
DOI : 10.1023/A:1010880113956

P. Dupont, F. Denis, and Y. Esposito, Links between probabilistic automata and hidden Markov models: probability distributions, learning models and induction algorithms, Pattern Recognition, vol.38, issue.9, 2004.
DOI : 10.1016/j.patcog.2004.03.020

I. H. Witten and T. C. Bell, The zero-frequency problem: estimating the probabilities of novel events in adaptive text compression, IEEE Transactions on Information Theory, vol.37, issue.4, pp.1085-1094, 1991.
DOI : 10.1109/18.87000

H. Ney, S. Martin, F. Wessel, S. Young, and G. Bloothooft, Corpus-Based Statiscal Methods in Speech and Language Processing, ch. Statistical Language Modeling Using Leaving-One-Out, pp.174-207, 1997.

P. Dupont and J. Amengual, Smoothing Probabilistic Automata: An Error-Correcting Approach, Lecture Notes in Computer Science, vol.1891, pp.51-57, 2000.
DOI : 10.1007/978-3-540-45257-7_5
URL : http://biblio.info.ucl.ac.be/2000/272756.pdf

Y. Sakakibara, M. Brown, R. Hughley, I. Mian, K. Sjolander et al., Stochastic context-free grammers for tRNA modeling, Nucleic Acids Research, vol.22, issue.23, pp.5112-5120, 1994.
DOI : 10.1093/nar/22.23.5112

T. Kammeyer and R. K. Belew, Stochastic context-free grammar induction with a genetic algorithm using local search, Foundations of Genetic Algorithms, 1996.

A. and H. Mamitsuka, Predicting protein secondary structure using stochastic tree grammars, Machine Learning, pp.275-301, 1997.

R. C. Carrasco, J. Oncina, and J. Calera-rubio, Stochastic inference of regular tree languages, Machine Learning Journal, vol.44, issue.1, pp.185-197, 2001.
DOI : 10.1007/BFb0054075

M. Kearns and L. Valiant, Cryptographic limitations on learning boolean formulae and finite automata, 21st ACM Symposium on Theory of Computing, pp.433-444, 1989.

M. Abe and . Warmuth, On the computational complexity of approximating distributions by probabilistic automata, Machine Learning, pp.205-260, 1992.
DOI : 10.1007/BF00992677

M. Kearns, Y. Mansour, D. Ron, R. Rubinfeld, R. E. Schapire et al., On the learnability of discrete distributions, Proceedings of the twenty-sixth annual ACM symposium on Theory of computing , STOC '94, pp.273-282, 1994.
DOI : 10.1145/195058.195155

D. Ron, Y. Singer, and N. Tishby, On the learnability and usage of acyclic probabilistic finite automata, Proceedings of COLT 1995, pp.31-40, 1995.

A. Stolcke and S. Omohundro, Inducing probabilistic grammars by bayesian model merging, " ser, Lecture Notes in Computer Science, issue.862, pp.106-118, 1994.
DOI : 10.1007/3-540-58473-0_141
URL : http://arxiv.org/abs/cmp-lg/9409010

F. Jelinek, Statistical Methods for Speech Recognition, 1998.

P. García and E. Vidal, Inference of k-testable languages in the strict sense and application to syntactic pattern recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.12, issue.9, pp.920-925, 1990.
DOI : 10.1109/34.57687

Y. Zalcstein, Locally testable languages, Journal of Computer and System Sciences, vol.6, issue.2, pp.151-167, 1972.
DOI : 10.1016/S0022-0000(72)80020-5

R. Mcnaughton, Algebraic decision procedures for local testability, Mathematical Systems Theory, vol.5, issue.1, pp.60-67, 1974.
DOI : 10.1007/BF01761708

E. Vidal and D. Llorens, Using knowledge to improve N-Gram language modelling through the MGGI methodology, Proceedings of ICGI '96, ser. Lecture Notes in Computer Science, p.1147, 1996.
DOI : 10.1007/BFb0033353

L. Rabiner, A tutorial on hidden Markov models and selected applications in speech recoginition, Proceedings of the IEEE, pp.257-286, 1989.

J. Picone, Continuous speech recognition using hidden Markov models, IEEE ASSP Magazine, vol.7, issue.3, pp.26-41, 1990.
DOI : 10.1109/53.54527

I. Bazzi, R. Schwartz, and J. Makhoul, An omnifont open-vocabulary OCR system for English and Arabic, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.21, issue.6, pp.495-504, 1999.
DOI : 10.1109/34.771314

A. Toselli, A. Juan, D. Keysers, J. González, I. Salvador et al., INTEGRATED HANDWRITING RECOGNITION AND INTERPRETATION USING FINITE-STATE MODELS, International Journal of Pattern Recognition and Artificial Intelligence, vol.18, issue.04, 2004.
DOI : 10.1142/S0218001404003344

F. Casacuberta, Finite-state transducers for speech-input translation, IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01., 2001.
DOI : 10.1109/ASRU.2001.1034664

F. Casacuberta, E. Vidal, and J. M. Vilar, Architectures for speech-to-speech translation using finite-state models, Proceedings of the ACL-02 workshop on Speech-to-speech translation: algorithms and systems -, pp.39-44, 2002.
DOI : 10.3115/1118656.1118662

A. Molina and F. Pla, Shallow parsing using specialized HMMs, Journal on Machine Learning Research, vol.2, pp.559-594, 2002.

H. Bunke and T. Caelli, Hidden Markov Models applications in Computer Vision, ser. Series in Machine Perception and Artificial Intelligence, World Scientific, vol.45, 2001.

R. Llobet, A. H. Toselli, J. C. Perez-cortes, and A. Juan, Computer-Aided Prostate Cancer Detection in Ultrasonographic Images, Proceedings of the 1st Iberian Conference on Pattern Recognition and Image Analysis (IbPRIA), pp.411-419, 2003.
DOI : 10.1007/978-3-540-44871-6_48

Y. Bengio, V. Lauzon, and R. Ducharme, Experiments on the application of IOHMMs to model financial returns series, IEEE Transactions on Neural Networks, vol.12, issue.1, pp.113-123, 2001.
DOI : 10.1109/72.896800

F. Casacuberta, Some relations among stochastic finite state networks used in automatic speech recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.12, issue.7, pp.691-695, 1990.
DOI : 10.1109/34.56212

J. Goodman, A bit of progress in language modeling, Computer Speech & Language, vol.15, issue.4, 2001.
DOI : 10.1006/csla.2001.0174

D. Mcallester and R. E. Schapire, On the convergence rate of Good-Turing estimators, Proc. 13th Annu. Conference on Comput. Learning Theory, pp.1-6, 2000.

M. Mohri, F. Pereira, and M. Riley, The design principles of a weighted finite-state transducer library, Theoretical Computer Science, vol.231, issue.1, pp.17-32, 2000.
DOI : 10.1016/S0304-3975(99)00014-6

R. Chaudhuri and S. Rao, Approximating grammar probabilities: solution of a conjecture, Journal of the ACM, vol.33, issue.4, pp.702-705, 1986.
DOI : 10.1145/6490.214099

C. S. Wetherell and F. Casacuberta, Probabilistic languages : A review and some open questions Computing Surveys Probabilistic estimation of stochastic regular syntax-directed translation schemes, VI Spanish Symposium on Pattern Recognition and Image Analysis, pp.201-297, 1995.

D. Picó and F. Casacuberta, A Statistical-Estimation Method for Stochastic Finite-State Transducers Based on Entropy Measures, Advances in Pattern Recognition, ser. LNCS, pp.417-426, 2000.
DOI : 10.1007/3-540-44522-6_43

E. M. Gold, Language identification in the limit, Information and Control, vol.10, issue.5, pp.447-474, 1967.
DOI : 10.1016/S0019-9958(67)91165-5

L. G. Valiant, A theory of the learnable, Communications of the ACM, vol.27, issue.11, pp.1134-1142, 1984.
DOI : 10.1145/1968.1972

L. Pitt and M. Warmuth, The minimum consistent DFA problem cannot be approximated within any polynomial, Journal of the ACM, vol.40, issue.1, pp.95-142, 1993.
DOI : 10.1145/138027.138042

F. Denis, C. Halluin, and R. Gilleron, PAC learning with simple examples, 13th Symposium on Theoretical Aspects of Computer Science, STACS'96, ser, pp.231-242, 1996.
DOI : 10.1007/3-540-60922-9_20
URL : https://hal.archives-ouvertes.fr/inria-00538883

F. Denis and R. Gilleron, PAC learning under helpful distributions, Algorithmic Learning Theory, ALT'97, 1997.
DOI : 10.1007/3-540-63577-7_40
URL : https://hal.archives-ouvertes.fr/inria-00538884

R. Parekh and V. Honavar, Learning DFA from simple examples, Workshop on Automata Induction, Grammatical Inference, and Language Acquisition, ICML-97, 1997.
DOI : 10.1007/3-540-63577-7_39
URL : http://archives.cs.iastate.edu/documents/disk0/00/00/01/52/00000152-01/TR97-07.pdf

J. J. Horning, A procedure for grammatical inference, Information Processing, vol.71, pp.519-523, 1972.

D. Angluin, Identifying languages from stochastic examples, 1988.

S. Kapur and G. Bilardi, Language learning from stochastic input, Proceedings of the fifth annual workshop on Computational learning theory , COLT '92, pp.303-310, 1992.
DOI : 10.1145/130385.130419

N. Abe and M. Warmuth, On the computational complexity of approximating distributions by probabilistic automata, Proceedings of the Third Workshop on Computational Learning Theory, pp.52-66, 1998.
DOI : 10.1007/BF00992677

R. Carrasco and J. Oncina, Learning deterministic regular grammars from stochastic samples in polynomial time, RAIRO (Theoretical Informatics and Applications), pp.1-20, 1999.
DOI : 10.1051/ita:1999102

A. Clark and F. Thollard, Pac-learnability of probabilistic deterministic finite state automata, Journal of Machine Learning Research, vol.5, pp.473-497, 2004.

C. De-la-higuera and F. Thollard, Identification in the limit with probability one of stochastic deterministic finite automata, " ser, Lecture Notes in Computer Science, vol.1891, pp.15-24, 2000.

R. Carrasco and J. Oncina, Learning stochastic regular grammars by means of a state merging method, " ser, Lecture Notes in Computer Science, issue.862, pp.139-150, 1994.

F. Thollard, P. Dupont, C. De, and . Higuera, Probabilistic dfa inference using Kullback-Leibler divergence and minimality, Proc. 17th International Conf. on Machine Learning, pp.975-982, 2000.

F. Thollard and A. Clark, Shallow Parsing Using Probabilistic Grammatical Inference, Int. Coll. on Grammatical Inference, M. v. Z. P. Adriaans, H. Fernau, vol.2484, pp.269-282, 2002.
DOI : 10.1007/3-540-45790-9_22
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.13.6906

C. Kermorvant and P. Dupont, Stochastic grammatical inference with multinomial tests, " ser. Lecture Notes in Computer Science, pp.149-160, 2002.

M. Young-lai and F. W. Tompa, Stochastic grammatical inference of text database structure, Machine Learning, pp.111-137, 2000.

P. García, E. Vidal, and F. Casacuberta, Local Languages, the Succesor Method, and a Step Towards a General Methodology for the Inference of Regular Grammars, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.9, issue.6, pp.841-845, 1987.
DOI : 10.1109/TPAMI.1987.4767991

A. Orlitsky, N. P. Santhanam, and J. Zhang, Always Good Turing: Asymptotically optimal probability estimation, 44th Annual IEEE Symposium on Foundations of Computer Science (FOCS'03), p.179, 2003.

S. Katz, Estimation of probabilities from sparse data for the language model component of a speech recognizer, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.35, issue.3, pp.400-401, 1987.
DOI : 10.1109/TASSP.1987.1165125

R. Kneser and H. Ney, Improved backing-off for M-gram language modeling, 1995 International Conference on Acoustics, Speech, and Signal Processing, pp.181-184, 1995.
DOI : 10.1109/ICASSP.1995.479394

S. F. Chen and J. Goodman, An empirical study of smoothing techniques for language modeling, Proceedings of the Thirty-Fourth Annual Meeting of the Association for Computational Linguistics, pp.310-318, 1996.

F. Thollard, Improving probabilistic grammatical inference core algorithms with post-processing techniques, Proc. 18th International Conf. on Machine Learning, pp.561-568, 2001.

D. Llorens, J. M. Vilar, and F. Casacuberta, FINITE STATE LANGUAGE MODELS SMOOTHED USING n-GRAMS, International Journal of Pattern Recognition and Artificial Intelligence, vol.16, issue.03, pp.275-289, 2002.
DOI : 10.1142/S0218001402001666

J. Amengual, A. Sanchis, E. Vidal, and J. Benedí, Language simplification through error-correcting and grammatical inference techniques, Machine Learning, pp.143-159, 2001.

P. Dupont and L. Chase, Using symbol clustering to improve probabilistic automaton inference, " ser, Lecture Notes in Computer Science, issue.1433, pp.232-243, 1998.

R. Kneser and H. Ney, Improved clustering techniques for class-based language modelling, European Conference on Speech Communication and Technology, pp.973-976, 1993.

C. Kermorvant, C. De, and . Higuera, Learning languages with help, " ser, Lecture Notes in Computer Science, vol.2484, 2002.

L. Breiman, Bagging predictors, Machine Learning, pp.123-140, 1996.
DOI : 10.1007/BF00058655

S. Bangalore and G. Riccardi, Stochastic finite-state models for spoken language machine translation, Proceedings of the Workshop on Embeded Machine Translation Systems, pp.52-59, 2000.

J. Oncina, P. García, and E. Vidal, Learning subsequential transducers for pattern recognition interpretation tasks, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.15, issue.5, pp.448-458, 1993.
DOI : 10.1109/34.211465

J. M. Vilar, Improve the learning of subsequential transducers by using alignments and dictionaries, " in Grammatical Inference: Algorithms and Applications, ser. Lecture Notes in Artificial Intelligence, pp.298-312, 2000.

F. Casacuberta, Inference of finite-state transducers by using regular grammars and morphisms, " in Grammatical Inference: Algorithms and Applications (proc. of ICGI-2000), ser. Lecture Notes in Artificial Intelligence, pp.1-14, 2000.

F. Casacuberta, H. Ney, F. J. Och, E. Vidal, J. M. Vilar et al., Some approaches to statistical and finite-state speech-to-speech translation, Computer Speech and Language, 2003.
DOI : 10.1016/S0885-2308(03)00028-7

F. Casacuberta and E. Vidal, Machine Translation with Inferred Stochastic Finite-State Transducers, Computational Linguistics, vol.23, issue.2, pp.205-225, 2004.
DOI : 10.1109/34.211465
URL : http://acl.ldc.upenn.edu/J/J04/J04-2004.pdf

M. Mohri, Finite-state transducers in language and speech processing, Computational Linguistics, vol.23, issue.3, pp.269-311, 1997.

M. Mohri, F. Pereira, and M. Riley, Weighted finite-state transducers in speech recognition, Computer Speech & Language, vol.16, issue.1, pp.69-88, 2002.
DOI : 10.1006/csla.2001.0184

H. Alshawi, S. Bangalore, and S. Douglas, Head transducer model for speech translation and their automatic acquisition from bilingual data, Machine Translation, 2000.

F. Casacuberta, C. De, and . Higuera, Computational Complexity of Problems on Probabilistic Grammars and Transducers, Lecture Notes in Computer Science, vol.1891, pp.15-24, 2000.
DOI : 10.1007/978-3-540-45257-7_2

F. Casacuberta, E. Vidal, and D. Picó, Inference of finite-state transducers from regular languages, press, 2004.
DOI : 10.1016/j.patcog.2004.03.025

E. Mäkinen, Inferring finite transducers, Journal of the Brazilian Computer Society, vol.9, issue.1, 1999.
DOI : 10.1590/S0104-65002003000200001

E. Vidal, P. García, and E. Segarra, INDUCTIVE LEARNING OF FINITE-STATE TRANSDUCERS FOR THE INTERPRETATION OF UNIDIMENSIONAL OBJECTS, Structural Pattern Analysis, pp.17-35, 1989.
DOI : 10.1142/9789814368292_0002

K. Knight and Y. , Translation with Finite-State Devices, Proceedings of the 4th. ANSTA Conference, ser. Lecture Notes in Artificial Intelligence, pp.421-437, 1998.
DOI : 10.1007/3-540-49478-2_38

J. Eisner, Parameter estimation for probabilistic finite-state transducers, Proceedings of the 40th Annual Meeting on Association for Computational Linguistics , ACL '02, 2002.
DOI : 10.3115/1073083.1073085

D. Llorens, Suavizado de autómatas y traductores finitos estocásticos, 2000.

M. Nederhoff, Practical Experiments with Regular Approximation of Context-Free Languages, Computational Linguistics, vol.26, issue.1, 2000.
DOI : 10.1145/355598.362773

M. Mohri and M. Nederhof, Robustness in Language and Speech Technology Regular Approximations of Context-Free Grammars through Transformations, pp.252-261, 2000.

K. Lari and S. Young, The estimation of stochastic context-free grammars using the Inside-Outside algorithm, Computer Speech & Language, vol.4, issue.1, pp.35-56, 1990.
DOI : 10.1016/0885-2308(90)90022-X

J. Sánchez and J. Benedí, Consistency of stochastic context-free grammars from probabilistic estimation based on growth transformations, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.19, issue.9, pp.1052-1055, 1997.
DOI : 10.1109/34.615455

J. Sánchez, J. Benedí, and F. Casacuberta, Comparison between the Inside-Outside algorithm and the Viterbi algorithm for stochastic context-free grammars, 8th Int. Workshop SSPR'96, pp.50-59, 1996.
DOI : 10.1007/3-540-61577-6_6

Y. Takada, Grammatical inference for even linear languages based on control sets, Information Processing Letters, vol.28, issue.4, pp.193-199, 1988.
DOI : 10.1016/0020-0190(88)90208-6

T. Koshiba, E. Mäkinen, and Y. Takada, Learning deterministic even linear languages from positive examples, Theoretical Computer Science, vol.185, issue.1, pp.63-79, 1997.
DOI : 10.1016/S0304-3975(97)00016-9

Y. Sakakibara, Learning context-free grammars from structural data in polynomial time, Theoretical Computer Science, vol.76, issue.2-3, pp.223-242, 1990.
DOI : 10.1016/0304-3975(90)90017-C

F. Maryanski and M. G. Thomason, Properties of stochastic syntax-directed translation schemata, International Journal of Computer & Information Sciences, vol.25, issue.9, pp.89-110, 1979.
DOI : 10.1007/BF00989665

A. Fred, Computation of Substring Probabilities in Stochastic Grammars, Lecture Notes in Computer Science, vol.1891, pp.103-114, 2000.
DOI : 10.1007/978-3-540-45257-7_9

V. Balasubramanian, Equivalence and reduction of Hidden Markov Models, Massachusetts Institute of Technology, 1993.

A. De-oliveira, Grammatical Inference: Algorithms and Applications, ICGI '00, ser, Lecture Notes in Computer Science, vol.1891, 2000.

P. Adriaans, H. Fernau, and M. Van-zaannen, Grammatical Inference: Algorithms and Applications, ICGI '00, ser, Lecture Notes in Computer Science, vol.2484, 2002.