L. Denoyer and P. Gallinari, Report on the XML mining track at INEX 2007: categorization and clustering of XML documents, ACM SIGIR Forum, vol.42, issue.1, pp.22-28, 2008.
DOI : 10.1145/1394251.1394255

URL : https://hal.archives-ouvertes.fr/hal-01172481

G. Forman, An extensive empirical study of feature selection metrics for text classification, Journal of Machine Learning Research, vol.3, pp.1289-1305, 2003.

T. Joachims, Text categorization with Support Vector Machines: Learning with many relevant features, European Conference on Machine Learning ECML98, pp.137-142, 1998.
DOI : 10.1007/BFb0026683

M. F. Porter, An algorithm for suffix stripping, Readings in Information Retrieval, pp.313-316, 1997.

G. Salton and M. J. McGill, Introduction to modern information retrieval, 1983.

C. E. Shannon, A Mathematical Theory of Communication, Bell System Technical Journal, vol.27, issue.3, pp.379-423, 1948.
DOI : 10.1002/j.1538-7305.1948.tb01338.x

V. Vapnik, The Nature of Statistical Learning Theory, 1995.

Y. Yang and J. Pedersen, A comparative study on feature selection in text categorization, Int. Conference on Machine Learning ICML97, pp.412-420, 1997.