MCut: A Thresholding Strategy for Multi-label Classification

Abstract: Multi-label classification is a frequent task in pattern recognition, data mining and machine learning. When binary classifiers are not suited, an alternative consists in using a multiclass classifier that provides a score per category for each document, and then applying a thresholding strategy to select the set of categories to assign to the document. The common thresholding strategies, such as the RCut, PCut and SCut methods, need a training step to determine the value of the threshold. To overcome this limitation, we propose in this article a new strategy, called MCut, which automatically estimates a value for the threshold. This method is simple to implement, does not have to be trained and does not need any parametrization. Experiments performed on two textual corpora, the XML Mining 2009 and RCV1 collections, show that the MCut strategy obtains good results compared to those provided by the usual thresholding strategies.
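The abstract describes the general pipeline (one score per category, then a threshold that decides which categories are kept) but does not spell out how MCut derives its threshold. The Python sketch below is only an illustration under stated assumptions: `rcut` shows a rank-based cut whose parameter `k` has to be tuned, and `largest_gap_threshold` assumes that a training-free threshold can be taken as the midpoint of the widest gap between consecutive sorted scores. All function names and the example scores are hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def rcut(scores: Dict[str, float], k: int) -> List[str]:
    """Rank-based cut: keep the k best-scored categories (k must be tuned)."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]


def largest_gap_threshold(scores: Dict[str, float]) -> float:
    """Assumed rule: midpoint of the widest gap between consecutive
    scores once they are sorted in decreasing order."""
    values = sorted(scores.values(), reverse=True)
    if len(values) < 2:
        raise ValueError("need at least two category scores")
    gaps = [values[i] - values[i + 1] for i in range(len(values) - 1)]
    i = gaps.index(max(gaps))  # first widest gap if there are ties
    return (values[i] + values[i + 1]) / 2.0


def select_categories(scores: Dict[str, float], threshold: float) -> List[str]:
    """Keep every category whose score is strictly above the threshold."""
    return [category for category, s in scores.items() if s > threshold]


# Hypothetical scores produced by a multiclass classifier for one document.
scores = {"sports": 0.9, "politics": 0.8, "economy": 0.3, "science": 0.1}
threshold = largest_gap_threshold(scores)
print(select_categories(scores, threshold))
```

With these example scores the widest gap lies between 0.8 and 0.3, so the two top categories are kept without any trained parameter, whereas `rcut` would need `k` to be chosen beforehand.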
Document type:
Conference paper
Eleventh International Symposium on Intelligent Data Analysis (IDA 2012), Oct 2012, Helsinki, Finland. pp.173-184, 2012

https://hal-ujm.archives-ouvertes.fr/ujm-00730656
Contributor: Christine Largeron
Submitted on: Monday, 10 September 2012 - 17:33:27
Last modified on: Wednesday, 25 July 2018 - 14:05:30

Identifiers

  • HAL Id : ujm-00730656, version 1

Citation

Christine Largeron, Mathias Géry, Christophe Moulin. MCut: A Thresholding Strategy for Multi-label Classification. Eleventh International Symposium on Intelligent Data Analysis (IDA 2012), Oct 2012, Helsinki, Finland. pp.173-184, 2012. 〈ujm-00730656〉
