A Nearest Neighbor Algorithm for Imbalanced Classification
Journal article, International Journal on Artificial Intelligence Tools, 2021

Abstract

Because accuracy-driven methods cannot address the challenging problem of learning from imbalanced data, several alternative measures have been proposed in the literature, such as the Area Under the ROC Curve (AUC), the Average Precision (AP), the F-measure, and the G-Mean. However, these measures are not smooth, convex, or separable, which makes their direct optimization hard in practice. In this paper, we tackle imbalanced learning from a nearest-neighbor (NN) classification perspective, where the minority examples typically belong to the class of interest. Based on simple geometrical ideas, we introduce an algorithm that rescales the distance between a query sample and any positive training example. This modifies the Voronoi regions and thus the decision boundaries of the NN classifier. We provide a theoretical justification for this scaling scheme, which inherently aims at reducing the False Negative rate while controlling the number of False Positives. We further formally establish a link between the proposed method and cost-sensitive learning. An extensive experimental study on many public imbalanced datasets shows that our method is very effective compared with popular nearest-neighbor algorithms, is comparable to state-of-the-art sampling methods, and even yields the best performance when combined with them.
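
The rescaling idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `scaled_nn_predict`, the parameter name `gamma`, the choice of a single multiplicative factor applied to Euclidean distances, and the toy data are all assumptions made for illustration. With `gamma` below 1, distances to positive examples shrink, so the positive Voronoi regions grow and fewer positives are missed.

```python
import math


def scaled_nn_predict(query, X_train, y_train, gamma=0.5, positive_label=1):
    """1-NN prediction with distances to positive examples scaled by gamma.

    gamma is a hypothetical scaling factor: gamma = 1.0 gives the plain
    nearest-neighbor rule, gamma < 1.0 favors the positive class.
    """
    best_label, best_dist = None, math.inf
    for x, y in zip(X_train, y_train):
        d = math.dist(query, x)  # Euclidean distance (assumed metric)
        if y == positive_label:
            d *= gamma  # rescale only distances to positive examples
        if d < best_dist:
            best_label, best_dist = y, d
    return best_label


# Toy data: one negative example at the origin, one positive at (1, 0).
X_train = [(0.0, 0.0), (1.0, 0.0)]
y_train = [0, 1]
query = (0.4, 0.0)  # slightly closer to the negative example

plain = scaled_nn_predict(query, X_train, y_train, gamma=1.0)
scaled = scaled_nn_predict(query, X_train, y_train, gamma=0.5)
```

In this toy case the plain rule assigns the query to the negative class (distance 0.4 versus 0.6), while scaling the positive distance by 0.5 moves the decision boundary toward the negative example and the query is labeled positive. How the scaling factor should be chosen to control False Positives is the subject of the paper and is not covered by this sketch.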
Main file: ijait.pdf (3.99 MB)
Origin: Files produced by the author(s)

Dates and versions

ujm-03282293, version 1 (24-09-2021)

Cite

Rémi Viola, Rémi Emonet, Amaury Habrard, Guillaume Metzler, Sébastien Riou, et al. A Nearest Neighbor Algorithm for Imbalanced Classification. International Journal on Artificial Intelligence Tools, 2021, 30, ⟨10.1142/s0218213021500135⟩. ⟨ujm-03282293⟩