Mixed-Integer Constrained Programming for Binary Classification, Feature Selection and Imbalanced Data
Abstract
Imbalanced binary classification is common in high-stakes settings where rare events carry practical consequences. Traditional cost-sensitive strategies improve minority-class identification, but they frequently treat feature selection and classification as separate steps, which leads to suboptimal results. This paper presents MP-CSFS (Mathematical Programming for Cost-Sensitive Feature Selection), a mixed-integer linear programming framework that integrates cost-sensitive classification and feature selection into a single optimization model. The approach uses asymmetric misclassification costs and applies binary activation variables to control which features are selected. This joint optimization ensures that the selected features improve separation between the classes under the cost constraints, improving both interpretability and prediction stability. The proposed method is evaluated on numerous large-scale, high-dimensional benchmarks with high class imbalance. In the experiments, it performs on par with or better than existing approaches, notably on metrics such as AUC and G-mean, while significantly reducing dimensionality and improving computational tractability. The results show that combining feature selection with cost-sensitive learning inside a mathematical programming framework provides an intuitive and scalable solution to imbalanced classification in difficult, real-world scenarios.
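To make the idea of asymmetric costs combined with binary activation variables concrete, the following is a generic big-M sketch of a cost-sensitive feature selection MILP. It is not taken from the paper and may differ from the actual MP-CSFS model. All symbols are assumptions introduced here for illustration: x_i and y_i ∈ {−1, +1} are the samples and labels, w and b are the linear classifier weights and bias, ξ_i are slack variables, C_+ and C_- are the per-class misclassification costs, z_j are the binary feature activation variables, M is a sufficiently large constant, and B is a hypothetical budget on the number of selected features.

```latex
\begin{align}
\min_{w,\,b,\,\xi,\,z}\quad & C_{+}\sum_{i:\,y_i=+1}\xi_i \;+\; C_{-}\sum_{i:\,y_i=-1}\xi_i \\
\text{s.t.}\quad & y_i\,(w^{\top}x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, && i = 1,\dots,n, \\
& -M z_j \le w_j \le M z_j, \quad z_j \in \{0,1\}, && j = 1,\dots,d, \\
& \sum_{j=1}^{d} z_j \le B .
\end{align}
```

In this sketch, setting z_j = 0 forces w_j = 0, so feature j is switched off. If the minority class is labeled +1, choosing C_+ > C_- penalizes errors on minority samples more heavily. The objective and all constraints are linear, and only the z_j are integer, so the problem is a mixed-integer linear program.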
Citation Information
@article{redouanehakimi2026,
title={Mixed-Integer Constrained Programming for Binary Classification, Feature Selection and Imbalanced Data},
author={Redouane Hakimi and Badreddine Benyacoub and Mohamed Ouzineb},
journal={Knowledge and Information Systems},
year={2026},
doi={10.21203/rs.3.rs-9199348/v1}
}
SinoXiv