Feature Scaling as a Determinant of Machine Learning Performance: An Empirical Evaluation of Standardization, Min-Max Normalization, and Robust Scaling on Biomedical Classification Tasks

Authors:

P. Srivyshnavi, Peddapalegani Palavardhan, Satish Kannuru

Page No: 285 - 297

Abstract:

Data preprocessing constitutes an often-underestimated but foundational stage in the machine learning pipeline, exerting influence on both the convergence behaviour of optimisation algorithms and the representational fidelity of learned decision boundaries. Among preprocessing operations, feature scaling — the transformation of input variable magnitudes to a common numerical range or distributional form — is particularly consequential for algorithms that rely on Euclidean distance computations or gradient-based parameter updates. This study presents a systematic empirical evaluation of three widely deployed scaling techniques — Z-score standardisation, Min-Max normalisation, and Robust Scaling — and their differential impact on three classifiers of distinct computational architectures: K-Nearest Neighbours (KNN), Logistic Regression, and Support Vector Machines with a Radial Basis Function kernel. Experiments were conducted on the Breast Cancer Wisconsin (Diagnostic) Dataset, a benchmark biomedical classification problem comprising 569 patient records and 30 continuous numerical features characterised by pronounced inter-feature magnitude disparity, non-Gaussian distributional profiles, and appreciable outlier contamination. Four evaluation metrics — accuracy, precision, recall, and F1-score — were computed on a held-out test partition, supplemented by five-fold stratified cross-validation to assess generalisation stability. The principal finding is that feature scaling exerts qualitatively different effects across classifier architectures. For the SVM classifier, the absence of scaling produced a degenerate model that classified virtually all instances as malignant (F1 = 0.547; precision = 0.377), while the application of Z-score standardisation recovered full classifier functionality (F1 = 0.971), a differential of 0.424 F1 points attributable solely to one preprocessing decision. 
KNN exhibited a 4.4 percentage-point accuracy gain under standardisation relative to the unscaled baseline, accompanied by a halving of cross-validation variance. Logistic regression, while theoretically scale-invariant in its decision boundary when unregularised, benefited from substantially improved convergence stability. Across all three classifiers, Z-score standardisation yielded the highest performance, followed by Robust Scaling and then Min-Max normalisation. The relative underperformance of Min-Max is attributed to the dataset's high outlier prevalence: because the transformation is anchored to each feature's minimum and maximum, a few extreme values compress the remaining observations into a narrow sub-interval. These findings have direct implications for preprocessing protocol design in clinical machine learning and biomedical informatics applications.
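The comparison described in the abstract can be sketched in a few lines of scikit-learn. This is a minimal illustration under stated assumptions, not the authors' code. The abstract does not give the split ratio, the random seed, the classifier hyperparameters, or whether cross-validation was run on the full dataset or only the training partition. The 80/20 split, `random_state=42`, library-default hyperparameters, and cross-validation on the training partition are therefore all assumptions. Malignant is treated as the positive class (label 0 in scikit-learn's copy of the dataset), which is also an assumption. The numbers produced need not match those reported; in particular, scikit-learn's default `gamma="scale"` partly compensates for unscaled inputs in the RBF SVM.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, RobustScaler, StandardScaler
from sklearn.svm import SVC

# Breast Cancer Wisconsin (Diagnostic): 569 records, 30 continuous features.
# In scikit-learn's encoding, 0 = malignant and 1 = benign.
X, y = load_breast_cancer(return_X_y=True)

# Assumed 80/20 stratified hold-out split (the ratio is not stated in the abstract).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Scaler classes; None is the unscaled baseline.
scalers = {
    "none": None,
    "zscore": StandardScaler,
    "minmax": MinMaxScaler,
    "robust": RobustScaler,
}

# Factories so every run gets a fresh estimator (default hyperparameters assumed).
classifiers = {
    "knn": lambda: KNeighborsClassifier(),
    "logreg": lambda: LogisticRegression(max_iter=10000),
    "svm_rbf": lambda: SVC(kernel="rbf"),
}


def build_pipeline(scaler_cls, make_clf):
    """Chain an optional scaler with a classifier.

    Keeping the scaler inside the pipeline means it is fitted on training
    folds only, so no test-set statistics leak into preprocessing.
    """
    steps = [] if scaler_cls is None else [scaler_cls()]
    steps.append(make_clf())
    return make_pipeline(*steps)


cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
results = {}

for scaler_name, scaler_cls in scalers.items():
    for clf_name, make_clf in classifiers.items():
        model = build_pipeline(scaler_cls, make_clf)

        # Five-fold stratified CV on the training partition (assumption).
        cv_scores = cross_val_score(model, X_train, y_train, cv=cv, scoring="accuracy")

        # Fit on the full training partition, evaluate on the held-out test set.
        model.fit(X_train, y_train)
        y_pred = model.predict(X_test)
        precision, recall, f1, _ = precision_recall_fscore_support(
            y_test, y_pred, pos_label=0, average="binary", zero_division=0
        )
        results[(scaler_name, clf_name)] = {
            "accuracy": accuracy_score(y_test, y_pred),
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "cv_mean": cv_scores.mean(),
            "cv_std": cv_scores.std(),
        }

for (scaler_name, clf_name), m in sorted(results.items()):
    print(
        f"{clf_name:8s} {scaler_name:7s} "
        f"acc={m['accuracy']:.3f} prec={m['precision']:.3f} "
        f"rec={m['recall']:.3f} f1={m['f1']:.3f} "
        f"cv={m['cv_mean']:.3f}+/-{m['cv_std']:.3f}"
    )
```

Wrapping each scaler and classifier in a single pipeline is a deliberate choice: the scaling parameters (mean and standard deviation, minimum and maximum, or median and interquartile range) are re-estimated within every cross-validation fold rather than once on the whole dataset.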

Volume & Issue

Volume 15, Issue 3
