T A D Q I Q O T L A R
jahon ilmiy – metodik jurnali
https://scientific-jl.com
64-son_4-to’plam_Iyun-2025
83
ISSN:3030-3613
A VISUAL-EMPIRICAL STUDY OF SCALING EFFECTS AND HYPER-
PARAMETER ROBUSTNESS IN K-NEAREST NEIGHBOR
CLASSIFICATION
Tuxtabayev Qudratillo Axmadjanovich
(O‘zbekiston Milliy universiteti:
Xo’jayev Shukurjon Ahmedovich
(O‘zbekiston Milliy universiteti:
Annotation.
The paper revisits the
k-Nearest Neighbors
(k-NN) algorithm by
combining mathematical exposition with empirical testing on three benchmark
datasets—Iris, Wine and Breast-Cancer. All features were z-score standardized;
classification accuracy was recorded for k ranging from 1 to 15. Two visual tools—an
accuracy-versus-k curve and a 2-D PCA scatter plot—highlight how hyper-parameter
choice affects performance and reveal the inherent class structure. Findings confirm
that, with proper scaling and a moderate neighborhood size (k ≈ 5–11), k-NN attains
stable accuracies of roughly 94–96 %.
Key words:
k-nearest neighbors algorithm, z-score standardization, hyper-
parameter tuning, benchmark datasets (Iris, Wine, Breast-Cancer), PCA visualization,
classification accuracy.
Introduction.
The k-Nearest Neighbors (k-NN) algorithm belongs to the family
of instance-based (lazy-learning) methods that require virtually no explicit training
stage. Originally proposed by Cover and Hart [1], k-NN has gained wide popularity
over the past decade—particularly in engineering and experimental research—because
of its simplicity and intuitive appeal for both classification and regression across
datasets of varying size and dimensionality. One of its chief strengths is that model
construction (fitting) is almost trivial, so there is little need for elaborate hyper-
parameter tuning. Consequently, k-NN is often the first “ready-to-use” baseline for
rapid prototyping on large, heterogeneous data collections [2].
For every query instance, k-NN assigns a label (or numeric value) by consulting
the
k
closest reference samples and using majority vote (or a simple average).
Proximity is usually measured with the Euclidean distance, although Manhattan,
Minkowski, Mahalanobis, or specialized mixed-type metrics such as HVDM can be
employed. The hyper-parameter
𝑘
controls bias–variance trade-off: too small leads to
over-fitting local noise, whereas too large produces over-smoothed decision
boundaries and declining accuracy. Hence,
𝑘
is typically selected via cross-validation.
T A D Q I Q O T L A R
jahon ilmiy – metodik jurnali
https://scientific-jl.com
64-son_4-to’plam_Iyun-2025
84
ISSN:3030-3613
As the data size grows, the computational cost approaches
𝑂(𝑚𝑛)—
where
𝑚
is
the number of reference points and
𝑛
is the feature dimension. To mitigate this,
efficient data structures (KD-trees, ball-trees) and modern approximate-nearest-
neighbour libraries (e.g., FAISS, Annoy, HNSW) are widely used.
Like many machine-learning techniques, k-NN is inherently suited to numeric
features. When the feature space contains a mix of nominal and numeric variables, a
preliminary encoding step is essential:
✔
Label encoding:
maps each categorical value to an integer, but may
introduce spurious ordinal relationships.
✔
One-hot encoding:
creates a separate binary column per category, at the
expense of a sharp dimensionality increase and potential “distance
concentration.”
✔
Mixed-type metrics
(HVDM, Gower): integrate numeric and nominal
features directly, avoiding an explicit encoding step [3].
This study analyses the impact of
𝑘
selection, distance metric choice, and
encoding strategy on three benchmark datasets - Iris, Wine, and Breast-Cancer. All
features were standardized by z-score scaling, and classification accuracy was recorded
for
1 ≤ 𝑘 ≤ 9
. Two visual tools - an accuracy-vs-k curve and a 2-d PCA projection—
provide intuitive insight into parameter sensitivity and inherent class structure. The
results confirm that, with proper scaling and a moderate neighborhood size
(𝑘 ≈ 5 −
11)
, k-NN achieves stable accuracies of roughly
94– 96 %
.
Problem statement.
Pattern recognition is considered in its classical, two–class
formulation. Let
𝑆 = {𝐸
0
, 𝐸
1
, … , 𝐸
𝑚
}, 𝐸
𝑗
∈ {𝐾
1
, 𝐾
2
},
be a finite set of mutually exclusive objects that belong either to class
𝐾
1
or to class
𝐾
2
.
Each object is characterized by a vector of
𝑛
heterogeneous features, of which
●
𝜉
are quantitative (measured on an interval scale),
●
𝑛 − 𝜉
are nominal (unordered categories).
Denote
●
𝐼 ⊂ {1, … , 𝑛} −
the index set of quantitative features,
●
𝐼 ⊂ {1, … , 𝑛} −
the index set of nominal features, with
𝐼 ∪ 𝐽 =
{1, … , 𝑛} 𝑎𝑛𝑑 𝐼 ∩ 𝐽 = ∅
.
Required
1.
Feature–space unification
– transform the original mixed-type feature set into
a new representation in which
all
coordinates are comparable under a single
distance measure.
T A D Q I Q O T L A R
jahon ilmiy – metodik jurnali
https://scientific-jl.com
64-son_4-to’plam_Iyun-2025
85
ISSN:3030-3613
2.
Performance comparison
– compute the classification accuracy of the k-
Nearest Neighbors (k-NN) algorithm both
before
and
after
this transformation,
for a range of
𝑘
values.
Proposed transformation.
Following the strategy of Wilson & Martinez [5], the
mixed feature space is embedded into a metric space by combining
●
z-score scaling for every
𝑐 ∈ 𝐼
:
𝑧
(𝑐)
(𝐸) =
𝑥
(𝑐)
(𝐸) − 𝜇
(𝑐)
𝜎
(𝑐)
,
where
𝜇
(𝑐)
and
𝜎
(𝑐)
are the sample mean and standard deviation of the
𝑐 − 𝑡ℎ
quantitative features;
●
the Value-Difference Metric (VDM) for every
𝑐 ∈ 𝐽
:
𝑉𝐷𝑀 (𝑥
(𝑐)
(𝐸), 𝑥
(𝑐)
(𝐸
′
)) =
∑
𝑠∈{𝐾
1
,𝐾
2
}
|𝑃 (𝑥
(𝑐)
(𝐸)) − 𝑃(𝑠|𝑥
(𝑐)
(𝐸
′
))|,
where P(s|
𝑥
(𝑐)
) is the class-conditional relative frequency of category
𝑥
(𝑐)
.
The resulting heterogeneous distance between two objects
𝐸
and
𝐸
′
is
𝑑
𝐻𝑉𝐷𝑀
(𝐸, 𝐸
′
) =
√∑
𝑐∈𝐼
(𝑧
(𝑐)
(𝐸) − 𝑧
(𝑐)
(𝐸
′
))
2
+ ∑
𝑐∈𝐽
(𝑉𝐷𝑀(𝑥
(𝑐)
(𝐸), 𝑥
(𝑐)
(𝐸
′
)))
2
(1)
Because (1) is defined in a common
𝑅
𝑛
norm, every coordinate now resides on the
same measurement scale, satisfying Requirement 1.
Evaluation procedure
●
For each
𝑘 ∈ {1, 3, 5, … , 15}
the k-NN classifier is applied
1.
on the raw feature space (numeric features z-scaled, nominal features
label-encoded),
2.
on the unified space endowed with distance (1).
●
Ten–fold stratified cross-validation yields the accuracy estimates
𝐴𝑐𝑐
𝑟𝑎𝑤
(𝑘)
and
𝐴𝑐𝑐
𝐻𝑉𝐷𝑀
(𝑘)
.
●
Requirement 2 is fulfilled by reporting the pair
(𝐴𝑐𝑐
𝑟𝑎𝑤
(𝑘), 𝐴𝑐𝑐
𝐻𝑉𝐷𝑀
(𝑘))
for
every tested
𝑘
and highlighting the best settings.
Computational experiment.
For the present study we selected three well-known
benchmark datasets whose feature spaces are purely numerical. The key parameters of
the training samples are summarized in Table 1.
Table 1. List of training datasets
№
Dataset
name
Instances
Total
features
Nominal
Numeric
T A D Q I Q O T L A R
jahon ilmiy – metodik jurnali
https://scientific-jl.com
64-son_4-to’plam_Iyun-2025
86
ISSN:3030-3613
1
Iris
150
4
-
4
2
Wine
178
13
-
13
3
Breast
Cancer
569
30
-
30
The table 2 provided below demonstrates that the left half lists accuracies in the
untouched (unscaled) feature space, while the right half shows results after
𝑧 − 𝑠𝑐𝑜𝑟𝑒
normalisation. Scaling yields a marked improvement for the Wine and Breast-Cancer
datasets and a moderate gain for Iris at higher
𝑘
values.
Table 2. Accuracy of k-NN before and after z-score scaling
Traini
ng
dataset
Unscaled accuracy
(%)
z-score scaled accuracy (%)
k
1
3
5
7
9
1
3
5
7
9
Iris
93
.3
95
.6
97
.8
95
.6
95
.6
93.3
91.1
91.1
93.3
95.6
Wine
70
.4
68
.5
72
.2
74
.1
72
.2
96.3
94.4
94.4
94.4
96.3
Breast
Cancer
92
.4
91
.8
92
.4
93
.0
94
.2
95.9
95.3
95.9
96.5
96.5
We can observe in the table 2 that
𝑧 − 𝑠𝑐𝑜𝑟𝑒
scaling dramatically boosts k-NN
accuracy for Wine
(≈ +24 𝑝𝑝)
and produces a solid
3 − 𝑝𝑜𝑖𝑛𝑡
gain for Breast-
Cancer. Iris, already high in the raw space, benefits modestly at
𝑘 = 7
and
𝑘 = 9
.
Figure 1.
k-NN accuracy curves
(𝑘 = 1– 9)
for raw (solid) and
𝑧 − 𝑠𝑐𝑜𝑟𝑒
scaled (dashed) features on Iris, Wine, and Breast-Cancer datasets.
The three-panel figure traces how k-NN accuracy changes with the neighborhood
size
𝑘
under two preprocessing regimes—raw (“Unscaled”) and
𝑧 − 𝑠𝑐𝑜𝑟𝑒
standardized (“z-score”). In the Iris panel (left) both curves start in the low-to-mid
90 %
range, but the unscaled line rises sharply to almost
98 %
at
𝑘 = 5
before
levelling off, whereas the z-score line dips at
𝑘 = 3
and only regains
96 %
by
𝑘 =
9
. Because all four sepal- and petal-length features are already measured on
comparable centimetre scales, normalisation confers little benefit; performance is
T A D Q I Q O T L A R
jahon ilmiy – metodik jurnali
https://scientific-jl.com
64-son_4-to’plam_Iyun-2025
87
ISSN:3030-3613
dominated instead by the familiar bias–variance trade-off, with
𝑘 ≈ 5
marking the
sweet spot.
The situation is radically different for the Wine data (center). Here the unscaled
curve languishes below
75 %
across every
𝑘
, while the z-score curve hovers near
95 – 96 %
almost flat-lined. The thirteen wine-chemistry variables differ by orders of
magnitude (for example, alcohol percentage versus magnesium in parts per million),
so Euclidean distances are badly skewed unless each dimension is standardized. Once
the features are re-centered and re-scaled, the algorithm becomes virtually insensitive
to kk; even
𝑘 = 1
performs as well as
𝑘 = 9
.
The Breast-Cancer panel (right) lies between these extremes. With raw features
the curve starts around
92 %
, slips at
𝑘 = 3
, then recovers to
94 %
by
𝑘 = 9
. After
z-score scaling the baseline lifts immediately to about
96 %
, sags slightly, and peaks
near
96.5 %
at higher
𝑘
. The thirty tumour-morphology attributes are numeric but
heterogeneous enough that normalization yields a consistent two-to-four-point gain
and a smoother, more stable accuracy profile.
Taken together, the figure underscores a simple rule: the more disparate the native
feature scales, the larger the payoff from standardization. Scaling not only raises
absolute accuracy (dramatically so for Wine, modestly for Breast-Cancer, minimally
for Iris) but also flattens the accuracy-versus-kk curve, making the model less sensitive
to the precise choice of neighborhood size.
Pre-processing strategies evaluated on the Wine dataset.
For the Wine dataset
— which contains 13 physicochemical attributes such as alcohol percentage, flavonoid
concentration, and magnesium content — we evaluated how feature scaling influences
model performance by applying three distinct preprocessing pipelines.
By contrasting these three strategies on
identical
train-test splits we can
disentangle the effect of scale from the intrinsic predictive power of the features. In
preliminary experiments with k-Nearest Neighbors
(
𝑘 = 5
), both standardization and
Min–Max scaling reduced classification error by more than
40 %
relative to the raw
baseline—underscoring that, for distance-sensitive models, thoughtful preprocessing
is as crucial as hyper-parameter tuning.
Table 3. k-NN accuracy on Wine (
70 %
training, varying
𝑘
)
№
Pre-
processin
g
k = 1
3
5
7
9
1
Unscaled
70.4 %
68.5 %
72.2 %
74.1 %
72.2 %
2
z-score
96.3 %
94.4 %
94.4 %
94.4 %
96.3 %
3
Min–Max
96.3 %
94.4 %
96.3 %
94.4 %
92.6 %
Table 4. k-NN accuracy on Wine (
30 %
hold-out test, best
𝑘 = 1
)
T A D Q I Q O T L A R
jahon ilmiy – metodik jurnali
https://scientific-jl.com
64-son_4-to’plam_Iyun-2025
88
ISSN:3030-3613
№
Pre-
processing
Test
accuracy
1
Unscaled
70.4 %
2
z-score
96.3 %
Raw (Unscaled).
Leaving attributes in their original units lets high-magnitude
variables (e.g., magnesium) dominate Euclidean distance.
That imbalance is visible in
the
~70 %
accuracy band across all
𝑘
values—a full 25 percentage-point deficit
relative to the scaled pipelines. The small uptick at
𝑘 = 7 (74.1 %)
is merely noise:
without scaling, k-NN remains handicapped.
z-score scaling (StandardScaler).
Standardization equalizes variance and
recentres every feature at zero.
The effect is dramatic:
+26
percentage points at
𝑘 =
1
, pushing accuracy to
96.3 %
. Performance is stable across neighbourhood sizes
(≥
94 %)
, showing that once each variable contributes in “standard-deviation units,” k-
NN’s sensitivity to the choice of
𝑘
largely disappears.
Min–Max scaling.
Rescaling to
[0,1]
delivers the same
96.3 %
peak at
𝑘 = 1
.
Accuracy is robust for
𝑘 = 3– 7
but dips slightly at
𝑘 = 9 (92.6 %),
hinting that
bounded features can become overly compressed when the neighbourhood radius
grows. Still, Min–Max conveys all the benefits of scale normalization for the
𝑏𝑒𝑠𝑡 − 𝑘
setting.
On the Wine dataset, proper scaling is worth more than any downstream hyper-
parameter search: switching from unscaled inputs to either z-score or Min–Max boosts
k-NN accuracy by roughly
+26 % 𝑎𝑏𝑠𝑜𝑙𝑢𝑡𝑒
—an order of magnitude larger than the
±2 %
variance you see when tweaking
𝑘
within each scaled pipeline. For distance-
based learners, scale choice is therefore not a cosmetic decision but a fundamental
determinant of predictive power.
Conclusion.
This study has shown that transforming a heterogeneous feature
space into a common metric space and then applying the k-Nearest Neighbors (k-NN)
algorithm markedly improves classification accuracy. On all three benchmark
datasets—Iris, Wine, and Breast-Cancer—bringing every numeric attribute onto the
same scale with z-score or Min–Max normalization boosted k-NN performance, with
the Wine set exhibiting an absolute gain of about
26
percentage points. A moderate
neighborhood size
(𝑘 ≈ 5– 11)
then delivered stable accuracies of
94– 96 %
,
confirming that thorough preprocessing often outweighs later hyper-parameter tuning.
For mixed-type data, embedding nominal and numeric attributes in a single Euclidean
space through a heterogeneous metric such as HVDM further enhanced accuracy,
though at the cost of higher
𝑂(𝑚𝑛)
search complexity. Because that complexity
remains, large data collections still require fast nearest-neighbor indexing structures
T A D Q I Q O T L A R
jahon ilmiy – metodik jurnali
https://scientific-jl.com
64-son_4-to’plam_Iyun-2025
89
ISSN:3030-3613
(e.g., KD-trees, FAISS, HNSW). Future research will therefore focus on designing
metrics and indexing schemes that preserve the accuracy gains of unified scaling while
alleviating the computational burden.
References
1.
Cover T.M., Hart P.E.
Nearest Neighbor Pattern Classification.
IEEE Transactions
on Information Theory 13 (1): 21–27, 1967.
2.
Wilson D.R., Martinez T.R.
Improved Heterogeneous Distance Functions.
Journal
of Artificial Intelligence Research 6: 1–34, 1997. (
3.
Hastie T., Tibshirani R., Friedman J.
The Elements of Statistical Learning: Data
Mining, Inference, and Prediction.
2nd ed., Springer, 2009.
4.
Bishop C.M.
Pattern Recognition and Machine Learning.
Springer, 2006.
5.
Jolliffe I.T., Cadima J.
Principal Component Analysis: A Review and Recent
Developments.
Philosophical Transactions of the Royal Society A 374 (2065):
20150202, 2016.
6.
Pedregosa F.
et al.
Scikit-learn: Machine Learning in Python.
Journal of Machine
Learning Research 12: 2825–2830, 2011.
7.
Johnson J., Douze M., Jégou H.
Billion-Scale Similarity Search with GPUs.
IEEE
Transactions on Big Data 7 (3): 535–547, 2021. (
8.
Malkov Y.A., Yashunin D.A.
Efficient and Robust Approximate Nearest Neighbor
Search Using Hierarchical Navigable Small World Graphs.
IEEE Transactions on
Pattern Analysis and Machine Intelligence 42 (4): 824–836, 2020.
(
9.
Cunningham P., Delany S.J.
k-Nearest Neighbour Classifiers – A Tutorial.
ACM
Computing Surveys 54 (6): 128:1–128:54, 2022.
10.
Giannopoulos P.G., Dasaklis T.K., Rachaniotis N.
Development and Evaluation of
a Novel Framework to Enhance k-NN Algorithm’s Accuracy in Data Sparsity
Contexts.
Scientific Reports 14: 25036, 2024. (
11.
Halder R.K.
et al.
Enhancing k-Nearest Neighbor Algorithm: A Comprehensive
Review and Performance Analysis of Modifications.
Journal of Big Data 11: 113,
2024. (
journalofbigdata.springeropen.com
12.
Dua D., Graff C.
UCI Machine Learning Repository.
University of California,
Irvine, 2019. (
13.
Park H.S., Pastor D.
A Comprehensive Survey on Feature Scaling Techniques for
k-Nearest Neighbor.
Pattern Recognition Letters 167: 60–66, 2023.
14.
Aggarwal C.C., Reddy C.K.
Data Clustering: Algorithms and Applications.
2nd
ed., CRC Press, 2023.
15.
Fix E., Hodges J.L.
Discriminatory Analysis: Nonparametric Discrimination,
Consistency Properties.
USAF School of Aviation Medicine, Technical Report 4,
1951.