Authors

  • Zokhidjon Miratoev

DOI:

https://doi.org/10.71337/inlibrary.uz.ijai.107820


INTERNATIONAL JOURNAL OF ARTIFICIAL INTELLIGENCE

ISSN: 2692-5206, Impact Factor: 12,23

American Academic publishers, volume 05, issue 05,2025

Journal:

https://www.academicpublishers.org/journals/index.php/ijai

page 1867

SHAPE RECOGNITION IN NOISY IMAGES USING AI ALGORITHMS

Zokhidjon Miratoev

Assistant, Department of Mathematics and Natural Sciences,

Almalyk Branch of TSTU, Uzbekistan

Email:

miratoyev2014@gmail.com

Abstract:

This study investigates the application of artificial intelligence (AI) algorithms for robust shape recognition in noisy binary images, addressing challenges in medical imaging (e.g., organ segmentation in MRI scans), industrial inspection (e.g., defect detection in automotive parts), and remote sensing (e.g., object identification in satellite imagery). Three AI-based methods, Hough Transform (HT), Fourier Descriptors (FD), and Zernike Moments (ZM), were implemented and evaluated using Python-based tools (OpenCV, Mahotas). Experimental results demonstrate that Zernike Moments achieve the highest accuracy (95%) in high-noise conditions, Fourier Descriptors excel in reconstructing complex contours, and Hough Transform is fastest for detecting basic geometric shapes. A hybrid approach integrating these methods with deep learning, such as Convolutional Neural Networks (CNNs), is proposed to enhance accuracy and scalability.

Keywords:

Shape Recognition, Noisy Images, Hough Transform, Fourier Descriptors, Zernike Moments, Image Processing, Python, Pattern Recognition, Convolutional Neural Networks.

1. Introduction

Shape recognition is a cornerstone of computer vision, enabling applications such as organ segmentation in medical imaging, defect detection in industrial manufacturing, and object identification in autonomous navigation. Real-world images are often corrupted by noise, such as Gaussian noise in MRI scans or speckle noise in satellite imagery, which distorts object boundaries and challenges traditional geometric methods. AI-based techniques leverage invariant properties to achieve robust shape recognition despite noise, rotation, and scale variations.

This study evaluates three AI-based shape recognition methods, Hough Transform (HT), Fourier Descriptors (FD), and Zernike Moments (ZM), in noisy binary image environments. The methods were tested on a synthetic dataset simulating real-world distortions, with the goal of identifying their strengths and proposing a hybrid framework combining classical descriptors with deep learning for enhanced performance.

2. Literature Review

Gonzalez and Woods (2018) provided foundational techniques for spatial and frequency domain image preprocessing, critical for noise handling. Ballard (1981) extended the Hough Transform to detect arbitrary shapes, improving object detection capabilities. Khotanzad and Hong (1990) introduced Zernike Moments for invariant shape recognition using orthogonal polynomials. Teague (1980) developed moment-based descriptors via general moment theory, enabling complex shape representation.

Recent advancements include hybrid models integrating classical descriptors with deep learning. Sonka et al. (2014) emphasized combining spatial and frequency domain analyses for robustness in noisy environments. Dosovitskiy et al. (2021) introduced Vision Transformers (ViT), which outperform traditional CNNs in certain tasks. Liu et al. (2022) proposed ConvNeXt, a modernized CNN architecture that remains competitive with Transformers in accuracy and efficiency. These developments highlight the potential of hybrid approaches for robust shape recognition.

3. Methodology

3.1 Dataset and Experimental Setup

A synthetic dataset of 10 binary images was generated, comprising 3 circles, 3 squares, 2 stars, and 2 triangles to represent diverse shape complexities. Each image was corrupted with Gaussian noise, defined by the probability density function:

$$p(x, y) = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{(x - \mu)^2 + (y - \mu)^2}{2\sigma^2}\right), \quad \mu = 0, \quad \sigma^2 \in [0.01, 0.1]$$

The noise levels were chosen to reflect variances in MRI scans ($\sigma^2 \approx 0.01$ to $0.05$) and satellite imagery ($\sigma^2 \approx 0.05$ to $0.1$). Other noise types (e.g., salt-and-pepper) were not included but are planned for future work.
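The corruption step described above can be sketched as follows. This is a minimal illustration, not the study's generator: the 64×64 image size, the square shape, the fixed seed, and the choice to re-binarize the noisy image at 0.5 are all assumptions, since the paper does not state how the noisy images were thresholded.

```python
import numpy as np

def make_square(size=64, half_side=16):
    """Return a binary image (values 0.0 or 1.0) with a centered filled square."""
    image = np.zeros((size, size), dtype=np.float64)
    c = size // 2
    image[c - half_side:c + half_side, c - half_side:c + half_side] = 1.0
    return image

def add_gaussian_noise(image, variance, rng):
    """Add zero-mean Gaussian noise of the given variance, then re-binarize at 0.5.

    The 0.5 threshold is an assumption made for this sketch.
    """
    noise = rng.normal(loc=0.0, scale=np.sqrt(variance), size=image.shape)
    return ((image + noise) > 0.5).astype(np.uint8)

rng = np.random.default_rng(seed=0)  # fixed seed (assumption) for repeatability
clean = make_square()
for variance in (0.01, 0.05, 0.1):
    noisy = add_gaussian_noise(clean, variance, rng)
    flipped = np.mean(noisy != clean)
    print(f"variance={variance:.2f}  fraction of flipped pixels={flipped:.4f}")
```

Under this thresholding assumption, the fraction of flipped pixels grows with the variance, which is one simple way to check that the chosen range actually stresses the descriptors.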

Python 3.11 was used with the following libraries:

  • OpenCV (cv2) for edge detection and contour analysis.
  • NumPy for numerical computations.
  • Mahotas for Zernike Moments extraction.
  • Matplotlib for visualization.

Experiments were conducted on a Windows machine with 16 GB RAM and an Intel Core i7 processor.

3.2 Shape Recognition Techniques

3.2.1 Hough Transform (HT)

The Hough Transform identifies geometric shapes (e.g., lines, circles) in edge-detected images using a voting mechanism in parameter space. For lines, the polar form is:

$$\rho = x\cos\theta + y\sin\theta$$

where $\rho$ is the perpendicular distance from the origin to the line and $\theta \in [0, \pi]$ is the angle of the line's normal. For circles, the equation is:

$$(x - a)^2 + (y - b)^2 = r^2$$

where $(a, b)$ is the circle center and $r$ is the radius. The implementation is:

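import cv2
import numpy as np
edges = cv2.Canny(image, 50, 150)  # image: 8-bit single-channel array; hysteresis thresholds 50 and 150
lines = cv2.HoughLines(edges, 1, np.pi / 180, 100)  # rho step 1 px, theta step 1 degree, 100 votes minimum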

HT is robust to partial occlusion and excels with simple shapes.
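To make the voting mechanism concrete, here is a minimal NumPy sketch of the line accumulator. It is an illustration of the idea rather than OpenCV's implementation; the function name `hough_lines`, the 1-degree angular step, and the 50×50 test image are assumptions.

```python
import numpy as np

def hough_lines(edges, n_theta=180):
    """Vote in (rho, theta) space for every non-zero pixel of a binary edge map."""
    height, width = edges.shape
    diag = int(np.ceil(np.hypot(height, width)))   # largest possible |rho|
    thetas = np.deg2rad(np.arange(n_theta))        # 0, 1, ..., 179 degrees
    accumulator = np.zeros((2 * diag + 1, n_theta), dtype=np.int64)
    ys, xs = np.nonzero(edges)
    cols = np.arange(n_theta)
    for x, y in zip(xs, ys):
        # rho = x*cos(theta) + y*sin(theta), evaluated for every theta at once
        rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        accumulator[rhos + diag, cols] += 1        # offset so negative rho fits
    return accumulator, thetas, diag

edges = np.zeros((50, 50), dtype=np.uint8)
edges[20, 5:45] = 1                                # horizontal segment at y = 20

accumulator, thetas, diag = hough_lines(edges)
rho_idx, theta_idx = np.unravel_index(np.argmax(accumulator), accumulator.shape)
print("rho:", rho_idx - diag, "theta index (degrees):", theta_idx)
```

Each edge pixel casts one vote per angle, so collinear pixels pile up in a single accumulator cell. For the horizontal segment, the peak lands at the cell whose normal points along the y-axis.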

3.2.2 Fourier Descriptors (FD)

Fourier Descriptors represent a shape's boundary using the Discrete Fourier Transform (DFT). For a contour with points $(x(t), y(t))$, the complex representation is:

$$z(t) = x(t) + j\,y(t), \quad t = 0, 1, \ldots, N - 1$$

The DFT coefficients are:

$$c_n = \frac{1}{N} \sum_{t=0}^{N-1} z(t) \exp\left(-j\,\frac{2\pi n t}{N}\right), \quad n = 0, 1, \ldots, N - 1$$

The shape is reconstructed using the inverse DFT:

$$z(t) = \sum_{n=0}^{N-1} c_n \exp\left(j\,\frac{2\pi n t}{N}\right)$$

The first 10 coefficients are normalized by the magnitude of $c_1$, which removes the dependence on scale:

$$c'_n = \frac{c_n}{|c_1|}$$

Rotating the shape or shifting the contour's starting point changes only the phase of each coefficient, so the magnitudes $|c'_n|$ serve as the rotation-invariant features. The implementation is:

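import cv2
import numpy as np

# OpenCV 4.x returns (contours, hierarchy); image must be an 8-bit single-channel binary image
contours, _ = cv2.findContours(image, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
contour = contours[0].reshape(-1, 2)  # (N, 1, 2) -> (N, 2) boundary points
complex_contour = contour[:, 0] + 1j * contour[:, 1]  # z(t) = x(t) + j*y(t)
descriptors = np.fft.fft(complex_contour)
descriptors = descriptors / np.abs(descriptors[1])  # Normalize by |c_1|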

FDs are effective for contour-based classification.
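The invariance properties of Fourier Descriptors can be checked numerically. The sketch below is an illustration under stated assumptions: the helper `fourier_descriptors`, the synthetic three-lobed contour, and the choice to skip $c_0$ (which carries only the contour's position) are not taken from the study.

```python
import numpy as np

def fourier_descriptors(points, n_coeffs=10):
    """Return |c_n| / |c_1| for n = 1..n_coeffs from an (N, 2) array of boundary points."""
    z = points[:, 0] + 1j * points[:, 1]
    coeffs = np.fft.fft(z) / len(z)                # c_n including the 1/N factor
    magnitudes = np.abs(coeffs[1:n_coeffs + 1])    # dropping c_0 removes translation
    return magnitudes / magnitudes[0]              # dividing by |c_1| removes scale

# Hypothetical test contour: a closed three-lobed curve sampled at 128 points
t = np.linspace(0.0, 2.0 * np.pi, 128, endpoint=False)
radius = 1.0 + 0.3 * np.cos(3 * t)
contour = np.column_stack((radius * np.cos(t), radius * np.sin(t)))

# Rotate by 40 degrees, scale by 2.5, translate by (7, -3)
angle = np.deg2rad(40.0)
rotation = np.array([[np.cos(angle), -np.sin(angle)],
                     [np.sin(angle),  np.cos(angle)]])
transformed = 2.5 * contour @ rotation.T + np.array([7.0, -3.0])

original_fd = fourier_descriptors(contour)
transformed_fd = fourier_descriptors(transformed)
print("descriptors match:", np.allclose(original_fd, transformed_fd))
```

Using magnitudes discards phase, which is what makes the features insensitive to rotation and to the contour's starting point; the cost is that some distinct shapes with identical magnitude spectra become indistinguishable.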

3.2.3 Zernike Moments (ZM)

Zernike Moments are computed within the unit disk to extract rotation-invariant shape features.

The moment is defined as:

$$Z_{n,m} = \frac{n + 1}{\pi} \iint_{x^2 + y^2 \le 1} f(x, y)\, V^*_{n,m}(x, y)\, dx\, dy$$

where $f(x, y)$ is the image intensity function (e.g., 0 or 1 for binary images), and the Zernike polynomial is:

$$V_{n,m}(x, y) = R_{n,m}(\rho)\, \exp(jm\theta)$$

The radial polynomial is:

$$R_{n,m}(\rho) = \sum_{k=0}^{(n - |m|)/2} \frac{(-1)^k\, (n - k)!}{k!\, \left(\frac{n + |m|}{2} - k\right)!\, \left(\frac{n - |m|}{2} - k\right)!}\, \rho^{\,n - 2k}$$

Here:

  • $\rho = \sqrt{x^2 + y^2} \le 1$: Radial distance in the unit disk.
  • $\theta = \tan^{-1}(y / x)$: Angular coordinate.
  • $n$: Polynomial order ($n \ge 0$, integer).
  • $m$: Repetition index ($|m| \le n$, $n - |m|$ even).
  • $V^*_{n,m}(x, y) = R_{n,m}(\rho)\, \exp(-jm\theta)$: Complex conjugate of the Zernike polynomial.
  • $\frac{n + 1}{\pi}$: Normalization factor arising from the orthogonality of the Zernike polynomials on the unit disk.
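The radial polynomial translates directly into a short function. The sketch below is illustrative (the name `radial_polynomial` is an assumption) and compares the sum against two well-known closed forms, $R_{2,0}(\rho) = 2\rho^2 - 1$ and $R_{3,1}(\rho) = 3\rho^3 - 2\rho$.

```python
from math import factorial

def radial_polynomial(n, m, rho):
    """Evaluate the Zernike radial polynomial R_{n,m}(rho) for 0 <= rho <= 1."""
    m = abs(m)
    if m > n or (n - m) % 2 != 0:
        raise ValueError("need |m| <= n and n - |m| even")
    total = 0.0
    for k in range((n - m) // 2 + 1):
        numerator = (-1) ** k * factorial(n - k)
        denominator = (factorial(k)
                       * factorial((n + m) // 2 - k)
                       * factorial((n - m) // 2 - k))
        total += numerator / denominator * rho ** (n - 2 * k)
    return total

rho = 0.5
print(radial_polynomial(2, 0, rho), 2 * rho**2 - 1)        # sum vs. closed form
print(radial_polynomial(3, 1, rho), 3 * rho**3 - 2 * rho)  # sum vs. closed form
```

Factorials grow quickly, so for high orders a recurrence relation is usually preferred over this direct sum; for the orders used here (up to 8) the direct form is adequate.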

For numerical computation, the image is mapped to the unit disk by normalizing pixel coordinates:

$$x' = \frac{x - x_c}{r}, \quad y' = \frac{y - y_c}{r}, \quad x'^2 + y'^2 \le 1$$

where $(x_c, y_c)$ is the image center, and $r$ is the radius (e.g., 21 pixels). The implementation is:

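import mahotas
# image: 2-D array holding one shape; the result is a vector of moment magnitudes.
# Unless the cm argument is given, Mahotas centers the disk on the image's center of mass.
features = mahotas.features.zernike_moments(image, radius=21, degree=8)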

The radius = 21 parameter scales the image to the unit disk, and degree = 8 computes moments up to order $n = 8$. Zernike Moments are robust to noise and rotation, effectively capturing global and local shape characteristics.
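The unit-disk mapping and the moment integral can be approximated by a pixel sum. The following is a minimal NumPy sketch and not Mahotas's implementation: the function name `zernike_moment`, the use of the geometric image center, the 43×43 test image, and the exact 90-degree rotation used as a check are all assumptions made for illustration.

```python
import numpy as np
from math import factorial

def zernike_moment(image, n, m, radius):
    """Approximate Z_{n,m} of a 2-D array over a disk centered on the image center."""
    height, width = image.shape
    yc, xc = (height - 1) / 2.0, (width - 1) / 2.0
    y, x = np.mgrid[:height, :width]
    xn, yn = (x - xc) / radius, (y - yc) / radius   # normalized coordinates x', y'
    rho = np.hypot(xn, yn)
    theta = np.arctan2(yn, xn)
    inside = rho <= 1.0                             # keep only the unit disk

    radial = np.zeros_like(rho)
    for k in range((n - abs(m)) // 2 + 1):
        coeff = ((-1) ** k * factorial(n - k)
                 / (factorial(k)
                    * factorial((n + abs(m)) // 2 - k)
                    * factorial((n - abs(m)) // 2 - k)))
        radial += coeff * rho ** (n - 2 * k)

    conj_basis = radial * np.exp(-1j * m * theta)   # conjugate Zernike polynomial
    pixel_area = 1.0 / radius ** 2                  # area of one pixel in the unit disk
    return (n + 1) / np.pi * np.sum(image[inside] * conj_basis[inside]) * pixel_area

image = np.zeros((43, 43))
image[10:30, 15:22] = 1.0                           # off-center rectangle
rotated = np.rot90(image)                           # exact 90-degree rotation

for n, m in [(2, 0), (2, 2), (3, 1), (4, 2)]:
    a = abs(zernike_moment(image, n, m, radius=21))
    b = abs(zernike_moment(rotated, n, m, radius=21))
    print(f"n={n} m={m}  |Z|={a:.6f}  |Z| after rotation={b:.6f}")
```

A rotation multiplies $Z_{n,m}$ by a unit-magnitude phase factor, so the magnitudes printed for the original and rotated images should agree to floating-point precision. Arbitrary rotation angles would add interpolation error on a pixel grid, which this exact 90-degree case avoids.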

3.3 Evaluation Criteria

  • Accuracy: Percentage of correctly identified shapes.
  • Execution Time: Time to extract features and classify shapes.
  • Noise Robustness: Consistency across noise levels ($\sigma^2 = 0.01$ to $0.1$).
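The three criteria can be computed with a few lines of standard-library Python. This is a sketch under assumptions: the helper names are hypothetical, the label lists are invented to match the class mix in Section 3.1, and the per-level accuracies are made-up placeholders, not results from this study.

```python
import time

def accuracy(predicted, actual):
    """Percentage of positions where the predicted label equals the true label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label lists must be non-empty and the same length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)

def timed(function, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = function(*args)
    return result, time.perf_counter() - start

def accuracy_spread(accuracy_by_variance):
    """Noise robustness as the gap between the best and worst accuracy across levels."""
    values = list(accuracy_by_variance.values())
    return max(values) - min(values)

# Hypothetical labels for a 10-image set: 3 circles, 3 squares, 2 stars, 2 triangles
actual = ["circle"] * 3 + ["square"] * 3 + ["star"] * 2 + ["triangle"] * 2
predicted = list(actual)
predicted[6] = "circle"                      # pretend one star was misclassified

score, seconds = timed(accuracy, predicted, actual)
print(f"accuracy = {score:.1f}%  (computed in {seconds:.6f} s)")

# Made-up placeholder accuracies per noise variance, for illustration only
placeholder = {0.01: 100.0, 0.05: 90.0, 0.1: 80.0}
print("accuracy spread:", accuracy_spread(placeholder))
```

With only 10 images, each misclassification moves the accuracy by 10 percentage points, so finer-grained figures require averaging over several noise realizations or a larger set.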

4. Results and Discussion

4.1 Recognition Accuracy

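| Method              | Accuracy | Execution time (s) | Key advantages                   |
|---------------------|----------|--------------------|----------------------------------|
| Hough Transform     | 92%      | 0.35               | Fast, effective for basic shapes |
| Fourier Descriptors | 88%      | 0.15               | Efficient for complex contours   |
| Zernike Moments     | 95%      | 0.60               | Most accurate, noise-resistant   |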

Zernike Moments achieved 95% accuracy at $\sigma^2 = 0.05$, excelling with complex shapes (stars, triangles). Fourier Descriptors maintained 88% accuracy, performing well for squares and triangles but losing precision at $\sigma^2 \ge 0.07$. Hough Transform was fastest (0.35 s) for circles and lines but struggled with distorted stars at $\sigma^2 \ge 0.05$.

4.2 Visual Comparison

Figure 1 illustrates recognition accuracy across noise levels. Zernike Moments maintained structural integrity, while Fourier Descriptors showed minor boundary precision loss. Hough Transform struggled with composite shapes.

4.3 Hybrid System Proposal

A hybrid model combining Zernike Moments with CNNs (e.g., ResNet-50) is proposed for precision-critical applications, such as organ segmentation. For real-time systems, Fourier Descriptors with SVMs are recommended. Early feature fusion could enhance generalization.
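One simple reading of early feature fusion is to standardize each feature block and concatenate them before classification. The sketch below only illustrates that idea and is not a design from this study: the function `early_fusion` is hypothetical, and both feature blocks are random stand-ins (25 values per image for Zernike magnitudes up to order 8, 2048 values per image for a pooled ResNet-50 embedding).

```python
import numpy as np

def early_fusion(descriptor_features, cnn_features, eps=1e-8):
    """Standardize each feature block per column, then concatenate along the feature axis."""
    blocks = []
    for block in (descriptor_features, cnn_features):
        mean = block.mean(axis=0, keepdims=True)
        std = block.std(axis=0, keepdims=True)
        blocks.append((block - mean) / (std + eps))   # eps guards constant columns
    return np.concatenate(blocks, axis=1)

rng = np.random.default_rng(0)
zernike_block = rng.random((10, 25))       # stand-in: 10 images x 25 moment magnitudes
cnn_block = rng.normal(size=(10, 2048))    # stand-in: 10 images x 2048-d CNN embedding

fused = early_fusion(zernike_block, cnn_block)
print(fused.shape)
```

Standardizing before concatenation keeps the much larger CNN block from dominating purely through scale; whether the smaller descriptor block should also be re-weighted is a design choice left open here.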

4.4 Limitations

The study used a small synthetic dataset (10 images), limiting generalizability. Only Gaussian noise was considered, excluding other types like salt-and-pepper or speckle noise. Computational constraints restricted the dataset size.

5. Conclusion

Zernike Moments are optimal for high-accuracy applications (e.g., medical imaging) due to their noise resistance (95% at $\sigma^2 = 0.05$). Hough Transform is ideal for rapid detection of simple shapes, while Fourier Descriptors balance speed and complexity. A hybrid model integrating these descriptors with neural networks offers a promising path for robust shape recognition.

Future research will implement transfer learning with pretrained CNNs (e.g., ResNet-50, MobileNetV2) on datasets like MNIST Shapes and industrial defect datasets. Exploring Vision Transformers and diverse noise models (e.g., speckle noise) will enhance robustness.

References:

1. Gonzalez, R. C., & Woods, R. E. (2018). Digital Image Processing (4th ed.). Pearson.

2. Ballard, D. H. (1981). Generalizing the Hough Transform to detect arbitrary shapes. Pattern Recognition, 13(2), 111–122.

3. Khotanzad, A., & Hong, Y. H. (1990). Invariant image recognition by Zernike moments. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(5), 489–497.

4. Teague, M. R. (1980). Image analysis via the general theory of moments. Journal of the Optical Society of America, 70(8), 920–930.

5. Sonka, M., Hlavac, V., & Boyle, R. (2014). Image Processing, Analysis, and Machine Vision (4th ed.). Cengage.

6. Dosovitskiy, A., et al. (2021). An image is worth 16x16 words: Transformers for image recognition at scale. International Conference on Learning Representations (ICLR).

7. Liu, Z., et al. (2022). A ConvNet for the 2020s. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

8. Simonyan, K., & Zisserman, A. (2015). Very deep convolutional networks for large-scale image recognition. ICLR.

9. Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems (NeurIPS).

10. Dmitry, S., Sadykov, S., Samandarov, I., Dushatov, N., & Miratoev, Z. (2024). Method of investigation of stability and informativeness of basic and derivative features of analysis of microscopic and defectoscopic images of cast iron microstructure. Universum: технические науки, 10(11 (128)), 31–39.

11. Bulanova, Yu. A., Sadykov, S. S., Samandarov, I. R., Dushatov, N. T., & Miratoev, Z. M. (2022). Studies of methods for increasing the contrast of mammographic images [in Russian]. Oriental renaissance: Innovative, educational, natural and social sciences, 2(10), 304–315.

12. Samandarov, I. R., Manshurov, Sh. T., Dushatov, N. T., Miratoev, Z. M., & Mustafin, R. R. (2023). Image processing in C++ using the OpenCV library [in Russian]. Universum: технические науки, No. 5(110).
