GREENHOUSE PRODUCTIVITY ESTIMATION BASED ON THE OPTIMIZED YOLOV5 MODEL

Oybek Eraliev; Kodirjon Rashidov; Khojiakbar Eraliev

doi:10.71337/inlibrary.uz.digital-economy.106490

Authors

Oybek Eraliev
Inha University in Tashkent
Kodirjon Rashidov
Korea International University in Ferghana
Khojiakbar Eraliev
Ferghana Polytechnic Institute

Author biographies

Oybek Eraliev, Inha University in Tashkent

PhD
Kodirjon Rashidov, Korea International University in Ferghana

Head of Department
Khojiakbar Eraliev, Ferghana Polytechnic Institute

Senior Lecturer

DOI:

https://doi.org/10.71337/inlibrary.uz.digital-economy.106490

Keywords:

Greenhouse productivity, YOLOv5 optimization, tomato detection, precision agriculture, real-time monitoring.

Abstract

In modern agriculture, precision monitoring and efficient resource
management are paramount for maximizing crop yields. This research presents a novel
approach to greenhouse productivity estimation by leveraging the state-of-the-art
YOLOv5 object detection model, tailored and optimized for a custom tomato dataset.
The study focuses on detecting and classifying tomatoes into three categories (green,
pink, and red), providing a comprehensive understanding of the ripening process in real
time. The optimized YOLOv5 model demonstrated superior performance compared to
the standard version, showcasing enhanced accuracy in tomato identification. The
model was deployed in a real-world greenhouse equipped with a meticulously arranged
seven-camera system, capturing a row of tomato plants per camera. By extrapolating
the results from the single row to the entire greenhouse (comprising eight rows), an
accurate estimation of overall productivity was achieved. A web application was
developed to facilitate real-time monitoring of tomato plant states and key statistics.
The application provides insights into the percentages of green, pink, and red tomatoes,
allowing greenhouse operators to make informed decisions on resource allocation and
management. The proposed methodology offers a scalable and practical solution for
greenhouse productivity assessment, potentially revolutionizing the precision
agriculture landscape. The findings contribute to the advancement of computer vision
applications in agriculture, fostering sustainable and efficient practices in greenhouse
cultivation.

597

BUXGALTERIYA HISOBI VA AUDIT

“RAQAMLI IQTISODIYOT” ILMIY-ELEKTRON JURNALI | 7-SON

WWW.INFOCOM.UZ


Eraliev Oybek Maripjon ugli

Teacher of Information Technology Department, Inha University in Tashkent, PhD

oybekeraliev7@gmail.com

Rashidov Kodirjon Ilkhomjon ugli

Head of International Trade Department, Korea International University in Ferghana, teacher

uzraio2020@gmail.com

Eraliev Khojiakbar Abdinabi ugli

Teacher of Electrical Engineering Department, Ferghana Polytechnic Institute, PhD candidate

eraliyevhojiakbar@gmail.com


Key words: greenhouse productivity, YOLOv5 optimization, tomato detection, precision agriculture, real-time monitoring.

GREENHOUSE PRODUCTIVITY ESTIMATION BASED ON THE OPTIMIZED YOLOV5 MODEL

Eraliev Oybek Maripjon ugli

Lecturer, Department of Information Technology, Inha University in Tashkent, PhD

oybekeraliev7@gmail.com

Rashidov Kodirjon Ilkhomjon ugli

Head of the Department of International Trade, Korea International University in Ferghana, lecturer

uzraio2020@gmail.com

Eraliev Khojiakbar Abdinabi ugli

Senior Lecturer, Department of Electrical Power Engineering, Ferghana Polytechnic Institute

eraliyevhojiakbar@gmail.com

Abstract:

In modern agriculture, precise monitoring and efficient resource management are
important for maximizing yields. This study presents a new approach to estimating
greenhouse productivity using the modern YOLOv5 object detection model, adapted and
optimized for a tomato plant dataset. The study focuses on detecting and classifying
tomatoes into three categories (green, pink, and red), providing a comprehensive
understanding of the ripening process in real time. The optimized YOLOv5 model
outperforms the standard-version models and shows improved accuracy in tomato
detection. In the experiment, the model was deployed in a real greenhouse equipped
with an arranged seven-camera system, each camera photographing the tomato plants in
a single row, and careful observation was carried out. By extrapolating the results
obtained from one row to the entire greenhouse (consisting of eight rows), an accurate
estimate of overall yield was achieved. A web application was developed to facilitate
real-time monitoring of tomato plant status and key statistics; the application
reports the percentages of green, pink, and red tomatoes. This allows greenhouse
operators to make informed decisions on resource allocation and management. The
proposed methodology offers a scalable and practical solution for assessing greenhouse
productivity. The results help to develop computer-based control applications in
agriculture, to increase greenhouse productivity, and to foster sustainable and
efficient practices in greenhouses.

Key words: greenhouse productivity, YOLOv5 optimization, tomato detection, precision agriculture, real-time monitoring.

GREENHOUSE PRODUCTIVITY ESTIMATION BASED ON THE OPTIMIZED YOLOV5 MODEL

Eraliev Oybek Maripjon ugli

Lecturer, Department of Information Technology, Inha University in Tashkent, PhD

oybekeraliev7@gmail.com

Rashidov Kodirjon Ilkhomjon ugli

Head of the Department of International Trade, Korea International University in Ferghana, lecturer

uzraio2020@gmail.com

Eraliev Khojiakbar Abdinabi ugli

Senior Lecturer, Department of Electrical Power Engineering, Ferghana Polytechnic Institute

eraliyevhojiakbar@gmail.com

Abstract:

In modern agriculture, precise monitoring and efficient resource management are of
paramount importance for maximizing yields. This study presents a new approach to
estimating greenhouse productivity using the modern YOLOv5 object detection model,
adapted and optimized for a custom tomato dataset. The study aims to detect and
classify tomatoes into three categories (green, pink, and red), which provides a full
understanding of the ripening process in real time. The optimized YOLOv5 model
demonstrated superior performance compared with the standard version, showing higher
accuracy in tomato identification. The model was deployed in a real greenhouse
equipped with a carefully designed system of seven cameras, each of which captures a
row of tomato plants. By extrapolating the results from one row to the entire
greenhouse (consisting of eight rows), an accurate estimate of overall productivity
was achieved. A web application was developed to facilitate real-time monitoring of
the state of the tomato plants and of key statistics. The application provides
information on the percentages of green, pink, and red tomatoes, allowing greenhouse
operators to make informed decisions on resource allocation and management. The
proposed methodology offers a scalable and practical solution for assessing greenhouse
productivity, potentially revolutionizing the precision agriculture landscape. The
results contribute to the development of computer vision applications in agriculture,
fostering sustainable and efficient cultivation practices in greenhouses.

Key words: greenhouse productivity, YOLOv5 optimization, tomato detection, precision agriculture, real-time monitoring.

INTRODUCTION

In a time of rapid technological progress, the agricultural industry is on the brink
of a significant shift towards sustainability, efficiency, and precision. As the
cornerstone of human survival, agriculture faces the daunting task of feeding a
rapidly growing world population while reducing its environmental impact. Greenhouse
farming has emerged as a promising response to these complex issues. By offering a
controlled setting, it enables farmers to grow crops with high precision, making
better use of resources and increasing yields. Picture 1 illustrates the progression
of agriculture from 12,000 B.C. onward, highlighting advancements in techniques, crop
cultivation, and mechanization. Prehistoric farming relied on rudimentary tools such
as sticks and sickles and on manual harvesting, alongside animal husbandry. Modern
agriculture integrates technology such as cellphones for remote crop monitoring and
genetically modified seeds that enhance crop yield and resilience against pests and
diseases. This technological evolution has likely contributed to a reduction in global
food scarcity by improving crop quality and quantity.


Picture 1. Representation of the evolution of agricultural practices

1

The Picture was designed and prepared by the author.

LITERATURE REVIEW

Modern agriculture is undergoing a transformative shift with the integration of
advanced technologies, paving the way for precision farming and enhanced
productivity. Among these technologies, computer vision-based object detection
models have emerged as powerful tools for automating crop monitoring processes.
Recently, the use of computer vision in intelligent plant and factory systems has
increased significantly [1]–[5]. The technology covers almost every production stage,
including raising seedlings, transplanting, crop management, harvesting, and fruit
sorting. By imitating human vision, it gathers information from images, evaluates it,
and guides practical production [6], [7]. In contrast to chemical and physical
methods, machine vision captures plant data without harming the plants. It also
operates consistently, cheaply, and efficiently. A traditional machine vision system
consists of an image capture section, a data processing section, and a task execution
section. The image capture section takes pictures and transmits them to the next
component; it comprises image capture cards, optical systems, and light sources. The
data processing section, whose main element is a computer system, extracts and
analyzes data from the images, makes a decision based on what it has learned, and then
sends instructions to the final section. The task execution section, usually a
mechanical module, carries out operations such as watering and fruit picking; robots,
environmental control systems, and nutrient supply systems are typical here. Numerous
reviews have discussed the status of agricultural computer vision applications. They
mostly deal with field crops [8], [9], and only a few of them mention plant factories.
The environment inside a plant factory is complex and differs from the outdoors. In
addition to the plants themselves, there are irrigation pipelines, hanging ropes,
mechanical devices, and other supporting infrastructure. Computer vision applications
are made more difficult by the fact that illumination conditions change often in
accordance with the needs of the plants.

Object detection, a subfield of computer vision, has advanced remarkably in recent
years and has found many applications in agriculture, particularly in smart farm
management. The ability to accurately identify and track objects such as plants,
pests, and diseases has proven invaluable for enhancing crop health, optimizing
resource utilization, and reducing the environmental impact of smart farm
agriculture. This section reviews the applications and methodologies of object
detection in smart farm environments [10].

One of the primary applications of object detection in smart farm agriculture is
the monitoring and management of plant growth. Computer vision techniques, notably
CNNs, have demonstrated their capacity to identify individual plants and monitor their
growth progress over time. This granular level of monitoring enables growers to tailor
cultivation practices to the specific needs of each plant, ensuring optimal conditions
for growth. By analyzing the data collected from object detection systems, growers can
adjust variables such as irrigation, fertilization, and lighting to maximize crop yields.
For instance, identifying overcrowded areas or uneven plant distribution allows for
timely interventions, such as thinning or rearranging plants, to ensure that each plant
receives sufficient resources and space for healthy development [11].

While object detection applications offer significant advantages for smart farm
agriculture, several challenges and limitations must be considered. These include
variations in lighting conditions, occlusions, and the diversity of plant species and
growth stages. Developing robust object detection models that perform effectively
under these conditions remains an ongoing research challenge. Additionally, data
annotation and model training can be time-consuming and resource-intensive. Large and
diverse annotated datasets are essential for training accurate object detection
models, and the availability of such datasets varies across agricultural contexts.


Operating under well-regulated environmental conditions, plant factories are an
innovative vertical agriculture solution and an advanced form of smart farm
agriculture capable of producing a sustainable supply of vegetables, herbs, flowers,
and other crops year-round [12]. They also act as an urban agriculture option,
bringing high-quality, fresh, and nutritious plant products to cities so that people
can eat freshly picked vegetables [13]. Tomatoes are a prized commodity grown in smart
farms and plant factories. Tomato detection accuracy drops when small-target tomato
varieties are occluded by the dense foliage of tomato plants. Furthermore, to improve
detection accuracy, detection models frequently rely on large, complex, heavyweight
networks. Such models require a significant amount of processing power and storage,
which raises the cost of manufacturing mobile and intelligent devices.

Deep convolutional neural networks (DCNNs) have been successfully used in
agriculture in recent years, opening up new research opportunities for tomato fruit
recognition and classification using computer-vision-based DCNN detection methods.
DCNN target detection techniques fall into two categories according to the number of
detection stages. (1) Two-stage detection techniques first generate candidate boxes in
the image and then classify and refine them. This kind of detection technique is based
on convolutional neural networks (CNNs) and includes region-based convolutional neural
networks (R-CNN) [14], Fast R-CNN [15], Faster R-CNN [16], and so on. Two-stage
detection models perform well in terms of recall and precision. However, their large
network size and slow inference make them difficult to use in real-time detection
scenarios. (2) One-stage detection techniques locate and classify the target directly
from features extracted from the input image. This kind of detection technique
includes the you only look once (YOLO) series [17]–[22] and the single-shot multibox
detector (SSD) [23]. Owing to their network design, single-stage detection models run
fast enough to meet real-time requirements while reaching accuracy comparable to that
of two-stage detection models.

METHODOLOGY

This study focuses on the application of the YOLOv5 object detection model,
optimized for a custom tomato dataset, to estimate greenhouse productivity in real
time. Recent developments in deep learning and object detection have shown
promising results in various agricultural applications. However, adapting these
techniques to the specific needs of greenhouse cultivation, particularly for tomatoes,
poses unique challenges. Achieving accurate detection and classification of tomatoes
at different ripening stages is critical for precise monitoring and resource management.
The optimization of the YOLOv5 model addresses the challenges associated with
tomato detection, providing a robust solution for greenhouse productivity estimation.


Recent findings in computer vision and precision agriculture underscore the potential
impact of such models in improving crop management strategies. This research builds
upon these advancements, contributing a tailored approach for real-world
implementation in a greenhouse environment.

Challenges in this area include the need for accurate and efficient detection
across varying lighting conditions, occlusions within the dense foliage of tomato
plants, and the dynamic nature of ripening stages. The proposed model aims to tackle
these challenges, offering a reliable and scalable solution for greenhouse operators. As
agriculture strives for sustainability and resource efficiency, the integration of
optimized YOLOv5 for tomato detection stands as a significant step towards smart,
data-driven greenhouse management. This research addresses a critical gap in the
current literature and sets the stage for further advancements in precision agriculture
methodologies.

DISCUSSION AND RESULTS

YOLOv5 Object Detection Model Network Architecture:

Object detection is the task of locating specific objects in an image or video, and
object detection algorithms are now used extensively in computer vision. Both
classical machine learning and deep learning offer techniques for it. Detecting
objects on plant leaves is one example of the problem [24]. A number of feature
extraction techniques, including SIFT [25], Haar [26], HOG [27], and, more recently,
convolutional features [28], have been examined in the literature. After feature
extraction, objects in the feature space are identified using localizers or
classifiers. The choice of an object detection technique is a critical decision in the
development of any computer vision system, particularly in the context of this
research on smart farm agriculture. YOLOv5 (You Only Look Once version 5) was selected
as the primary object detection technique for its combination of real-time processing
speed, accuracy, efficiency, ease of implementation, versatility, and strong community
support. These advantages make YOLOv5 well suited to the goals of this research,
including the precise monitoring of plants, pests, and environmental conditions, and
ultimately to more efficient and sustainable smart farm agriculture practices.

YOLOv5 is one of the most recent iterations of YOLO [29]. It offers high detection
accuracy together with fast inference. The weight file for YOLOv5 is 90 percent
smaller than that of YOLOv4, which makes it suitable for real-time detection on
embedded devices. Compared with earlier YOLO versions, YOLOv5 is fast, lightweight,
and accurate. Accuracy and efficiency are crucial when identifying plant diseases;
bell pepper plant disease detection, for example, is enhanced by the YOLOv5 design.
The architecture comes in four versions: YOLOv5s, YOLOv5m, YOLOv5l, and YOLOv5x. The
four versions differ in their feature extraction modules and convolution kernels, and
consequently in model size and number of parameters.

The YOLO framework is shown in Picture 2. The basic concept of YOLO is to divide an
image into an $S \times S$ grid and to use confidence and bounding box predictions to
identify the object in each grid cell. Each cell predicts bounding boxes and a
confidence score for each of them. The intersection over union (IoU) between a
predicted bounding box and the ground truth (GT) equals one when the two coincide
exactly. Taking IoU into account suppresses bounding boxes whose size differs from
that of the actual box.
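The grid assignment and the IoU measure described above can be illustrated with a minimal sketch. It assumes boxes are given as pixel corners `(x1, y1, x2, y2)`; the function names `iou` and `grid_cell` are hypothetical and are not taken from the YOLOv5 code base.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap rectangle; width and height clamp to zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def grid_cell(cx, cy, img_w, img_h, s):
    """Column and row of the S x S grid cell that contains the point (cx, cy)."""
    col = min(int(cx / img_w * s), s - 1)
    row = min(int(cy / img_h * s), s - 1)
    return col, row


ground_truth = (0, 0, 10, 10)
print(iou(ground_truth, ground_truth))      # identical boxes
print(iou(ground_truth, (5, 0, 15, 10)))    # half-shifted box
print(grid_cell(320, 100, 640, 640, s=8))   # cell holding a box centre
```

A perfect prediction gives an IoU of one, and the value falls towards zero as the predicted box drifts away from or changes size relative to the ground truth.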

Picture 2. YOLOv5 model

2

The YOLOv5 model has three components, the backbone, the neck, and the head, as
seen in Picture 3. The backbone extracts features from the input image at various
granularities. The neck aggregates the features extracted by the backbone and passes
them on to the next layer for prediction. Finally, the head predicts the class labels
and generates the bounding boxes.

2

The Picture was prepared by the author based on experiments conducted in a greenhouse.


Picture 3. YOLOv5 model architecture

3

The backbone of YOLOv5 downsamples the input with a slice operation in the Focus
layer. The Focus layer receives the original RGB input image, executes a slice
operation to create a 12-channel feature map, and then applies 32 convolution kernels
to produce a 32-channel feature map. YOLOv5 uses two varieties of Cross Stage Partial
(CSP) structure: CSP1_X and CSP2_X. The first, CSP1_X, is used in the backbone to
obtain rich gradient information while lowering the computational cost. The backbone
also uses spatial pyramid pooling (SPP) to produce feature maps of a fixed size while
preserving detection accuracy.
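The slice step of the Focus layer can be sketched in plain Python to show where the 12 channels come from. This is an illustration on nested lists rather than the tensor code used in YOLOv5; the helper name `focus_slice` is hypothetical, and the subsequent 32-kernel convolution is not shown.

```python
def focus_slice(image):
    """Rearrange a C x H x W image (nested lists) into 4C x H/2 x W/2.

    Every 2 x 2 pixel neighbourhood is split across four sub-images, so the
    spatial resolution halves while no pixel value is discarded.
    """
    offsets = [(0, 0), (1, 0), (0, 1), (1, 1)]  # (row offset, column offset)
    out = []
    for dy, dx in offsets:
        for channel in image:
            out.append([row[dx::2] for row in channel[dy::2]])
    return out


# A 3-channel 4 x 4 toy image whose values encode channel, row, and column.
rgb = [[[c * 100 + y * 10 + x for x in range(4)] for y in range(4)]
       for c in range(3)]
sliced = focus_slice(rgb)
print(len(sliced), len(sliced[0]), len(sliced[0][0]))  # prints: 12 2 2
```

Three input channels times four pixel offsets give the 12 channels mentioned in the text, at half the original height and width.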

The main purposes of the YOLOv5 neck are to build feature pyramids, enhance the
model's ability to identify objects of different sizes, and enable recognition of the
same object at different scales. To aggregate the features, YOLOv5 uses the
CSP2_X structure, the Feature Pyramid Network (FPN) [30], and the Path
Aggregation Network (PAN) [31] as the neck. The YOLOv5 head consists of the loss
function and non-maximum suppression (NMS). The loss function has three components:
the bounding-box loss, the confidence loss, and the classification loss. The
bounding-box loss is computed with the generalized IoU (GIoU) [32]. During
post-processing, YOLOv5 uses weighted NMS to filter the candidate bounding boxes and
eliminate duplicates.
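A minimal sketch of the two head-side computations follows. It assumes boxes in `(x1, y1, x2, y2)` form and uses plain greedy, class-agnostic NMS as a simpler stand-in for the weighted NMS named in the text; all function names are illustrative and not taken from the YOLOv5 code base.

```python
def box_area(b):
    """Area of a box (x1, y1, x2, y2); zero for degenerate boxes."""
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])


def iou_and_giou(a, b):
    """Return (IoU, GIoU) for two boxes in (x1, y1, x2, y2) form."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = box_area(a) + box_area(b) - inter
    iou = inter / union if union > 0 else 0.0
    # Smallest axis-aligned box that encloses both a and b.
    hull = (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))
    hull_area = box_area(hull)
    giou = iou - (hull_area - union) / hull_area if hull_area > 0 else iou
    return iou, giou


def giou_loss(pred, target):
    """Bounding-box loss: zero for a perfect match, up to 2 for distant boxes."""
    return 1.0 - iou_and_giou(pred, target)[1]


def nms(detections, iou_threshold=0.5):
    """Greedy NMS over (box, score) pairs; returns the pairs that are kept."""
    remaining = sorted(detections, key=lambda d: d[1], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        # Drop every remaining box that overlaps the kept box too strongly.
        remaining = [d for d in remaining
                     if iou_and_giou(best[0], d[0])[0] <= iou_threshold]
    return kept


detections = [((0, 0, 10, 10), 0.9), ((1, 1, 11, 11), 0.8), ((20, 20, 30, 30), 0.7)]
print(nms(detections))
print(giou_loss((0, 0, 1, 1), (2, 0, 3, 1)))
```

Unlike plain IoU, GIoU still changes when the boxes do not overlap, which gives the bounding-box loss a usable signal for predictions far from the target.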

3

The Picture was prepared by the author using data obtained from experiments conducted in a greenhouse.

Hyperparameter Optimization of YOLOv5 Model:

In machine learning, hyperparameters control many aspects of training, and finding
good values for them can be difficult. Traditional techniques such as grid search
quickly become intractable for three reasons: 1) the high dimensionality of the
search space; 2) unknown correlations among the dimensions; and 3) the cost of
evaluating the fitness at each point. These factors make the genetic algorithm (GA) a
good option for hyperparameter search. YOLOv5 has about thirty hyperparameters for
different training configurations.

These are specified in the *.yaml files of the /data/hyps directory. Correct
initialization of these values is crucial before evolving, since better starting
estimates yield better final results. However, several hyperparameters related to
data augmentation are excluded from the search in order to save optimization time.
The main hyperparameters, the values recommended by the YOLOv5 authors, and our
optimized values are shown in Table 1.

Table 1

Optimized hyperparameters of the YOLOv5s model

4

[Table body not recovered; the only surviving row label is "Warmup initial bias learning rate".]

4

The table was prepared by the author

The value we aim to optimize is fitness. The default fitness function in YOLOv5 is
a weighted combination of metrics, in which mAP@0.5 contributes 10% of the weight and
mAP@0.5:0.95 contributes the remaining 90%, while precision (P) and recall (R) are
not included. Evolution starts from a baseline scenario that we aim to improve and
runs for 300 generations. In this study, the baseline is a pretrained YOLOv5s
fine-tuned on COCO128 for 10 epochs. Crossover and mutation are the two most
important genetic operators. In this work, mutation is used, with an 80% probability
and a variance of 0.04, to create new offspring from a combination of the best
parents of all previous generations. The results are logged in
runs/evolve/exp/evolve.csv, and the fittest offspring is saved in
runs/evolve/hyp_evolved.yaml.
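The fitness weighting and the mutation step can be sketched as follows. The weights, the 80% probability, and the 0.04 variance come from the text; everything else is an assumption made for illustration, including the multiplicative Gaussian form of the mutation, the clipping to bounds, the starting values, and the names `fitness` and `mutate`. It is a simplified sketch, not the evolution code that ships with YOLOv5, and it mutates a single parent rather than a combination of the best parents.

```python
import math
import random

# Weights for [P, R, mAP@0.5, mAP@0.5:0.95], as described in the text.
FITNESS_WEIGHTS = (0.0, 0.0, 0.1, 0.9)


def fitness(metrics):
    """Weighted sum of (P, R, mAP@0.5, mAP@0.5:0.95)."""
    return sum(w * m for w, m in zip(FITNESS_WEIGHTS, metrics))


def mutate(parent, bounds, rng, prob=0.8, variance=0.04):
    """Return a child by randomly scaling some of the parent's values.

    Each hyperparameter is perturbed with probability `prob` by a factor
    drawn from a normal distribution with mean 1 and the given variance,
    then clipped to its (low, high) bounds.
    """
    sigma = math.sqrt(variance)
    child = {}
    for name, value in parent.items():
        low, high = bounds[name]
        if rng.random() < prob:
            value = value * (1.0 + rng.gauss(0.0, sigma))
        child[name] = min(max(value, low), high)
    return child


# Hypothetical starting values and bounds, for illustration only.
parent = {"lr0": 0.01, "momentum": 0.937, "weight_decay": 0.0005}
bounds = {"lr0": (1e-5, 1e-1), "momentum": (0.6, 0.98), "weight_decay": (0.0, 0.001)}
child = mutate(parent, bounds, random.Random(0))
print(child)
print(fitness((0.90, 0.85, 0.80, 0.55)))  # 0.1 * 0.80 + 0.9 * 0.55
```

In a full search, each child would be trained and evaluated, and its fitness would decide whether it joins the pool of parents for later generations.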

Experimental Setup: Camera Installation in a Greenhouse

To advance agricultural monitoring and productivity estimation, a dedicated
experimental setup was established for tomato detection in an outdoor smart farm
environment. This section outlines its key components and methods. To enable
real-time monitoring of tomato plants, seven FXT CCTV cameras were installed within
the smart farm. These cameras stream continuously, capturing the growth and
development of one row of tomato plants. The smart farm contains eight rows of tomato
plants. Using multiple cameras ensures comprehensive coverage and allows detailed
observation and analysis of each plant's status. The installation of the cameras is
shown in Picture 4 (a). A schematic illustration of the proposed strategy for
detecting tomatoes and predicting the overall productivity of the smart farm is shown
in Picture 4 (b). The goal of this setup is to predict the crop production of the
smart farm by detecting the tomatoes in one row and multiplying the result by the
number of rows of tomato plants.

Picture 4. a) Camera installation process in a smart farm, b) schematic illustration of camera installation

5

5

The Picture was prepared by the author using data obtained from experiments conducted in a greenhouse

An object detection model was implemented to identify and classify tomatoes within
the smart farm. The model is trained to recognize three classes of tomatoes: green,
pink, and red. This multi-class classification captures the ripening stages and
enables precise monitoring of the entire tomato crop. After the tomatoes have been
detected and classified, the system estimates the overall productivity of the smart
farm. By multiplying the number of tomatoes detected in the monitored row by the
number of rows of tomato plants within the smart farm, the system derives an estimate
of the total tomato yield. This productivity metric provides insight into the smart
farm's efficiency and potential harvest output. The experimental setup thus enables
real-time tracking of tomato growth and uses an object detection model to categorize
tomatoes by ripeness. The resulting productivity estimate serves as a key metric for
assessing the success of smart farm operations. The tomato detection system promises
better monitoring capabilities and improved productivity assessment in smart farm
agriculture. The insights gained from this experimental setup contribute to the
broader field of precision agriculture, where technology plays a pivotal role in
optimizing crop management and resource utilization.
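The row-to-greenhouse extrapolation can be sketched as below. The sketch assumes the detections from the seven camera views have already been merged so that each tomato in the monitored row is counted once; the function name `estimate_productivity` and the example counts are hypothetical.

```python
from collections import Counter

CLASSES = ("green", "pink", "red")


def estimate_productivity(row_detections, n_rows=8):
    """Scale one row's per-class tomato counts to the whole greenhouse.

    `row_detections` is a list of class labels, one per detected tomato in
    the monitored row. Returns per-class totals and percentage shares.
    """
    row_counts = Counter(row_detections)
    totals = {c: row_counts.get(c, 0) * n_rows for c in CLASSES}
    grand_total = sum(totals.values())
    shares = {c: (100.0 * totals[c] / grand_total if grand_total else 0.0)
              for c in CLASSES}
    return totals, shares


# Hypothetical detections for one row: 50 green, 30 pink, 20 red tomatoes.
labels = ["green"] * 50 + ["pink"] * 30 + ["red"] * 20
totals, shares = estimate_productivity(labels)
print(totals)
print(shares)
```

The estimate rests on the assumption that the monitored row is representative of the other rows; rows with different planting dates or light exposure would call for a per-row correction.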

Tomato Datasets:

In this research, a dataset was curated for training and evaluating the tomato
ripeness detection model. It comprises a diverse range of images collected from
various sources to ensure that it is representative and suited to the research
objectives. A substantial portion of the dataset consists of images gathered from
open-source repositories and publicly available datasets related to agriculture and
food. These images cover a wide variety of contexts and scenarios, contributing to
the robustness of the model. To make the dataset more realistic, real-world examples
were added: images of tomatoes at different stages of ripeness (green, pink, and red
tomatoes) obtained through direct observation and collection. These images better
represent the challenges faced in real-world scenarios. The dataset was also extended
by crowdsourcing data through mobile devices. Contributors are encouraged to share
images of tomatoes, thereby expanding the dataset and incorporating diverse
perspectives.

Images in the dataset exhibit a wide range of dimensions, mirroring the
variability encountered in practical applications. This diversity helps the model
adapt to different image sizes and resolutions. The dataset comprises a total of
424 images, curated to strike a balance between sufficiency and diversity.
The distribution of ripeness levels within the dataset covers the tomato ripening
process. The dataset is used for training and evaluating the proposed tomato ripeness
detection model and supports the robustness and applicability of the research
outcomes. Some samples of the tomato dataset are shown in Picture 5.


BUXGALTERIYA HISOBI VA AUDIT

“RAQAMLI IQTISODIYOT” ILMIY-ELEKTRON JURNALI | 7-SON

WWW.INFOCOM.UZ

Picture 5. Samples from the tomato dataset (footnote 6).

An 8:1:1 ratio is used to divide both datasets into training, validation, and test sets. Preprocessing is performed on the training dataset. Preprocessing the photos and unifying attributes such as image size and color ensures that these parameters suit the needs of model training, reduces noise introduced during image collection, and can increase the number of samples. The more commonly used image preparation techniques are resizing, padded resizing, letterboxing, and image vector normalization. The most common image transformation techniques are flipping, blurring, color enhancement, edge enhancement, logarithmic transformation, and image denoising. In this study, a mixture of these strategies has been applied to the gathered photos in order to enlarge the dataset, prevent overfitting of the model, ensure the dataset's availability, and accelerate model training.
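As an illustration of the 8:1:1 division described above, the following minimal sketch shuffles a list of image names and cuts it into training, validation, and test subsets. The file names, the fixed random seed, and the choice to give any rounding remainder to the test set are assumptions made for this example, not details reported in the study.

```python
import random


def split_dataset(items, ratios=(8, 1, 1), seed=0):
    """Shuffle items and split them into train/val/test by integer ratios."""
    shuffled = list(items)
    # A fixed seed keeps the split reproducible between runs (assumption).
    random.Random(seed).shuffle(shuffled)
    total = sum(ratios)
    n = len(shuffled)
    n_train = n * ratios[0] // total
    n_val = n * ratios[1] // total
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    # Whatever is left after rounding down goes to the test set.
    test = shuffled[n_train + n_val:]
    return train, val, test


# Hypothetical file names standing in for the 424 images of the dataset.
image_names = [f"tomato_{i:03d}.jpg" for i in range(424)]
train, val, test = split_dataset(image_names)
print(len(train), len(val), len(test))  # 339 42 43
```

With 424 images the integer division cannot give an exact 8:1:1 split, so the three subsets differ slightly from the nominal proportions.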


Footnote 6. The Picture was prepared by the author using data obtained from experiments conducted in a greenhouse.


The Python-based LabelImg tool, a graphical image annotation tool, is used for image calibration. Both the training and test set images are annotated in YOLO format to generate the corresponding text files. One of these files contains the category names, while the others contain the positional information of the target frames. The coordinates $[x_{min}, x_{max}, y_{min}, y_{max}]$ of the target frame denote the $X$ and $Y$ values of its upper left and lower right corners, respectively. The YOLO label data comprises $[class, x_{center}, y_{center}, width, height]$, representing the category of the target box, its center coordinates, and its width and height.
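The relation between the two box descriptions can be made concrete with a short sketch. It converts corner coordinates in pixels into a YOLO label line, assuming the usual YOLO convention that center, width, and height are normalized by the image size. The function names, the image size, and the class index (1 standing for "pink") are hypothetical values chosen for the example.

```python
def corners_to_yolo(class_id, x_min, x_max, y_min, y_max, img_w, img_h):
    """Convert pixel corner coordinates to a normalized YOLO label tuple."""
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return class_id, x_center, y_center, width, height


def format_yolo_line(label):
    """Render a label tuple as one line of a YOLO-format .txt file."""
    class_id, *box = label
    return f"{class_id} " + " ".join(f"{value:.6f}" for value in box)


# Hypothetical box in a 640x480 image; class 1 is assumed to mean "pink".
label = corners_to_yolo(1, 160, 320, 120, 360, img_w=640, img_h=480)
line = format_yolo_line(label)
print(line)  # 1 0.375000 0.500000 0.250000 0.500000
```

Each annotated image then has one such line per tomato in its label file.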

To facilitate the testing of the custom dataset containing images of tomatoes, plants, and pests, the image data is converted into data information, and adjustments are made to the data configuration file and model configuration file. Picture 6 illustrates the labeling process.

Picture 6. Labeling process of the tomato dataset (footnote 7).

Performance Metrics:

In this study, TensorBoard is set up to visualize the training process and to monitor the model's operation and training state dynamically, so that the effect of training time on model performance and the condition of the equipment can be observed.

The evaluation metrics applied to the detection results are briefly reviewed below. To that end, the following quantities are defined:

- A prediction designated $c(i,i)$ is the situation in which the detector correctly identifies an instance of class $i$ as a member of that class. This can be viewed as a true positive.

- A prediction designated $c(j,i)$ is the situation in which the detector incorrectly classifies an instance of class $j$ as an instance of class $i$. This can be considered a false positive for class $i$.

- A prediction designated $c(i,j)$ is the situation in which the detector misclassifies a member of class $i$ as a member of class $j$. This can be considered a false negative for class $i$.

Footnote 7. The Picture was prepared by the author using data obtained from experiments conducted in a greenhouse.

From this, the precision of the detector for objects of class $i$ is defined as:

$$P_i = \frac{c(i,i)}{\sum_{j} c(j,i)} \qquad (1)$$

Precision is therefore the ratio of the number of correctly identified instances of a given class to the number of all instances detected as that class. Recall is defined as follows:

$$R_i = \frac{c(i,i)}{\sum_{j} c(i,j)} \qquad (2)$$

Recall, then, is the ratio of the number of correctly identified instances of class $i$ to the total number of instances of class $i$ available in the dataset.

Note that these definitions of precision and recall take only class $i$ into account. When there are several classes, precision and recall are calculated as a weighted average over all classes, where the weight of class $i$ is typically the ratio of the number of examples in the dataset that belong to class $i$ to the total number of instances in the dataset. Precision and recall can be combined into the F1 score, defined as follows:

$$F_1 = 2 \cdot \frac{P \cdot R}{P + R} \qquad (3)$$
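The definitions in Eqs. (1) to (3) can be checked on a small example. The sketch below computes per-class precision, recall, and F1 score from a table of counts in which entry `c[i][j]` is the number of instances of class i predicted as class j, and then forms the class-weighted averages. The counts are invented for illustration and are not results from this study; the function names are likewise hypothetical.

```python
def per_class_metrics(c):
    """Return precision, recall and F1 per class from counts c[i][j]
    (instances of true class i predicted as class j)."""
    n = len(c)
    results = []
    for i in range(n):
        correct = c[i][i]
        detected_as_i = sum(c[j][i] for j in range(n))  # denominator of Eq. (1)
        actual_i = sum(c[i])                            # denominator of Eq. (2)
        precision = correct / detected_as_i if detected_as_i else 0.0
        recall = correct / actual_i if actual_i else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0  # Eq. (3)
        results.append({"precision": precision, "recall": recall, "f1": f1})
    return results


def weighted_average(results, c, key):
    """Average a metric over classes, weighted by each class's share of instances."""
    total = sum(sum(row) for row in c)
    return sum(sum(c[i]) / total * results[i][key] for i in range(len(c)))


# Invented counts for the classes green, pink, red (10 instances each).
counts = [
    [8, 2, 0],
    [2, 6, 2],
    [0, 0, 10],
]
metrics = per_class_metrics(counts)
for name, m in zip(("green", "pink", "red"), metrics):
    print(f"{name}: P={m['precision']:.3f} R={m['recall']:.3f} F1={m['f1']:.3f}")

weighted_p = weighted_average(metrics, counts, "precision")
weighted_r = weighted_average(metrics, counts, "recall")
print(f"weighted P={weighted_p:.3f} weighted R={weighted_r:.3f}")
```

This table covers only confusions between classes; missed tomatoes and detections on background would need an additional row and column, which the sketch leaves out.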

The mean average precision (mAP) is used to rate the network's performance. It is defined as:

$$\mathrm{mAP} = \frac{1}{N} \sum_{i=1}^{N} AP_i \qquad (4)$$

where $N$ is the number of classes and $AP_i$, the average precision for class $i$, is the area under the precision-recall curve for the detection of instances of class $i$.
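To make Eq. (4) concrete, the sketch below approximates the average precision of each class as the trapezoidal area under a handful of precision-recall points and then averages over the classes. The curve points are invented for illustration, and the plain trapezoidal rule is an assumption made for simplicity: detection toolkits usually apply an interpolated variant of this area, so their values can differ slightly.

```python
def average_precision(recalls, precisions):
    """Trapezoidal area under a precision-recall curve given as point lists."""
    points = sorted(zip(recalls, precisions))  # order the points by recall
    area = 0.0
    for (r0, p0), (r1, p1) in zip(points, points[1:]):
        area += (r1 - r0) * (p0 + p1) / 2
    return area


def mean_average_precision(ap_per_class):
    """Eq. (4): mean of the per-class average precisions."""
    return sum(ap_per_class.values()) / len(ap_per_class)


# Invented precision-recall points for the three ripeness classes.
pr_curves = {
    "green": ([0.0, 0.5, 1.0], [1.0, 1.0, 0.5]),
    "pink": ([0.0, 1.0], [1.0, 0.0]),
    "red": ([0.0, 1.0], [1.0, 1.0]),
}
ap = {name: average_precision(r, p) for name, (r, p) in pr_curves.items()}
map_value = mean_average_precision(ap)
print(ap)
print(f"mAP = {map_value:.4f}")
```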

Performance of Optimized YOLOv5 on Tomato Dataset:

Table 2 provides an overview of the performance metrics for the YOLOv5 variants, measured in terms of mAP, precision (P), recall (R), and F1 score. According to the reported values, the optimized YOLOv5s achieves the highest mAP at 97.4 %, closely followed by YOLOv5l at 97.2 %. This metric reflects how accurately and confidently the model localizes objects. The optimized YOLOv5s model also achieves the highest precision at 92.7 %, indicating that it keeps false positives low and classifies detected objects accurately, while YOLOv5x demonstrates the highest recall at 96.2 %, showing that it finds the majority of relevant objects in the dataset. The optimized YOLOv5s model leads in F1 score at 93.93 %, which reflects a balanced trade-off between precision and recall. Together, these metrics highlight the particular strengths of each YOLOv5 variant, with the optimized YOLOv5s model offering the best overall balance between precision, recall, and detection accuracy.

Table 2. A comprehensive overview of the performance metrics for various YOLOv5 variants and the optimized YOLOv5s model (footnote 8).

Picture 7 compares the mAP of the optimized YOLOv5s model with that of the standard YOLOv5s model. The optimized model performs better than the original model.

Picture 7. Comparison of standard and optimized YOLOv5s detection model (footnote 9).

Footnote 8. The table was prepared by the author.

Footnote 9. The Picture was prepared by the author based on comparison of standard and optimized YOLOv5s detection model.


Several photos from the dataset were selected for testing, and the detection performance of the optimized and standard YOLOv5s models under various conditions is shown in Picture 8, which covers challenging cases such as overlapping and very small tomatoes. In the first column from the left, the original YOLOv5s model fails to detect tiny tomatoes, while the optimized model detects them. In the second and last columns, the optimized model detects overlapping tomatoes successfully, while the original model misses them. The original model also has difficulties with classification: in the third column, it classifies a pink tomato as green. In summary, the optimized YOLOv5s model outperforms the standard YOLOv5s on tiny and dense targets, giving better detection and classification accuracy. The accuracy of the proposed algorithm could improve further if a larger number of photos were used.

Picture 8. Identifying tomatoes under several challenging conditions (footnote 10).

Picture 9 depicts the performance indicators of the proposed optimized YOLOv5s model over 200 epochs of training and validation.

Footnote 10. The Picture was prepared by the author based on comparison of standard and optimized YOLOv5s detection model.


Picture 9. Performance metrics of the optimized model (footnote 11).

The optimized YOLOv5 model has been integrated into the outdoor smart farm environment to predict the overall crop production of the tomato cultivation area. The model processes images captured by cameras positioned along a single row of tomato plants within the smart farm and classifies the detected tomatoes into three categories, namely green, pink, and red, as shown in Picture 10.

Picture 10. Tomato detection results in the outdoor smart farm (footnote 12).

Footnote 11. The Picture was prepared by the author based on performance metrics of optimized model.

Footnote 12. The Picture was prepared by the author based on tomato detection results in the outdoor smart farm.


The results of this detection process are presented on the crop production prediction page of the custom web application built for smart farm operators, shown in Picture 11. The top frame of the web page offers real-time insight into the state of the tomato plants.

Picture 11. Crop production prediction page of the web application (footnote 13).

Footnote 13. The Picture was prepared by the author based on Crop production prediction page of the web application.

Within the crop production prediction frame, the artificial intelligence model supplies statistical information for farmers. On the left side of the frame, a line graph shows the total number of tomatoes within the smart farm in blue and the percentage of tomatoes ready for harvesting in red. On the right side of the frame, a bar chart shows the distribution of all tomatoes across three categories: green, pink, and red. The size of the red segment in the bar chart gives farmers a quick and intuitive estimate of the overall crop production status within the smart farm. In short, deploying the optimized YOLOv5 model together with the web application interface strengthens real-time monitoring for farmers and provides a visually intuitive platform for anticipating crop production within the smart farm.
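The statistics shown on the prediction page can be sketched as a small calculation. The example below sums per-camera detection counts for one monitored row, derives the share of green, pink, and red tomatoes, and scales the row totals to a greenhouse of eight rows. It assumes that the seven cameras cover the row without overlap and that every row carries the same load as the monitored one; the per-camera counts and function names are hypothetical, not measurements from the study.

```python
CLASSES = ("green", "pink", "red")


def sum_row_counts(per_camera_counts):
    """Add up the detections of all cameras that watch one row."""
    totals = {name: 0 for name in CLASSES}
    for counts in per_camera_counts:
        for name in CLASSES:
            totals[name] += counts.get(name, 0)
    return totals


def ripeness_percentages(counts):
    """Share of each ripeness class in percent of all detected tomatoes."""
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: 100.0 * n / total for name, n in counts.items()}


def extrapolate_to_greenhouse(row_counts, n_rows=8):
    """Scale one row's counts to the whole greenhouse (assumes identical rows)."""
    return {name: n * n_rows for name, n in row_counts.items()}


# Hypothetical detections from seven cameras along a single row.
per_camera = [{"green": 5, "pink": 1, "red": 2} for _ in range(7)]

row_counts = sum_row_counts(per_camera)
shares = ripeness_percentages(row_counts)
greenhouse_counts = extrapolate_to_greenhouse(row_counts)

print(row_counts)         # {'green': 35, 'pink': 7, 'red': 14}
print(shares)             # {'green': 62.5, 'pink': 12.5, 'red': 25.0}
print(greenhouse_counts)  # {'green': 280, 'pink': 56, 'red': 112}
```

In this sketch the red share plays the role of the ready-for-harvest percentage; a deployment would also have to avoid counting the same tomato in two neighbouring camera views.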

CONCLUSION

In conclusion, this research introduces a pioneering approach to greenhouse productivity estimation through the optimization of the YOLOv5 object detection
model for a custom tomato dataset. The study demonstrates the model's efficacy in
accurately identifying and classifying tomatoes into green, pink, and red categories,
providing a real-time assessment of the ripening process. The optimization efforts yield
superior performance compared to the standard YOLOv5 model, showcasing its
potential for widespread adoption in precision agriculture. The deployment of the
optimized YOLOv5 model in a real-world greenhouse equipped with a systematic
seven-camera array proves its practical utility. Extrapolating the results to estimate
overall productivity across eight rows of tomato plants illustrates the scalability and
reliability of the proposed methodology. The web application developed for real-time
monitoring empowers greenhouse operators with valuable insights, including the
percentages of tomatoes at different ripening stages.

This research contributes to the growing body of literature at the intersection of computer vision and agriculture, particularly in the context of greenhouse cultivation.
The findings address current challenges in tomato detection, such as variations in
lighting conditions, occlusions, and dynamic ripening stages. As precision agriculture
continues to evolve, the optimized YOLOv5 model emerges as a valuable tool for
enhancing resource efficiency and decision-making in greenhouse management. The
success of this study not only furthers our understanding of how advanced computer
vision models can be tailored for specific agricultural contexts but also opens avenues
for future research in optimizing and extending such models for diverse crop types. As
the agricultural industry embraces technology-driven solutions, this research
contributes to the ongoing discourse on sustainable and efficient crop management
practices.

