https://ijmri.de/index.php/jmsi
volume 4, issue 3, 2025
DETERMINATION OF RECURRENT ALGORITHMS AND OPTIMAL STRATEGIES
FOR CONTROLLED MARKOV CHAINS
Mamatova Zilolakhan Khabibullokhanovna
Associate Professor, Fergana State University
Doctor of Philosophy (PhD) in Pedagogical Sciences
E-mail:
ORCID: 0009-0009-9247-3510
Sotvoldieva Zarnigor Rasuljon qizi
Fergana State University
Applied Mathematics program
3rd-year student, group 22-08
Email:
zarnigorrasuljonova@gmail.com
Abstract:
This article is devoted to recurrent algorithms and the determination of optimal strategies for controlled Markov chains. Controlled random processes are considered with a view to practical applications such as planning the work of industrial enterprises. The states of the system are modeled by a Markov random process, and the transition probabilities and incomes corresponding to the various strategies available in each state are analyzed. The goal is to find the optimal strategy that provides the maximum average income per step. Expected incomes and optimal strategies are computed successively using recurrent algorithms. In addition, methods for determining optimal strategies by means of asymptotic formulas are considered. As an example, calculations and tables are presented for a Markov process with two states and two strategies, and the optimal strategies are determined from them. The article is intended as a useful source for the theoretical and practical study of controlled Markov processes.
Keywords:
Controlled Markov chains, recurrent algorithms, optimal strategies, random processes, transition probabilities, expected income, asymptotic formulas, set of strategies, Markov process, industrial planning.
Introduction:
Controlled Markov chains are an important class of random processes. They are widely used to solve practical problems in various sectors, including the planning of industrial enterprises, economic analysis, logistics, and decision making. In these processes the states of the system have the Markov property, and the strategy adopted in each state determines the transition probabilities and the expected income. Determining optimal strategies, that is, finding the set of decisions that provides the maximum average income at each step, is one of the main problems in the field. This article studies recurrent algorithms and asymptotic methods for finding optimal strategies for controlled Markov chains. The theoretical approaches are supported by a practical example: the calculations and the process of determining optimal strategies are shown for a Markov process with two states and two strategies. The research provides a basis for a deeper understanding of controlled random processes and for their application to practical problems.
Literature analysis
Controlled Markov chains and the recurrent algorithms for them have been the subject of much fundamental and applied research. Although the foundations of Markov processes were laid in the works of A. A. Markov (1906), the concept of controlled Markov chains was first proposed in the mid-20th century, in particular by R. Bellman (1957) within the framework of dynamic programming. In his work “Dynamic Programming”, Bellman proposed a system of recurrent equations for finding optimal strategies, which later became one of the main approaches to the study of controlled Markov processes. R. A. Howard (1960), in his book “Dynamic Programming and Markov Processes”, developed iterative algorithms for controlled Markov chains, in particular the value iteration and strategy iteration methods. His work became an important basis for practical calculations in determining optimal strategies. Later, D. Bertsekas (1995), in his book “Dynamic Programming and Optimal Control”, combined these algorithms with modern optimization methods and adapted them to a wide range of applications.
Asymptotic formulas and their application to controlled Markov processes were treated in depth by J. Filar and K. Vrieze (1997) in their work “Competitive Markov Decision Processes”. Their work was aimed in particular at finding asymptotic solutions to problems of maximizing long-term average income. In addition, E. Altman (1999), in his work “Constrained Markov Decision Processes”, proposed new approaches to determining optimal strategies under constraints. As for local research, scientists in Uzbekistan, for example A. Ikramov and T. Shirinov (2015), studied the application of controlled random processes in industrial and economic systems. Their work focuses on the optimal allocation of resources in local industrial enterprises. However, in-depth analysis of asymptotic methods and recurrent algorithms remains relatively limited in the local literature.
Research methodology
The methodology of this study of recurrent algorithms and optimal strategies for controlled Markov chains consists of the following stages:
Theoretical analysis: the literature on Markov processes, optimal strategies, and recurrent algorithms was studied.
Mathematical modeling: the states of the system were modeled by a Markov process, and recurrent equations were constructed from the transition probabilities and incomes.
Asymptotic methods: a system of asymptotic equations was solved in order to maximize long-term income.
Practical calculations: optimal strategies were calculated from the transition probability and income matrices for an example Markov process with two states and two strategies.
Analysis and results
Controlled Markov processes
Controlled Markov chains are an important type of random process in which future transitions are determined by the current state of the system and the chosen strategy. These processes are widely used for decision making in fields such as industrial planning, economics, logistics, and artificial intelligence. In each state a strategy is selected, and this determines the transition probabilities and the expected income. The goal is to find optimal strategies that yield the maximum income per step or over a long horizon.
In this study, expected incomes are computed with recurrent algorithms and optimal strategies are determined successively. Asymptotic methods, in turn, help to find stable long-term solutions. In the example, a system with two states and two strategies is considered and analyzed through tables of the probabilities and incomes corresponding to the strategies. As a result, the best strategy for each state is determined.
The field has a solid theoretical basis and is of practical importance in areas such as resource management and economic optimization. In the future, wider application of these methods to big data analysis and automated systems is expected.
Importance and future directions
Controlled Markov chains have a solid theoretical basis and provide important solutions in practical fields. They are of great importance for managing resources effectively, reducing costs, and automating decision-making processes. In the future, the application of these methods to big data analysis, machine learning, and the optimization of complex systems is expected to expand further. Research on real-time decision making and on developing flexible strategies for dynamic environments also remains an important direction.
The field offers wide opportunities to scientists, engineers, and practitioners because it successfully combines fundamental mathematical analysis with the solution of practical problems.
Controlled random processes are encountered in various real-life situations. For example, consider planning the work of an industrial enterprise. At the beginning of each planning period, the plan for the next period is prepared on the basis of the state that has been reached. The planning takes into account the amount of working capital. The possible ways of using the working capital are called strategies.
Let us assume that the further activity of the enterprise (we call it the system) is described by a Markov random process. Different strategies correspond to different transition probabilities of the system and to different incomes.
For each strategy $k$, we denote the transition probabilities and the corresponding incomes by $P_{ij}^{k}$ and $r_{ij}^{k}$, respectively. A process in which each state has its own set of available strategies is called a controlled Markov process. Consider the problem of finding, for each state $E_i$, the strategy number $d_i(m)$ such that using this strategy at step $m$ gives the maximum average income $g$ per transition.
The set of these strategies forms the vector
$$d(m) = \begin{pmatrix} d_1(m) \\ d_2(m) \\ \vdots \\ d_N(m) \end{pmatrix}.$$
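This vector can be determined by analyzing the strategies step by step: for the system to give the optimal average income at step $(m+1)$, step $m$ must also be optimal. For a fixed strategy the expected incomes satisfy the recurrence
$$v_i(m) = q_i + \sum_{j=1}^{N} P_{ij}\, v_j(m-1), \quad i = \overline{1,N}, \quad m = 1, 2, \ldots,$$
and choosing the best strategy in each state at every step gives
$$v_i(m+1) = \max_{k}\Big[\, q_i^{k} + \sum_{j=1}^{N} P_{ij}^{k}\, v_j(m) \Big], \quad i = \overline{1,N},$$
where $q_i^{k} = \sum_{j=1}^{N} P_{ij}^{k}\, r_{ij}^{k}$ is the expected income of one transition from state $E_i$ under strategy $k$.
Here $v_i(m+1)$ and $v_j(m)$ denote the expected incomes under the selected optimal transitions.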
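A minimal Python sketch of one step of this recurrence may make it more concrete. The function name `bellman_step` and the starting values $v_i(0) = 0$ are assumptions made for illustration; the transition probabilities are those of the two-state example given later in the article, and the one-step incomes $q_i^k$ are taken from its tables.

```python
# P[k][i][j]: probability of moving from state i to state j under strategy k
# (indices are 0-based here; the article numbers states and strategies from 1).
P = [
    [[0.5, 0.5], [0.4, 0.6]],  # strategy 1
    [[0.8, 0.2], [0.7, 0.3]],  # strategy 2
]
# q[k][i]: expected one-step income in state i under strategy k.
q = [
    [6.0, -3.6],  # strategy 1
    [4.2, -2.9],  # strategy 2
]


def bellman_step(P, q, v_prev):
    """One step of v_i(m+1) = max_k [ q_i^k + sum_j P_ij^k v_j(m) ].

    Returns (v_next, d_next); d_next[i] is the 1-based number of the
    maximizing strategy in state i (ties go to the lowest number).
    """
    n_states = len(v_prev)
    v_next, d_next = [], []
    for i in range(n_states):
        # Candidate expected income for every strategy available in state i.
        candidates = [
            q[k][i] + sum(P[k][i][j] * v_prev[j] for j in range(n_states))
            for k in range(len(P))
        ]
        best_k = max(range(len(candidates)), key=candidates.__getitem__)
        v_next.append(candidates[best_k])
        d_next.append(best_k + 1)
    return v_next, d_next


v0 = [0.0, 0.0]  # assumption: no income is collected after the last step
v1, d1 = bellman_step(P, q, v0)
v2, d2 = bellman_step(P, q, v1)
print("m=1:", [round(x, 2) for x in v1], d1)
print("m=2:", [round(x, 2) for x in v2], d2)
```

Under these assumptions the two steps give the strategy vectors (1, 2) and (2, 2), in agreement with Tables 1 and 2 of the example.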
Using the last equality, the expected incomes and optimal strategies are computed successively, and the process is continued until the vector $d(m)$ remains unchanged.
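The stopping rule described here, repeating the recurrence until the strategy vector no longer changes, can be sketched as follows. The name `iterate_until_stable`, the `max_steps` safeguard, and the starting values $v_i(0) = 0$ are assumptions for illustration; the numbers come from the two-state example later in the article. Two equal consecutive strategy vectors are used as a practical stopping criterion, which by itself is not a proof of long-run optimality.

```python
def iterate_until_stable(P, q, v0, max_steps=100):
    """Repeat the recurrence until d(m) equals d(m-1).

    P[k][i][j] and q[k][i] use 0-based indices; the returned strategy
    numbers are 1-based. Returns (m, d, v) for the first step m at which
    the strategy vector repeats.
    """
    v, d_prev = list(v0), None
    for m in range(1, max_steps + 1):
        v_new, d = [], []
        for i in range(len(v)):
            values = [
                q[k][i] + sum(p * vj for p, vj in zip(P[k][i], v))
                for k in range(len(P))
            ]
            best = max(range(len(values)), key=values.__getitem__)
            v_new.append(values[best])
            d.append(best + 1)
        if d == d_prev:
            return m, d, v_new
        v, d_prev = v_new, d
    raise RuntimeError("strategy vector did not stabilize within max_steps")


# Data of the article's example (strategy 1 first, then strategy 2).
P = [[[0.5, 0.5], [0.4, 0.6]], [[0.8, 0.2], [0.7, 0.3]]]
q = [[6.0, -3.6], [4.2, -2.9]]

steps, d, v = iterate_until_stable(P, q, v0=[0.0, 0.0])
print(steps, d, [round(x, 3) for x in v])
```

For this data the strategy vector changes between the first and second steps and then repeats, so the loop stops at the third step.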
Optimal strategies of controlled Markov processes
We now consider a method of finding optimal strategies using the asymptotic formula
$$v_i(m) = mg + v_i, \quad i = 1, 2, \ldots, N.$$
Since
$$v_i(m) = q_i + \sum_{j=1}^{N} P_{ij}\, v_j(m-1), \quad i = \overline{1,N}, \quad m = 1, 2, \ldots,$$
we have
$$mg + v_i = q_i + \sum_{j=1}^{N} P_{ij}\, v_j(m-1),$$
or
$$mg + v_i = q_i + \sum_{j=1}^{N} P_{ij}\,\big[(m-1)\,g + v_j\big].$$
From this,
$$mg + v_i = q_i + \sum_{j=1}^{N} P_{ij}\, v_j + (m-1)\,g \sum_{j=1}^{N} P_{ij},$$
and, taking into account that
$$\sum_{j=1}^{N} P_{ij} = 1,$$
we obtain the relation
$$g + v_i = q_i + \sum_{j=1}^{N} P_{ij}\, v_j, \quad i = \overline{1,N}.$$
Thus we have a system of $N$ equations in the $N+1$ unknowns $g, v_1, v_2, \ldots, v_N$. To determine the optimal strategy it is not necessary to know the absolute values of the variables $v_j$, $j = \overline{1,N}$; it is enough to require that the differences $v_i - v_j$ take fixed values. This is not difficult to achieve: it suffices to add to the last system, for example, the equation $v_N = 0$, because $v_i = 0$ may be assumed for any one chosen $i$.
If the last system is solved together with the equation $v_N = 0$, the asymptotic values $g^0, v_1^0, v_2^0, \ldots, v_N^0$ of the quantities $v_j(m)$ are found.
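As an illustration of this step, the sketch below solves the system $g + v_i = q_i + \sum_j P_{ij} v_j$ together with $v_N = 0$ for one fixed strategy. The helper names `solve_linear` and `evaluate_fixed_strategy` are hypothetical, and the data are those of strategy vector (2, 2) from the article's example, chosen only as a test case. Plain Gauss-Jordan elimination is used to keep the sketch free of external libraries.

```python
def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("system is singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def evaluate_fixed_strategy(P_d, q_d):
    """Return (g, v) with v[N-1] = 0 for a fixed strategy.

    Unknowns are ordered as (v_1, ..., v_{N-1}, g); equation i reads
    g + v_i - sum_{j<N} P_ij v_j = q_i.
    """
    n = len(q_d)
    A = []
    for i in range(n):
        row = [(1.0 if i == j else 0.0) - P_d[i][j] for j in range(n - 1)]
        row.append(1.0)  # coefficient of g
        A.append(row)
    x = solve_linear(A, q_d)
    return x[-1], x[:-1] + [0.0]


# Strategy 2 in both states of the article's example.
P_d = [[0.8, 0.2], [0.7, 0.3]]
q_d = [4.2, -2.9]
g, v = evaluate_fixed_strategy(P_d, q_d)
print(round(g, 4), [round(x, 4) for x in v])
```

For this test case the average income per step comes out at about 2.62, with $v_1 \approx 7.89$ and $v_2 = 0$.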
For a fixed strategy, $g$ can also be found without solving the last system, from the formula
$$g = \sum_{i=1}^{N} P_i\, q_i,$$
where $P_i$ are the limiting (stationary) probabilities of the states. However, this gives no information for finding the optimal strategy. The solution $g^0, v_1^0, v_2^0, \ldots, v_N^0$, by contrast, can be used to evaluate various strategies.
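The formula $g = \sum_i P_i q_i$ can be checked numerically. The sketch below assumes that $P_i$ denotes the stationary probability of state $E_i$ under the fixed strategy and approximates it by repeated multiplication, which presupposes an ergodic chain. The exhaustive comparison of all strategy vectors is an illustrative brute-force check on the article's two-state example and is practical only for very small problems; it is not the method of the article. The function names are hypothetical.

```python
from itertools import product

# Data of the article's example (strategy 1 first, then strategy 2).
P = [[[0.5, 0.5], [0.4, 0.6]], [[0.8, 0.2], [0.7, 0.3]]]
q = [[6.0, -3.6], [4.2, -2.9]]


def stationary_distribution(P_d, iterations=1000):
    """Approximate the stationary distribution by iterating pi <- pi * P_d."""
    n = len(P_d)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P_d[i][j] for i in range(n)) for j in range(n)]
    return pi


def average_income(P, q, d):
    """g = sum_i pi_i q_i for the strategy vector d (1-based strategy numbers)."""
    P_d = [P[k - 1][i] for i, k in enumerate(d)]  # row i taken from strategy d_i
    q_d = [q[k - 1][i] for i, k in enumerate(d)]
    pi = stationary_distribution(P_d)
    return sum(p * x for p, x in zip(pi, q_d))


# Evaluate every strategy vector (d_1, d_2) with d_i in {1, 2}.
gains = {d: average_income(P, q, d) for d in product((1, 2), repeat=2)}
best = max(gains, key=gains.get)
for d, g in gains.items():
    print(d, round(g, 4))
print("largest average income:", best)
```

On this data the vector (2, 2) has the largest average income per step, which agrees with the strategy vector obtained at the second step of the recurrent calculation.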
Example:
Let a Markov process have two states $E_1$ and $E_2$ and two strategies $k = 1, 2$. Let the transition probability matrices and income matrices corresponding to the strategies be as follows:
$$P^{1} = \begin{pmatrix} 0.5 & 0.5 \\ 0.4 & 0.6 \end{pmatrix}, \quad R^{1} = \begin{pmatrix} 9 & 3 \\ 3 & -8 \end{pmatrix}, \quad P^{2} = \begin{pmatrix} 0.8 & 0.2 \\ 0.7 & 0.3 \end{pmatrix}, \quad R^{2} = \begin{pmatrix} 4 & 5 \\ 1 & -12 \end{pmatrix}.$$
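The expected one-step incomes used in the tables below can be obtained from these matrices, assuming the usual definition $q_i^{(k)} = \sum_j p_{ij}^{(k)} r_{ij}^{(k)}$. The following short sketch performs this calculation; the function name `immediate_income` is hypothetical.

```python
# P[k][i][j] and R[k][i][j] from the example; index 0 is strategy 1 / state E1.
P = [
    [[0.5, 0.5], [0.4, 0.6]],
    [[0.8, 0.2], [0.7, 0.3]],
]
R = [
    [[9, 3], [3, -8]],
    [[4, 5], [1, -12]],
]


def immediate_income(P, R):
    """Return q[k][i] = sum_j P[k][i][j] * R[k][i][j]."""
    return [
        [sum(p * r for p, r in zip(P_row, R_row)) for P_row, R_row in zip(Pk, Rk)]
        for Pk, Rk in zip(P, R)
    ]


q = immediate_income(P, R)
for k, qk in enumerate(q, start=1):
    print(f"strategy {k}:", [round(x, 2) for x in qk])
```

The results, 6 and -3.6 for strategy 1 and 4.2 and -2.9 for strategy 2, are the values listed in the $q_i^{(k)}$ column of Table 1.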
Carrying out the calculations for $m = 1$ and $m = 2$ according to formulas (16) and (17) given above, we fill in Tables 1 and 2, respectively. In the tables the expected income $v_i(m)$ is denoted by $f_i(m)$.
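Table 1 ($m = 1$)

| $i$ | $k$ | $p_{i1}^{(k)}$ | $p_{i2}^{(k)}$ | $r_{i1}^{(k)}$ | $r_{i2}^{(k)}$ | $q_i^{(k)}$ | $f_i(1)$ | $d_i(1)$ |
|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 0.5 | 0.5 | 9 | 3 | 6 | 6 | 1 |
| 1 | 2 | 0.8 | 0.2 | 4 | 5 | 4.2 | | |
| 2 | 1 | 0.4 | 0.6 | 3 | -8 | -3.6 | -2.9 | 2 |
| 2 | 2 | 0.7 | 0.3 | 1 | -12 | -2.9 | | |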
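Table 2 ($m = 2$)

| $i$ | $k$ | $p_{i1}^{(k)}$ | $p_{i2}^{(k)}$ | $p_{i1}^{(k)} f_1(1)$ | $p_{i2}^{(k)} f_2(1)$ | $q_i^{(k)}$ | $F_i(k,1)$ | $f_i(2)$ | $d_i(2)$ |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 0.5 | 0.5 | 3 | -1.45 | 6 | 7.55 | 8.42 | 2 |
| 1 | 2 | 0.8 | 0.2 | 4.8 | -0.58 | 4.2 | 8.42 | | |
| 2 | 1 | 0.4 | 0.6 | 2.4 | -1.74 | -3.6 | -2.94 | 0.43 | 2 |
| 2 | 2 | 0.7 | 0.3 | 4.2 | -0.87 | -2.9 | 0.43 | | |

Here $F_i(k,1) = q_i^{(k)} + \sum_{j} p_{ij}^{(k)} f_j(1)$, and $f_i(2)$ is the larger of the two values of $F_i(k,1)$ for state $i$.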
It can be seen from the tables that the optimal strategy vector for $m = 1$ is $d(1) = (1, 2)$, and the optimal strategy vector for $m = 2$ is $d(2) = (2, 2)$.
Conclusion
This study examined recurrent algorithms and the determination of optimal strategies for controlled Markov chains. Through theoretical analysis, mathematical modeling, and practical calculations, it considered modeling the states of a system by a Markov process, calculating transition probabilities and incomes, and finding strategies that provide the maximum average income. Long-term optimal strategies were studied using asymptotic formulas. Calculations were carried out for an example with two states and two strategies, and the optimal strategies were confirmed through tables. The results show the practical importance of controlled Markov processes in industrial planning, economic analysis, and other fields. Further research on applying these algorithms to big data and artificial intelligence is recommended.
REFERENCES
1. Bellman, R. (1957). Dynamic Programming. Princeton University Press.
2. Howard, R. A. (1960). Dynamic Programming and Markov Processes. MIT Press.
3. Bertsekas, D. P. (1995). Dynamic Programming and Optimal Control. Athena Scientific.
4. Filar, J., & Vrieze, K. (1997). Competitive Markov Decision Processes. Springer.
5. Altman, E. (1999). Constrained Markov Decision Processes. Chapman and Hall/CRC.
6. Ikramov, A., & Shirinov, T. (2015). "Controlled random processes in industrial systems." Uzbekistan Mathematical Journal, 3(2), 45–52.
7. Markov, A. A. (1906). "Extension of the law of large numbers to quantities depending on each other." News of the Physico-Mathematical Society at Kazan University, 15(4), 135–156.
