DETERMINATION OF RECURRENT ALGORITHMS AND OPTIMAL STRATEGIES FOR CONTROLLED MARKOV CHAINS

Jewelry  Sotvoldieva; Zilolakhan  Mamatova

https://ijmri.de/index.php/jmsi

volume 4, issue 3, 2025

751


Mamatova Zilolakhan Khabibullokhanovna

Associate Professor, Fergana State University

Doctor of Philosophy (PhD) in Pedagogical Sciences

E-mail: mamatova.zilolakhon@gmail.com

ORCID: 0009-0009-9247-3510

Sotvoldieva Jewelry The Lord daughter

Fergana State University

Applied Mathematics program

3rd-year student, group 22-08

E-mail: zarnigorrasuljonova@gmail.com

Abstract:

This article is devoted to determining recurrent algorithms and optimal strategies for controlled Markov chains. Controlled random processes are considered in view of their application in practical fields such as planning the work of industrial enterprises. The states of the system are modeled by a Markov random process, and the transition probabilities and incomes corresponding to the various strategies available in each state are analyzed. The goal is to find the optimal strategy that provides the maximum average income per step. Expected incomes and optimal strategies are determined successively using recurrent algorithms. In addition, methods for determining optimal strategies using asymptotic formulas are considered. As an example, calculations and tables are presented for a Markov process with two states and two strategies, and the optimal strategies are determined from them. The article is intended as a source, in both practical and theoretical terms, for the study of controlled Markov processes.

Keywords:

controlled Markov chains, recurrent algorithms, optimal strategies, random processes, transition probabilities, expected income, asymptotic formulas, set of strategies, Markov process, industrial planning.

Introduction:

Controlled Markov chains are an important class of random processes. They are widely used for decision making in various sectors, including the planning of industrial enterprises, economic analysis and logistics, and in solving many other practical problems. In these processes the states of the system have the Markov property, and the strategy adopted in each state determines the transition probabilities and the expected income. Determining optimal strategies, that is, finding the set of decisions that provides the maximum average income at each step, is one of the main problems in the field. This article studies recurrent algorithms for controlled Markov chains and asymptotic methods for finding optimal strategies. The theoretical approaches are supported by a practical example: the calculations and the process of determining optimal strategies are shown for a Markov process with two states and two strategies. The research provides a basis for a deeper understanding of controlled random processes and for their application to practical problems.

Literature analysis

Controlled Markov chains and recurrent algorithms for them have been the subject of much fundamental and applied research. The foundations of Markov processes were laid in the works of A. A. Markov (1906), while the concept of controlled Markov chains was first proposed in the mid-20th century, in particular by R. Bellman (1957) within the framework of dynamic programming. In his work "Dynamic Programming", Bellman proposed a system of recurrent equations for finding optimal strategies; this later became one of the main approaches to the study of controlled Markov processes. R. Howard (1960), in his book "Dynamic Programming and Markov Processes", developed iterative algorithms for controlled Markov chains, in particular the value iteration and strategy iteration methods. His work became an important basis for practical calculations in determining optimal strategies. Later, D. Bertsekas (1995), in his book "Dynamic Programming and Optimal Control", combined these algorithms with modern optimization methods and adapted them to a wide range of applications.

Research on asymptotic formulas and their application to controlled Markov processes was covered in depth by J. Filar and K. Vrieze (1997) in "Competitive Markov Decision Processes". Their work was aimed, in particular, at finding asymptotic solutions to problems of maximizing long-term average income. In addition, E. Altman (1999), in "Constrained Markov Decision Processes", proposed new approaches to determining optimal strategies under constraints.

As for local research, scientists in Uzbekistan, for example A. Ikramov and T. Shirinov (2015), have studied the application of controlled random processes in industrial and economic systems. Their work focused on the optimal allocation of resources in local industrial enterprises. However, research in the local literature that analyzes asymptotic methods and recurrent algorithms in depth remains relatively limited.

Research methodology

The methodology for studying recurrent algorithms and optimal strategies for controlled Markov chains consists of the following stages:

Theoretical analysis: the literature on Markov processes, optimal strategies and recurrent algorithms was studied.

Mathematical modeling: the states of the system were modeled by a Markov process, and recurrent equations were constructed from the transition probabilities and incomes.

Asymptotic methods: a system of asymptotic equations was solved in order to maximize long-term income.

Practical calculations: optimal strategies were calculated from the transition probability and income matrices for an example Markov process with two states and two strategies.

Analysis and results
Controlled Markov processes

Controlled Markov chains are an important type of random process in which future transitions are determined by the states of the system and the chosen strategies. These processes are widely used for decision making in fields such as industrial planning, economics, logistics and artificial intelligence. A strategy is selected in each state, and it determines the transition probabilities and the expected income. The goal is to find the optimal strategies that lead to the maximum income in one step or over a long period.

In the study, the expected income is calculated using recurrent algorithms and the optimal strategies are determined successively. Asymptotic methods, in turn, help to find stable long-term solutions. In the example, a system with two states and two strategies is considered and analyzed through tables of the probabilities and incomes corresponding to the strategies. As a result, the best strategy for each state is determined.

This field is theoretically sound and is important in practical areas such as resource management and economic optimization. In the future, the application of these methods in big data analysis and automated systems is expected to expand further.

Importance and future directions

Controlled Markov chains have a solid theoretical basis and offer important solutions in practical fields. They are of great importance in the effective management of resources, the reduction of costs and the automation of decision-making processes. In the future, the application of these methods in big data analysis, machine learning and the optimization of complex systems is expected to expand further. Research on real-time decision making and on developing flexible strategies for dynamic environments also remains an important direction.
This field offers wide opportunities for scientists, engineers and practitioners because it successfully combines fundamental mathematical analysis with the solution of practical problems.
Controlled random processes occur in many real-life situations. Consider, for example, the planning of the work of an industrial enterprise. At the beginning of each planning period, the plan for the next period is drawn up in view of the state that has been reached. The planning takes into account the amount of working capital, and the possible ways of using this capital are called strategies.
Let us assume that the activity of the enterprise (from now on we call it the system) is described by a Markov random process. Different strategies correspond to different transition probabilities and different incomes of the system.

Let us denote the transition probabilities and the incomes corresponding to each strategy by $P_{ij}^{k}$ and $r_{ij}^{k}$, respectively. A process that has a set of strategies associated with each state is called a controlled Markov process. For each state $E_i$, consider the problem of finding the strategy number $d_i(m)$ such that this strategy, used at step $m$, gives the maximum average income $g$ per transition.

The set of these strategies forms the vector $d(m)$:

$$d(m) = \begin{pmatrix} d_1(m) \\ d_2(m) \\ \vdots \\ d_N(m) \end{pmatrix}.$$

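This vector can be determined by analyzing the strategies successively: for the system to yield the optimal average income at step $(m+1)$, step $m$ must already be optimal.

$$v_i(m) = q_i + \sum_{j=1}^{N} P_{ij}\, v_j(m-1), \quad i = \overline{1,N}, \quad m = 1, 2, \ldots$$

$$v_i(m+1) = \max_{k}\left[ q_i^{k} + \sum_{j=1}^{N} P_{ij}^{k}\, v_j(m) \right], \quad i = \overline{1,N}.$$

Here $v_i(m+1)$ and $v_j(m)$ denote the expected incomes under the selected optimal transitions, and $q_i^{k} = \sum_{j=1}^{N} P_{ij}^{k}\, r_{ij}^{k}$ is the expected one-step income in state $E_i$ under strategy $k$.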

The expected incomes and the optimal strategies are calculated successively using the last equality, and the process is continued until the vector $d(m)$ remains unchanged.
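The successive calculation described above can be sketched in Python. This is a minimal illustration rather than the authors' implementation: it assumes the starting values $v_i(0) = 0$ (consistent with Table 1, where $f_i(1) = \max_k q_i^{(k)}$), it uses the transition and income matrices of the two-state example given later in the article, and the function names `expected_one_step_income`, `recurrence_step` and `iterate_until_stable` are hypothetical.

```python
# Data of the two-state, two-strategy example from this article.
# Indexing is P[k][i][j] and R[k][i][j], all 0-based
# (index 0 = strategy 1 / state E1, index 1 = strategy 2 / state E2).
P = [
    [[0.5, 0.5], [0.4, 0.6]],   # strategy 1
    [[0.8, 0.2], [0.7, 0.3]],   # strategy 2
]
R = [
    [[9, 3], [3, -8]],          # strategy 1
    [[4, 5], [1, -12]],         # strategy 2
]


def expected_one_step_income(P, R):
    """Return q[k][i] = sum_j P[k][i][j] * R[k][i][j]."""
    return [[sum(p * r for p, r in zip(P[k][i], R[k][i]))
             for i in range(len(P[k]))] for k in range(len(P))]


def recurrence_step(P, q, v_prev):
    """One step of v_i(m+1) = max_k [q_i^k + sum_j P_ij^k v_j(m)].

    Returns (v_next, d_next); d_next holds 1-based strategy numbers.
    """
    n_states = len(v_prev)
    v_next, d_next = [], []
    for i in range(n_states):
        totals = [q[k][i] + sum(P[k][i][j] * v_prev[j] for j in range(n_states))
                  for k in range(len(P))]
        best = max(range(len(P)), key=lambda k: totals[k])
        v_next.append(totals[best])
        d_next.append(best + 1)
    return v_next, d_next


def iterate_until_stable(P, R, max_steps=100):
    """Repeat the recurrence from v(0) = 0 (an assumption) until d(m) == d(m-1)."""
    q = expected_one_step_income(P, R)
    v = [0.0] * len(P[0])
    history = []                     # entries are (m, v(m), d(m))
    for m in range(1, max_steps + 1):
        v, d = recurrence_step(P, q, v)
        history.append((m, v, d))
        if len(history) > 1 and history[-2][2] == d:
            break
    return history


history = iterate_until_stable(P, R)
for m, v, d in history:
    print(m, [round(x, 3) for x in v], d)
```

With these data the sketch gives $d(1) = (1, 2)$ and $d(2) = d(3) = (2, 2)$, so the loop stops at $m = 3$ under the stated stopping rule. Agreement of two consecutive vectors is only the stopping criterion named in the text; it does not by itself prove long-run optimality.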

Optimal strategies of controlled Markov processes

Now let us consider a method for finding the optimal strategies using the asymptotic formula

$$v_i(m) = mg + v_i, \quad i = 1, 2, \ldots, N.$$

Since

$$v_i(m) = q_i + \sum_{j=1}^{N} P_{ij}\, v_j(m-1), \quad i = \overline{1,N}, \quad m = 1, 2, \ldots,$$

we have

$$mg + v_i = q_i + \sum_{j=1}^{N} P_{ij}\, v_j(m-1),$$

or

$$mg + v_i = q_i + \sum_{j=1}^{N} P_{ij}\left[(m-1)g + v_j\right].$$

From this,

$$mg + v_i = q_i + (m-1)g \sum_{j=1}^{N} P_{ij} + \sum_{j=1}^{N} P_{ij}\, v_j,$$

and, taking into account that $\sum_{j=1}^{N} P_{ij} = 1$, we obtain the relation

$$g + v_i = q_i + \sum_{j=1}^{N} P_{ij}\, v_j, \quad i = \overline{1,N}.$$

Thus, we have a system of $N$ equations in the $N+1$ unknowns $g, v_1, v_2, \ldots, v_N$. To determine the optimal strategy it is not necessary to know the absolute values of the variables $v_j$, $j = \overline{1,N}$; it suffices to require that the differences $v_j - v_i$ take constant values. This is not difficult to achieve: it is enough to add to the last system, for example, the equation $v_N = 0$, since $v_i = 0$ may be assumed for any chosen $i$.

If the last system is solved together with the equation $v_N = 0$, the asymptotic values $g^0, v_1^0, v_2^0, \ldots, v_N^0$ of the quantities $v_j(m)$ are found.
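Once a strategy is fixed, the system with the added equation $v_N = 0$ is linear and can be solved directly. The following Python sketch illustrates this under stated assumptions: `solve_linear` and `evaluate_strategy` are hypothetical helper names, the solver is plain Gaussian elimination, and the sample input is the strategy $d = (2, 2)$ of the article's example, that is, the rows of $P^2$ together with $q = (4.2, -2.9)$.

```python
def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, A[r])) + [float(b[r])] for r in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[r][n] / M[r][r] for r in range(n)]


def evaluate_strategy(P, q):
    """Solve g + v_i = q_i + sum_j P_ij v_j (i = 1..N) together with v_N = 0.

    P and q describe one fixed strategy. The unknowns are ordered
    (g, v_1, ..., v_{N-1}); the function returns (g, [v_1, ..., v_N]).
    """
    n = len(q)
    A = []
    for i in range(n):
        row = [1.0]                                   # coefficient of g
        for j in range(n - 1):                        # v_N is fixed at 0
            row.append((1.0 if i == j else 0.0) - P[i][j])
        A.append(row)
    x = solve_linear(A, q)
    return x[0], x[1:] + [0.0]


# Strategy d = (2, 2) of the example: both rows come from P^2.
g0, v0 = evaluate_strategy([[0.8, 0.2], [0.7, 0.3]], [4.2, -2.9])
print(round(g0, 4), [round(x, 4) for x in v0])
```

For this input the two equations reduce to $g + 0.2\,v_1 = 4.2$ and $g - 0.7\,v_1 = -2.9$, giving $v_1^0 = 7.1/0.9 \approx 7.889$ and $g^0 \approx 2.622$.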

For fixed strategies, $g$ can be found without solving the last system, using the formula

$$g = \sum_{i=1}^{N} P_i\, q_i,$$

where $P_i$ are the limiting (stationary) state probabilities. In that case, however, we obtain no information for finding the optimal strategy. Using the solution $g^0, v_1^0, v_2^0, \ldots, v_N^0$ that has been found, various strategies can be evaluated.
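As an illustration of the formula for $g$ under a fixed strategy, the sketch below computes the average income of each of the four stationary strategies of the article's two-state example. It rests on assumptions: $P_i$ is read as the stationary probability of state $E_i$, the stationary distribution is approximated by repeated multiplication (adequate here because every chain in the example is ergodic), and the names `stationary` and `gains` are hypothetical. Enumerating all strategies is feasible only for very small problems.

```python
from itertools import product

# Transition rows and one-step incomes q_i^k of the example (keys = strategy number).
P = {1: [[0.5, 0.5], [0.4, 0.6]], 2: [[0.8, 0.2], [0.7, 0.3]]}
q = {1: [6.0, -3.6], 2: [4.2, -2.9]}


def stationary(Pd, iters=200):
    """Approximate the stationary distribution by iterating pi <- pi * Pd."""
    n = len(Pd)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * Pd[i][j] for i in range(n)) for j in range(n)]
    return pi


gains = {}
for d in product((1, 2), repeat=2):          # d = (d_1, d_2)
    Pd = [P[d[i]][i] for i in range(2)]      # row i comes from strategy d_i
    qd = [q[d[i]][i] for i in range(2)]
    pi = stationary(Pd)
    gains[d] = sum(p * x for p, x in zip(pi, qd))   # g = sum_i P_i q_i

best = max(gains, key=gains.get)
for d, g in sorted(gains.items()):
    print(d, round(g, 4))
print("largest average income:", best)
```

For these data the strategy $(2, 2)$ has the largest average income, about 2.622 per step. The comparison is done by brute force, which reflects the remark in the text that the formula by itself gives no guidance on how to improve a strategy.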

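Example:

Let a system described by a Markov process have two states $E_1$ and $E_2$ and two strategies $k = \overline{1,2}$. Let the transition probability matrices and the income matrices corresponding to the strategies be as follows:

$$P^{1} = \begin{pmatrix} 0.5 & 0.5 \\ 0.4 & 0.6 \end{pmatrix}, \quad R^{1} = \begin{pmatrix} 9 & 3 \\ 3 & -8 \end{pmatrix}, \quad P^{2} = \begin{pmatrix} 0.8 & 0.2 \\ 0.7 & 0.3 \end{pmatrix}, \quad R^{2} = \begin{pmatrix} 4 & 5 \\ 1 & -12 \end{pmatrix}.$$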

Carrying out the calculations for $m = 1$ and $m = 2$ according to formulas (16) and (17) given above, we fill in Tables 1 and 2, respectively.

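$m = 1$

Table 1 (here $f_i(m)$ denotes the expected income $v_i(m)$)

| $i$ | $k$ | $p_{i1}^{(k)}$ | $p_{i2}^{(k)}$ | $r_{i1}^{(k)}$ | $r_{i2}^{(k)}$ | $q_i^{(k)}$ | $f_i(1)$ | $d_i(1)$ |
|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 0.5 | 0.5 | 9 | 3 | 6 | 6 | 1 |
| 1 | 2 | 0.8 | 0.2 | 4 | 5 | 4.2 | | |
| 2 | 1 | 0.4 | 0.6 | 3 | -8 | -3.6 | -2.9 | 2 |
| 2 | 2 | 0.7 | 0.3 | 1 | -12 | -2.9 | | |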

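$m = 2$

Table 2

| $i$ | $k$ | $p_{i1}^{(k)}$ | $p_{i2}^{(k)}$ | $p_{i1}^{(k)} f_1(1)$ | $p_{i2}^{(k)} f_2(1)$ | $q_i^{(k)}$ | $F_i(1,k)$ | $f_i(2)$ | $d_i(2)$ |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 0.5 | 0.5 | 3 | -1.45 | 6 | 7.55 | 8.42 | 2 |
| 1 | 2 | 0.8 | 0.2 | 4.8 | -0.58 | 4.2 | 8.42 | | |
| 2 | 1 | 0.4 | 0.6 | 2.4 | -1.74 | -3.6 | -2.94 | 0.43 | 2 |
| 2 | 2 | 0.7 | 0.3 | 4.2 | -0.87 | -2.9 | 0.43 | | |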
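As a check on how the columns of Table 2 combine, the entries for state $E_1$ can be worked out by hand from the values $f_1(1) = 6$ and $f_2(1) = -2.9$ obtained at $m = 1$. Here $F_i(1,k)$ is taken to denote the bracketed quantity $q_i^{(k)} + \sum_j p_{ij}^{(k)} f_j(1)$; this reading of the column header is an assumption based on the numbers in the table.

```latex
\begin{aligned}
F_1(1,1) &= q_1^{(1)} + p_{11}^{(1)} f_1(1) + p_{12}^{(1)} f_2(1)
          = 6 + 0.5 \cdot 6 + 0.5 \cdot (-2.9) = 7.55, \\
F_1(1,2) &= q_1^{(2)} + p_{11}^{(2)} f_1(1) + p_{12}^{(2)} f_2(1)
          = 4.2 + 0.8 \cdot 6 + 0.2 \cdot (-2.9) = 8.42, \\
f_1(2)   &= \max\{7.55,\; 8.42\} = 8.42, \qquad d_1(2) = 2 .
\end{aligned}
```

The same steps for state $E_2$ give $-2.94$ and $0.43$, so $f_2(2) = 0.43$ and $d_2(2) = 2$.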

It can be seen from the tables that the optimal strategy for $m = 1$ is the vector $d(1) = (1, 2)$, and the optimal strategy for $m = 2$ is the vector $d(2) = (2, 2)$.
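The asymptotic system can also be used to check whether $d(2) = (2, 2)$ remains the best choice in the long run. The sketch below applies Howard's strategy (policy) iteration, which the article mentions in its literature review but does not carry out in the example, so it is offered as an illustration only. The assumptions are: the closed-form evaluation is specific to two states with $v_2 = 0$, the starting strategy $(1, 1)$ is arbitrary, and the names `evaluate_two_state` and `improve` are hypothetical.

```python
# Transition rows and one-step incomes q_i^k of the example (keys = strategy number).
P = {1: [[0.5, 0.5], [0.4, 0.6]], 2: [[0.8, 0.2], [0.7, 0.3]]}
q = {1: [6.0, -3.6], 2: [4.2, -2.9]}


def evaluate_two_state(d):
    """Solve g + v_i = q_i + sum_j P_ij v_j with v_2 = 0 for strategy d.

    Subtracting the two equations gives v_1 = (q_1 - q_2) / (1 - P_11 + P_21),
    and the second equation then gives g = q_2 + P_21 * v_1.
    """
    p11 = P[d[0]][0][0]
    p21 = P[d[1]][1][0]
    q1, q2 = q[d[0]][0], q[d[1]][1]
    v1 = (q1 - q2) / (1.0 - p11 + p21)
    g = q2 + p21 * v1
    return g, [v1, 0.0]


def improve(v):
    """In each state pick the strategy maximising q_i^k + sum_j P_ij^k v_j."""
    new_d = []
    for i in range(2):
        test = {k: q[k][i] + sum(P[k][i][j] * v[j] for j in range(2)) for k in P}
        new_d.append(max(test, key=test.get))
    return tuple(new_d)


d = (1, 1)                       # arbitrary starting strategy (an assumption)
trace = []                       # (strategy, average income g) per iteration
for _ in range(10):              # small bound as a safeguard against cycling
    g, v = evaluate_two_state(d)
    trace.append((d, g))
    new_d = improve(v)
    if new_d == d:
        break
    d = new_d

for strategy, gain in trace:
    print(strategy, round(gain, 4))
```

Starting from $(1, 1)$, with an average income of about 0.667, one improvement step leads to $(2, 2)$ with an average income of about 2.622, and a further step leaves the strategy unchanged. This agrees with the vector $d(2)$ read from Table 2.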

Conclusion

This research studied the problems of determining recurrent algorithms and optimal strategies for controlled Markov chains. Through theoretical analysis, mathematical modeling and practical calculations, it considered modeling the states of a system by a Markov process, calculating transition probabilities and incomes, and finding the strategies that provide the maximum average income. Long-term optimal strategies were determined using asymptotic formulas. Calculations were carried out for an example with two states and two strategies, and the optimal strategies were confirmed through tables. The results show the practical importance of controlled Markov processes in industrial planning, economic analysis and other fields. Further research on the application of these algorithms in big data and artificial intelligence is recommended.

REFERENCES

1. Bellman, R. (1957). Dynamic Programming. Princeton University Press.

2. Howard, R. A. (1960). Dynamic Programming and Markov Processes. MIT Press.

3. Bertsekas, D. P. (1995). Dynamic Programming and Optimal Control. Athena Scientific.

4. Filar, J., & Vrieze, K. (1997). Competitive Markov Decision Processes. Springer.

5. Altman, E. (1999). Constrained Markov Decision Processes. Chapman and Hall/CRC.

6. Ikramov, A., & Shirinov, T. (2015). "Controlled random processes in industrial systems." Uzbekistan Mathematical Journal, 3(2), 45–52.

7. Markov, A. A. (1906). "Extension of the law of large numbers to quantities depending on each other." Izvestiya Fiziko-Matematicheskogo Obshchestva pri Kazanskom Universitete, 15(4), 135–156.
