Predictive or prognostic factors for breast cancer

Dilnoza Mukhamedieva; Mansur Khamraev

doi:10.47689/2181-1415-vol6-iss4/S-pp72-79


Жамият ва инновациялар – Общество и инновации – Society and innovations
Journal home page: https://inscience.uz/index.php/socinov/index

Dilnoza MUKHAMEDIEVA 1, Mansur KHAMRAEV 2

1 National Research University-Tashkent Institute of Irrigation and Agricultural Mechanization Engineers
2 Tashkent University of Information Technologies named after Muhammad al-Khwarizmi

ARTICLE INFO

Article history:
Received March 2025
Received in revised form 15 April 2025
Accepted 25 April 2025
Available online 25 May 2025

ABSTRACT

Breast cancer is one of the most common and life-threatening diseases among women
worldwide. Early diagnosis plays a crucial role in increasing survival rates and improving
treatment effectiveness. This article presents the key factors of prediction and analysis
that are essential for developing algorithms and software tools for the early detection of
breast cancer using artificial intelligence (AI) and machine learning (ML) techniques.

2181-1415/© 2025 in Science LLC.
DOI: https://doi.org/10.47689/2181-1415-vol6-iss4/S-pp72-79
This is an open access article under the Attribution 4.0 International
(CC BY 4.0) license (https://creativecommons.org/licenses/by/4.0/deed.ru)

Keywords: Breast cancer detection, Artificial Intelligence, Machine Learning, Deep
Learning, Medical Imaging, Convolutional Neural Networks, Support Vector Machines,
Random Forest, K-Nearest Neighbors, Artificial Neural Networks, Recurrent Neural
Networks, Feature Extraction, Early Diagnosis.

1 Professor, National Research University-Tashkent Institute of Irrigation and Agricultural Mechanization Engineers. E-mail: d.muhamediyeva@tiiame.uz

2 Student, Tashkent University of Information Technologies named after Muhammad al-Khwarizmi. E-mail: mxamrayev888@gmail.com

Special Issue – 04 (2025) / ISSN 2181-1415



INTRODUCTION

Breast cancer detection at an early stage significantly increases the chances of
successful treatment. Traditional diagnostic methods, such as mammography and biopsy,
have limitations in terms of accuracy and accessibility. With the rapid advancements in
AI, computational techniques can assist healthcare professionals by providing reliable
and automated diagnostic support.

METHODOLOGY

Biomarkers in breast cancer
Tissue biomarkers have gained importance in personalized medicine, aiding in
disease diagnosis, prognosis prediction, and selection of patients who may derive specific
therapeutic benefits. Breast cancer management involves key tissue biomarkers,
including estrogen receptor (ER), progesterone receptor (PgR), human epidermal growth
factor receptor 2 (HER2), and Ki-67. Ongoing research has investigated novel biomarkers,
such as programmed death ligand-1 (PD-L1) and tumor-infiltrating lymphocytes (TIL).
Despite the importance of biomarker assessment, several studies have demonstrated
intra- and inter-laboratory variability in the assessment of ER, PgR, HER2, and Ki-67.
This variability could influence treatment decisions regarding hormonal and anti-HER2
targeted therapies.

AI in biomarkers of breast cancer
Assessing hormone receptor (HR) status via immunohistochemistry (IHC) can help
identify patients who are likely to benefit from endocrine therapies, such as tamoxifen.
Samples with at least 1% ER- or PgR-positive tumor nuclei are deemed positive, and
expression can be quantified by reporting the percentage of positive cells or by using
scoring systems, such as the Allred score or H-score.
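
The scoring described above can be sketched in a few lines. The article does not spell out the H-score formula; the sketch assumes the conventional definition, in which the percentages of weakly, moderately, and strongly stained tumor nuclei are weighted by 1, 2, and 3, giving a range of 0 to 300. The function names and the idea of passing pre-computed percentages are illustrative assumptions, not part of any cited DIA tool.

```python
def h_score(pct_weak: float, pct_moderate: float, pct_strong: float) -> float:
    """Conventional H-score: 1*weak% + 2*moderate% + 3*strong% (range 0-300).

    Inputs are percentages of tumor nuclei at each staining intensity.
    """
    for pct in (pct_weak, pct_moderate, pct_strong):
        if not 0 <= pct <= 100:
            raise ValueError("each percentage must lie between 0 and 100")
    if pct_weak + pct_moderate + pct_strong > 100 + 1e-9:
        raise ValueError("stained percentages cannot sum to more than 100")
    return 1 * pct_weak + 2 * pct_moderate + 3 * pct_strong


def is_hr_positive(pct_positive_nuclei: float, threshold_pct: float = 1.0) -> bool:
    """Apply the 'at least 1% positive tumor nuclei' rule from the text."""
    return pct_positive_nuclei >= threshold_pct


if __name__ == "__main__":
    # Hypothetical slide: 10% weak, 20% moderate, 30% strong staining.
    print(h_score(10, 20, 30))
    print(is_hr_positive(10 + 20 + 30))
```

A DIA pipeline would supply the per-intensity percentages automatically; the arithmetic that follows is the same as in manual scoring.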

Several studies have explored automated quantitative digital imaging analysis (DIA)
techniques. Manual interpretation of IHC staining is inherently subjective and
time-consuming, yet studies have shown a strong correlation between manual and
DIA-based scoring of ER and PgR IHC staining in breast cancer. Notably, DIA has
demonstrated improved reproducibility compared with manual scoring methods.
Moreover, efforts to integrate DIA algorithms into routine digital pathology workflows
and to ensure the robustness of AI models have reported promising results. Additionally,
some AI models show promise in predicting ER and PgR status using only hematoxylin
and eosin (H&E)-stained slides, which would eliminate the need for specific IHC staining.
Notably, a deep learning (DL) model based on ShuffleNet was developed to predict
molecular alterations and biomarkers in various solid tumor types, including breast
cancer.

HER2 status, determined by IHC with or without in situ hybridization (ISH), is
essential for identifying candidates for anti-HER2 therapies, such as trastuzumab. Levels
of HER2 are classified based on the proportion and intensity of stained tumor cells. In an
effort to quantify HER2 IHC slides, a study reported an overall agreement of 92.3%
between software-based analysis and pathologist assessment by evaluating cell
membrane connectivity. Another study demonstrated the potential of an AI-powered
HER2 analyzer to mitigate interobserver variability and aid pathologists in achieving a
consistent evaluation of HER2 expression levels. Furthermore, several studies have used
AI models to predict the amplification state of HER2 by analyzing digitized fluorescence
ISH (FISH) images.

Subsequently, several AI models have been developed to predict HER2 status using only
H&E-stained slides, including those submitted to the HEROHE challenge
[119,121,122,129,130]. In this challenge, 21 international teams presented their AI
models, and the best-performing model achieved an F1 score of 0.68. It is important to
note that the choice of evaluation metric may influence how model performance is
judged. Beyond simply predicting the HER2 status, some studies have shown associations
between AI model outputs and therapeutic responses. Farahmand et al. developed a
convolutional neural network (CNN) classifier that achieved an area under the receiver
operating characteristic curve (AUC) of 0.80 in predicting the response to trastuzumab
therapy based on HER2 status. In another application, AI-based analysis showed that
patients who achieved a pathologic complete response (pCR) after neoadjuvant
chemotherapy (NAC) with anti-HER2 agents had a notably higher proportion of tumor
cells with intense HER2 staining. This insight suggests that AI models may be pivotal in
providing nuanced information for predicting responses in patients with HER2-positive
early breast cancer undergoing NAC.
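
The remark about evaluation metrics can be made concrete with a small sketch. It computes accuracy and the F1 score from binary labels and shows two hypothetical classifiers with identical accuracy but different F1. The data are invented for illustration and have no connection to the challenge results; the function names are likewise assumptions.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(y_true, y_pred):
    """F1 = 2TP / (2TP + FP + FN); defined as 0.0 when there are no positives at all."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical cohort: 2 HER2-positive cases out of 10.
    y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
    always_negative = [0] * 10                        # never calls a positive
    one_hit_one_miss = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
    for name, pred in [("always negative", always_negative),
                       ("one hit, one miss", one_hit_one_miss)]:
        print(name, accuracy(y_true, pred), f1_score(y_true, pred))
```

Both classifiers are correct on 8 of 10 cases, yet only the second finds any positive case, which F1 rewards and accuracy hides.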

Furthermore, quantitative assessment of HER2 IHC using AI algorithms is not
limited to breast cancer. It has demonstrated promise in reducing interobserver
variability and forecasting prognosis or treatment outcomes in other cancer types,
including urothelial carcinoma, biliary tract cancer, and colorectal cancer.

Although a recent phase III clinical trial demonstrated a consistent treatment benefit
of cyclin-dependent kinase 4 and 6 inhibitors regardless of the Ki-67 index, Ki-67 still
serves as a cell proliferation marker and as a prognostic and predictive biomarker for
breast cancer. However, the reproducibility of Ki-67 assessment remains a longstanding
challenge.

Like the other biomarkers, Ki-67 is evaluated using IHC, and several DIA platforms
have shown promising results. A comparative study revealed excellent reproducibility
among three DIA platforms and reference standards, with the platforms demonstrating
indistinguishable capabilities for predicting patient outcomes in breast cancer.
Furthermore, another study revealed that incorporating AI support in the evaluation of
Ki-67, ER, and PgR expression led to a slight improvement in pathologist agreement,
with 95.8% of the AI analysis results for Ki-67 confirmed by most of the pathologists.
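
Agreement between readers, as discussed above, is usually reported with a chance-corrected statistic. The article does not say which statistic the cited studies used; the sketch below assumes Cohen's kappa for two raters as one common choice. The rater labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical labels to the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: fraction of cases with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a.keys() | freq_b.keys()) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical Ki-67 calls ("high"/"low") by two pathologists on four slides.
    path_1 = ["high", "high", "low", "low"]
    path_2 = ["high", "low", "low", "low"]
    print(cohens_kappa(path_1, path_2))
```

Computing the statistic with and without AI assistance on the same slides is one way to quantify the kind of improvement in agreement these studies describe.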

Recently, immunotherapy combined with chemotherapy has demonstrated
efficacy in specific breast cancer subsets, necessitating the use of predictive biomarkers
such as PD-L1. The KEYNOTE-355 trial validated PD-L1 IHC expression in this role,
revealing improved survival outcomes in patients with metastatic triple-negative breast
cancer (TNBC) exhibiting a PD-L1 combined positive score ≥ 10.
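
For readers unfamiliar with the combined positive score (CPS), a minimal sketch follows. It assumes the standard definition, which the article does not state: the number of PD-L1-staining tumor cells, lymphocytes, and macrophages divided by the number of viable tumor cells, multiplied by 100 and capped at 100. The cell counts and function names are hypothetical.

```python
def combined_positive_score(pos_tumor_cells: int,
                            pos_lymphocytes: int,
                            pos_macrophages: int,
                            viable_tumor_cells: int) -> float:
    """CPS = 100 * (PD-L1+ tumor cells + lymphocytes + macrophages) / viable tumor cells."""
    if viable_tumor_cells <= 0:
        raise ValueError("viable_tumor_cells must be positive")
    raw = 100.0 * (pos_tumor_cells + pos_lymphocytes + pos_macrophages) / viable_tumor_cells
    return min(raw, 100.0)  # the score is conventionally capped at 100


def meets_cps_cutoff(cps: float, cutoff: float = 10.0) -> bool:
    """Check the 'CPS >= 10' threshold mentioned in the text."""
    return cps >= cutoff


if __name__ == "__main__":
    # Hypothetical counts from one region of a whole slide image.
    cps = combined_positive_score(5, 4, 3, 100)
    print(cps, meets_cps_cutoff(cps))
```

Because immune cells count toward the numerator but not the denominator, the raw ratio can exceed 100, hence the cap.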

In terms of applying AI, a study utilizing a digital pathology platform for PD-L1
scoring in breast cancer showed that an AI algorithm could predict chemotherapy
outcomes in patients with TNBC, as well as in those with HER2-positive and ER-negative
breast cancer. The potential utility of AI as an aid to pathologists was highlighted in a
multi-institutional ring study, which showed that agreement among pathologists
assessing PD-L1 expression levels could be improved with AI assistance. Moreover, a DL
model was able to predict PD-L1 status from H&E-stained images, indicating the
potential role of AI in clinical practice for decision support and quality assurance. AI can
enhance patient management strategies by identifying cases susceptible to
misinterpretation [144]. A representative example of the application of AI in PD-L1
assessment is shown in Figure 1A.

Figure 1. Example of artificial intelligence application in whole slide images.

The significance of TILs within the tumor microenvironment (TME) continues to
increase because of their correlation with improved prognosis and their predictive value
for chemotherapy and immunotherapy responses in breast cancer. However, the
concordance rate for manual TIL assessment among pathologists remains suboptimal.

Several computational approaches have been suggested to address interobserver
variability, including the recommendation of the International Immuno-Oncology
Working Groups to incorporate a computational approach in TIL assessment.
Additionally, one AI model proposed novel immunogradient indicators by computing TIL
density profiles across the tumor-stroma interface zone, demonstrating robust
prognostic stratification that outperforms traditional clinical and pathologic variables.
Another AI model quantified stromal TIL scores and provided valuable assistance to
pathologists, particularly when discordant interpretations arose. This model enhanced
the concordance rate among the pathologists. Furthermore, the prediction of NAC
response in patients with TNBC and HER2-positive breast cancer has been enhanced
with the assistance of AI. Using the same AI model, the density of TIL was spatially
analyzed to determine the immune phenotype (IP). One study revealed varying TIL
densities and IPs across different molecular subtypes of breast cancer, suggesting a
distinct immune landscape for each subtype. A representative example of the application
of AI to the spatial analysis of TIL is shown in Figure 1B. An additional AI model has
proposed digital stromal TILs and digital tumor-associated stroma scores, based on the
spatial relationships among TME components, showing prognostic significance in
predicting disease-specific survival in patients with TNBC.

Beyond breast cancer, AI-powered TIL spatial evaluations are gaining traction in
colorectal cancer, with promising implications for anti-HER2 therapy response
prediction. This AI algorithm also enables the assessment of macrophage and fibroblast
cell densities within the TME, potentially forecasting anti-HER2 therapy outcomes.
A pan-carcinoma investigation revealed diminished intratumoral and stromal TIL
densities in HER2-amplified tumors, as assessed using an AI model, suggesting a
correlation between HER2 amplification and low immune infiltration.

AI for breast cancer risk stratification and genetic alteration prediction
Mammographic density, measured using the Breast Imaging Reporting and Data
System (BI-RADS) category, has been investigated extensively, and breast density has
been found to be a strong risk factor for breast cancer. Consequently, new breast
screening strategies, such as those explored in the Dense Tissue and Early Breast
Neoplasm Screening trials, now consider a woman’s breast density to evaluate her risk.
However, the current standard for measuring breast density relies heavily on the
subjective judgment of radiologists, which leads to significant inter-reader variability.
This highlights the need for more objective and standardized approaches for assessing
breast density to enhance screening accuracy and consistency.

Objective and consistent density measurements are crucial for individual risk
stratification, leading to the development of automated assessment tools, such as
Volpara, which calculates the volumetric breast density percentage of each mammogram
on a continuous scale. An alternative is to develop AI density models trained using
labeled data provided by radiologists. These AI models can provide automated and
standardized breast density measurements, which are used not only to assess the risk of
developing breast cancer but also as predictive surrogate markers for therapy response
in high-risk patients. Further research is necessary to determine the most suitable
assessment tool and how to effectively integrate this information into routine clinical
practice.

Traditional risk prediction models, such as the Tyrer-Cuzick model, also consider
breast density as one of the risk factors. AI models have been incorporated to enhance
the existing breast cancer prediction models. A recent study by Arasu et al. demonstrated
that multiple AI models significantly outperformed the Breast Cancer Surveillance
Consortium (BCSC) risk model in predicting five-year breast cancer risk (AUC, 0.63–0.67
for AI models vs. AUC, 0.61 for BCSC model).
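
To help interpret AUC values such as those above, the following sketch computes the area under the ROC curve through its rank-based interpretation: the probability that a randomly chosen case receives a higher risk score than a randomly chosen non-case, with ties counted as one half. The labels and scores are hypothetical and are not taken from the cited study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for women who developed cancer, 0 otherwise.
    scores: model risk scores; higher means higher predicted risk.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical five-year outcomes and risk scores for four women.
    labels = [1, 1, 0, 0]
    scores = [0.9, 0.4, 0.5, 0.1]
    print(roc_auc(labels, scores))
```

On this reading, an AUC of 0.5 means the model ranks cases and non-cases no better than chance, and 1.0 means it ranks every pair correctly. The pairwise loop is quadratic in cohort size and is meant only to expose the definition.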

Additionally, AI algorithms can be trained not only on human-extracted features
but also on breast parenchymal patterns that may not be discernible to the human eye.
Kim et al. developed a model that utilizes Imaging Biomarkers in mammography (MMG),
which are parenchymal patterns observed in high-risk individuals. This model can
accurately predict cancer occurrence, even when trained solely with the unaffected
breasts of patients with cancer. These models enable accurate short- and long-term risk
predictions using MMGs from a single time point. Another example is the ML model
called Mirai, which performed better than previous DL models in identifying both
five-year breast cancer risk and high-risk patients across diverse populations. There is
also ProFound AI, an AI-based computer-aided detection (AI-CAD) concurrent-read
predictive model for digital breast tomosynthesis (DBT) cases, which helps reduce
workload and reading time while enhancing radiologists’ cancer detection performance.
These models may help determine screening methods and frequencies for each
individual.

The potential for the direct prediction of genetic alterations using AI models has
been suggested, akin to the prediction of HER2 FISH status using AI models. The
ShuffleNet-based DL algorithm consistently infers a wide range of genetic mutations,
molecular tumor subtypes, gene expression signatures, and pathology biomarkers from
H&E-stained slides across 14 of the most common solid tumor types, and detects
mutations such as PIK3CA and MAP2K4 in breast cancer. The ML model could predict
molecular features, including DNA methylation, gene expression, copy number
alterations, and somatic mutations. Additionally, AI models have been developed to
predict germline BRCA mutation status and chromosomal instability status, both of
which have prognostic value. Several studies have developed AI models to predict
Oncotype DX (ODX) risk scores, which offer both prognostic and predictive insights for
adjuvant systemic therapy; these models can classify ODX risk categories by quantifying
tubule nuclei and mitotic counts. Similarly, Cho et al. reported that an AI model could
classify the ODX risk score with a cutoff value of 25. The predicted high-risk groups
demonstrated significantly worse survival outcomes in patients with early-stage
HR-positive breast cancer, further underscoring the potential of AI for cancer
prognostication and management.

AI in predicting clinical outcomes and treatment response
AI has been used to monitor and assess the prognosis of breast cancer. AI
algorithms in conjunction with MRI scans have been employed to evaluate the
anticipated response to adjuvant and neoadjuvant treatments based on pretreatment
imaging. By analyzing imaging features and patterns, AI can assist in predicting
treatment responses and optimizing treatment strategies to improve patient outcomes.
Similar efforts use ultrasonography, where AI predicts the response to NAC and helps
forecast the overall breast cancer prognosis. Additionally, AI has emerged as a potential
tool for assessing the response to chemotherapy in post-treatment MRIs and predicting
recurrence risk. In the future, AI algorithms could analyze medical images, such as MRIs,
and provide quantitative assessments and predictions that could assist radiologists and
oncologists in their decision-making processes.

Turning to pathology, the wealth of information extracted from pathological slides
is a gold mine for predicting treatment responses and broader clinical outcomes. For
example, an AI algorithm proposes a novel recurrence score (RS) with the potential to
serve as a viable alternative to the more expensive 21-gene assays. This model analyzed
different aspects of the cancer and surrounding tissues as well as the density of TILs and
could help predict which high-risk patients would benefit from adjuvant chemotherapy.
This suggests that the RS from the AI model may serve as a predictive biomarker for
adjuvant chemotherapy responses. In a comparative study of ML models utilizing clinical
and pathological data, the random forest model demonstrated the highest performance,
with an AUC of 0.88, for predicting pCR following NAC in patients with locally advanced
or high-risk early breast cancer. Recently, a CNN-based model trained on H&E-stained
whole slide images (WSIs) from core biopsies of TNBC patients after NAC was reported
to have a positive predictive value of 73.7% for pCR. Huang et al. developed an AI-based
automatic WSI feature extraction pipeline, named IMPRESS, using WSIs stained with
both H&E and multiplex IHC (PD-L1, CD8+, and CD163+). ML models using features
from IMPRESS and clinical variables accurately predicted the NAC response in patients
with HER2+ breast cancer or TNBC, surpassing a model trained with manually generated
pathological features, suggesting that it may be a preferred method for developing
algorithms to predict treatment responses in the future. Upon external validation, these
models produced promising results for the HER2+ subtype but not for TNBC (AUC = 0.90
for HER2+, and 0.59 for TNBC). Furthermore, a multi-omics ML model, trained on a
combination of clinical, DNA, RNA, digital pathology, and treatment features, showed an
AUC of 0.87 in predicting pCR following NAC, with or without HER2-targeted therapy.
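
When reading a positive predictive value such as the one quoted above, it helps to remember that PPV depends on how common the outcome is in the cohort, not only on the model. The sketch below applies Bayes' rule to show this. The sensitivity, specificity, and pCR rates are hypothetical values chosen for illustration and are not taken from any of the cited studies.

```python
def positive_predictive_value(sensitivity: float,
                              specificity: float,
                              prevalence: float) -> float:
    """PPV = P(outcome | positive prediction), via Bayes' rule."""
    for value in (sensitivity, specificity, prevalence):
        if not 0 <= value <= 1:
            raise ValueError("inputs must be proportions between 0 and 1")
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    if true_pos + false_pos == 0:
        raise ValueError("the model never predicts positive; PPV is undefined")
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Same hypothetical model (80% sensitivity, 80% specificity)
    # applied to cohorts with different pCR rates.
    for pcr_rate in (0.5, 0.2):
        print(pcr_rate, round(positive_predictive_value(0.8, 0.8, pcr_rate), 3))
```

The same model yields a lower PPV in a cohort where pCR is rarer, which is one reason results may shift on external validation.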

REFERENCES:

1. Giaquinto AN, Sung H, Miller KD, Kramer JL, Newman LA, Minihan A, et al. Breast cancer statistics, 2022. CA Cancer J Clin. 2022.

2. Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2021.

3. Taylor C, McGale P, Probert J, Broggio J, Charman J, Darby SC, et al. Breast cancer mortality in 500 000 women with early invasive breast cancer diagnosed in England, 1993–2015: population based observational cohort study. BMJ. 2023.

4. Marmot MG, Altman DG, Cameron DA, Dewar JA, Thompson SG, Wilcox M. The benefits and harms of breast cancer screening: an independent review. Br J Cancer. 2013.

5. The Royal College of Radiologists. RCR Clinical Radiology Workforce Census 2022. London: The Royal College of Radiologists; 2022.
