THE AMERICAN JOURNAL OF ENGINEERING AND TECHNOLOGY (ISSN 2689-0984)
VOLUME 06, ISSUE 12
https://www.theamericanjournals.com/index.php/tajet
PUBLISHED DATE: 25-12-2024
https://doi.org/10.37547/tajet/Volume06Issue12-15
PAGE NO.: 163-177
OPTIMIZING REAL-TIME DYNAMIC PRICING
STRATEGIES IN RETAIL AND E-COMMERCE
USING MACHINE LEARNING MODELS
Pritom Das
College of Computer Science, Pacific States University, Los Angeles, CA, USA
Tamanna Pervin
Department of Business Administration, International American University, Los Angeles, California, USA
Biswanath Bhattacharjee
Department of Management Science and Quantitative Methods, Gannon University, USA
Md Razaul Karim
Department of Information Technology & Computer Science, University of the Potomac, USA
Nasrin Sultana
Department of Strategic Communication, Gannon University, USA
Md. Sayham Khan
Department of Information Technology & Computer Science, University of the Potomac, USA
Md Afjal Hosien
School of Information Technology, Washington University of Science & Technology, USA
FNU Kamruzzaman
Department of Information Technology Project Management & Business Analytics, St. Francis College, USA
RESEARCH ARTICLE
Open Access
INTRODUCTION
Dynamic pricing has become a crucial strategy in
retail and e-commerce, where businesses aim to
optimize prices in real-time to maximize profits,
improve customer satisfaction, and maintain
competitiveness. The ability to adjust prices
dynamically depends on factors such as market
demand, competitor pricing, product availability,
and customer preferences (McKinsey & Company,
2020). Traditional static pricing models often fail
to capture these dynamic interactions, leading to
suboptimal business outcomes (Zhang et al., 2018).
In recent years, machine learning (ML) techniques
have emerged as powerful tools for dynamic
pricing strategies, offering the ability to analyze
large datasets, detect patterns, and make accurate
predictions. Machine learning models such as
Linear Regression, Random Forest, and Gradient
Boosting Machines (GBM) have been increasingly
applied in e-commerce and retail to forecast
optimal pricing strategies (Choi et al., 2019).
However, selecting the most suitable model that
balances computational efficiency and prediction
accuracy remains a challenge.
This study explores the application of supervised machine learning models (Linear Regression, Random Forest, and Gradient Boosting Machines) to real-time dynamic pricing strategies in the retail and e-commerce sectors. The primary objective is to evaluate the models based on metrics such as R-squared (R²), Mean Absolute Error (MAE), and Root Mean Square Error (RMSE) to determine their effectiveness in real-time dynamic pricing optimization. The study also simulates a controlled environment to test these models' real-world applicability, demonstrating their integration with e-commerce platforms.
The Concept of Dynamic Pricing
Dynamic pricing is the practice of adjusting prices
in real-time based on market conditions, consumer
behavior, and competitive factors (Kannan &
Kopalle, 2001). Dynamic pricing strategies have
been extensively studied in the context of e-
commerce and retail, with research highlighting its
importance in maximizing revenue and optimizing
customer satisfaction (Gal-Or, 1985). According to
Chen et al. (2001), dynamic pricing algorithms
incorporate demand elasticity, competitor prices,
and inventory levels to adjust prices effectively.
Recent studies have also emphasized the
significance of real-time adjustments based on
machine learning predictions to address demand
fluctuations and competitor actions (Kumar et al.,
2019).
Machine Learning in Dynamic Pricing
Strategies
Machine learning has transformed dynamic
pricing strategies by enabling businesses to
process and analyze large volumes of data
efficiently. Supervised learning models, particularly regression-based and ensemble
techniques, are often employed to forecast optimal
prices (Waller & Leigh, 2009). Linear Regression
remains one of the foundational techniques due to
its simplicity and interpretability, but it often
struggles to capture non-linear patterns in
complex data (Clements et al., 2004).
Ensemble methods like Random Forest and
Gradient Boosting Machines have proven more
robust in handling non-linear relationships.
Random Forest, a popular ensemble method,
reduces overfitting by aggregating multiple
decision trees (Lemke et al., 2019). Meanwhile,
Gradient Boosting Machines (GBM) offer high
predictive accuracy by iteratively fitting weak
learners (Friedman, 2001).
Research by Zhou et al. (2017) and Choi et al.
(2019) demonstrated that ensemble models
outperform simpler linear approaches in dynamic
pricing applications. Their studies showed that
models like GBM and Random Forest could capture
complex interactions among product features,
demand patterns, and competitor pricing more
effectively.
Challenges in Model Selection for Dynamic Pricing
Despite these advantages, selecting the right
machine learning model for dynamic pricing
remains challenging. Factors such as scalability,
computational efficiency, and interpretability play
a significant role in decision-making (McKinsey &
Company, 2020). Real-time deployment of these
models requires integration with e-commerce
platforms and cloud infrastructure, which adds
complexity to system architecture and data
processing (Yuan et al., 2019).
Furthermore, the performance of machine
learning models can be evaluated through various
metrics. The R-squared (R²) value measures the
proportion of variance explained by the model,
while Mean Absolute Error (MAE) and Root Mean
Square Error (RMSE) are standard error metrics
(Hyndman & Athanasopoulos, 2018). Accurate
assessment of these metrics is essential to
determine the practical viability of machine
learning models in real-world dynamic pricing
environments.
METHODOLOGY
To study the application of machine learning for
real-time dynamic pricing strategies in retail and e-
commerce, we adopted a comprehensive and
systematic approach encompassing dataset
acquisition, preprocessing, exploratory data
analysis, feature engineering, model selection, and
evaluation. Each stage was designed to ensure the
robustness and validity of our results in addressing
the complexities of dynamic pricing.
We utilized a publicly available dataset from
Kaggle, titled "Retail and E-Commerce Transactions Dataset," which contains detailed
transactional and product-related data. The
dataset encompasses historical transactions from
multiple e-commerce platforms and retail chains
worldwide, providing a rich resource for analyzing
pricing strategies.
This dataset includes 1.5 million rows and 20
features, covering three years of transactional data
(2020–2023). Key attributes include product
details, pricing history, customer information,
competitor pricing, inventory levels, and temporal
indicators such as seasonal events. A detailed
summary of the dataset is provided below:
| Feature Name | Description | Data Type | Example Values |
|---|---|---|---|
| Transaction ID | Unique identifier for each transaction | Categorical | T987654 |
| Product ID | Unique identifier for each product | Categorical | P876543 |
| Product Name | The name or description of the product | Categorical | Wireless Headphones |
| Product Category | The category of the product | Categorical | Electronics, Apparel |
| Historical Price | Previous product prices | Numeric | 59.99, 89.99 |
| Current Price | Product price at the time of the transaction | Numeric | 54.99, 84.99 |
| Competitor Price | Price of a similar product on competing platforms | Numeric | 55.49, 85.99 |
| Inventory Level | Stock availability of the product | Numeric | 150, 500 |
| Promotion Status | Indicates if a product is on promotion | Boolean | 0, 1 |
| Customer Demographics | Age, gender, income group of the customer | Categorical | Female, 35-44, $70K+ |
| Customer Region | Geographic region of the customer | Categorical | North America, Asia |
| Transaction Timestamp | Timestamp of the transaction | Timestamp | 2023-12-15 14:30:00 |
| Purchase Quantity | Number of units purchased | Numeric | 1, 3 |
| Total Revenue | Revenue generated from the transaction | Numeric | 54.99, 269.97 |
| Discount Applied | Amount or percentage of discount provided | Numeric | 5.00, 10% |
| Competitor Popularity | Average sales ranking of competing products | Numeric | 1, 2, 3 |
| Seasonal Indicator | Flags seasonal peaks like holidays or special events | Boolean | 0, 1 |
| Price Elasticity | Product demand sensitivity to price changes | Numeric | 0.7, 1.2 |
| Customer Loyalty | Flags if the customer is part of a loyalty program | Boolean | 0, 1 |
| Market Segment | Target segment of the product | Categorical | Premium, Economy |
DATA PREPROCESSING
The data preprocessing phase was a critical
component of our study, as it directly impacted the
accuracy and reliability of the machine learning
models. The dataset, sourced from Kaggle,
comprised raw transactional records, which
required extensive cleaning and transformation to
ensure its suitability for analysis. This section
details the comprehensive steps taken to prepare
the data.
The dataset contained missing values in several
key features, including competitor pricing,
inventory levels, and customer demographics.
These gaps were addressed using context-
appropriate imputation techniques. For numerical
features, such as competitor pricing and inventory
levels, we used median imputation to preserve the
central tendency of the data. In the case of
categorical variables, like customer regions and
product categories, missing values were filled
using mode imputation. For time-related gaps,
particularly in timestamps, interpolation methods
were applied by referencing adjacent records to
ensure continuity and consistency in transaction
timelines.
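The following is a minimal sketch of these imputation steps in pandas, assuming the data have been loaded into a DataFrame with the column names from the summary table above (the file name is hypothetical); the timestamp step uses a simple forward fill as a stand-in for the interpolation described.

```python
import pandas as pd

# Hypothetical file name; column names follow the dataset summary table.
df = pd.read_csv("retail_ecommerce_transactions.csv",
                 parse_dates=["Transaction Timestamp"])

# Median imputation for numeric gaps.
for col in ["Competitor Price", "Inventory Level"]:
    df[col] = df[col].fillna(df[col].median())

# Mode imputation for categorical gaps.
for col in ["Customer Region", "Product Category"]:
    df[col] = df[col].fillna(df[col].mode().iloc[0])

# Simple stand-in for the timestamp interpolation described above:
# sort by product and fill time gaps from adjacent records.
df = df.sort_values(["Product ID", "Transaction Timestamp"])
df["Transaction Timestamp"] = df["Transaction Timestamp"].ffill()
```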
Duplicate records were identified using unique
transaction identifiers and other distinguishing
features. These duplicates were removed to
prevent data redundancy and potential bias in the
model. Invalid entries, such as transactions with
zero or negative purchase quantities and revenues,
were systematically filtered out. This step was
essential to maintain the integrity of the dataset
and eliminate anomalies that could distort
analytical outcomes.
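Continuing the same sketch, the de-duplication and validity filtering could look like this, under the same column-name assumptions:

```python
# Drop exact duplicates, then keep a single record per transaction identifier.
df = df.drop_duplicates()
df = df.drop_duplicates(subset=["Transaction ID"], keep="first")

# Filter out invalid entries with non-positive quantities or revenues.
df = df[(df["Purchase Quantity"] > 0) & (df["Total Revenue"] > 0)].copy()
```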
Outliers were a prominent issue in numerical
features like pricing and inventory levels. We
employed the interquartile range (IQR) method to
identify and address these anomalies.
Observations falling beyond 1.5 times the IQR were
flagged as outliers. Depending on the context,
outliers were either capped and floored to the 1st
and 99th percentiles, respectively, or retained if
deemed contextually valid (e.g., high pricing for
premium products). This process ensured the
data's representativeness without compromising
valuable information.
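One possible implementation of the IQR flagging with percentile capping, applied to the main price and inventory columns (names as in the dataset table):

```python
def cap_outliers(series, k=1.5):
    """Cap values flagged as outliers (beyond k*IQR) at the 1st/99th percentiles."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    p01, p99 = series.quantile(0.01), series.quantile(0.99)
    # Keep in-range values; clip the flagged outliers to the percentile bounds.
    return series.where(series.between(lower, upper), series.clip(p01, p99))

for col in ["Historical Price", "Current Price", "Competitor Price", "Inventory Level"]:
    df[col] = cap_outliers(df[col])
```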
The dataset included several categorical variables
that required transformation into numerical
formats for machine learning algorithms. For non-
ordinal variables, such as product categories and
customer regions, we applied one-hot encoding to
create binary columns for each unique category.
Ordinal features, such as income groups and age
brackets, were label-encoded to preserve their
inherent order. This ensured that all features were
compatible with the models and accurately
represented their underlying characteristics.
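A brief sketch of the encoding step; the income-group column and its bins are assumptions for illustration, since the raw data bundles demographics into a single field:

```python
import pandas as pd

# One-hot encode non-ordinal categoricals into binary indicator columns.
df = pd.get_dummies(df, columns=["Product Category", "Customer Region"])

# Ordinal encoding for ordered categories; "Income Group" and its bins are
# hypothetical, assuming demographics were first split into separate columns.
income_order = ["<$30K", "$30K-$70K", "$70K+"]
df["Income Group"] = pd.Categorical(
    df["Income Group"], categories=income_order, ordered=True
).codes
```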
Feature scaling was applied to ensure uniformity
across numerical features, which varied
significantly in range and scale. Z-score
normalization was used for features like revenue
and inventory levels to standardize them around a
mean of zero with a standard deviation of one. Min-
Max scaling was implemented for features such as
promotional impact and competitor price
advantage to transform their values into a range
between 0 and 1. These scaling techniques
prevented any single feature from disproportionately influencing the model during training.
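One way to apply the two scaling schemes with scikit-learn; the Min-Max columns shown are illustrative engineered features, and in a production pipeline the scalers would be fit on the training split only to avoid leakage:

```python
from sklearn.preprocessing import StandardScaler, MinMaxScaler

# Z-score normalization (mean 0, standard deviation 1) for revenue and inventory.
z_cols = ["Total Revenue", "Inventory Level"]
df[z_cols] = StandardScaler().fit_transform(df[z_cols])

# Min-Max scaling to [0, 1]; these column names are hypothetical engineered features.
mm_cols = ["promotional_impact", "competitor_price_advantage"]
df[mm_cols] = MinMaxScaler().fit_transform(df[mm_cols])
```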
Temporal data embedded in transaction
timestamps provided valuable insights into
shopping behavior and pricing trends. We
extracted features such as hour of the day, day of
the week, and month of the year to capture
temporal patterns. Additionally, binary event flags
were added to identify transactions occurring
during major sales events, such as Black Friday or
Cyber Monday. These features enhanced the
model’s ability to identify seasonality and demand
surges.
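A sketch of the temporal feature extraction; the event dates are illustrative examples for a single year:

```python
import pandas as pd

ts = df["Transaction Timestamp"]
df["hour_of_day"] = ts.dt.hour
df["day_of_week"] = ts.dt.dayofweek          # 0 = Monday
df["month"] = ts.dt.month

# Binary flags for major sales events; dates shown are illustrative (2023 only).
event_dates = {pd.Timestamp("2023-11-24").date(),   # Black Friday 2023
               pd.Timestamp("2023-11-27").date()}   # Cyber Monday 2023
df["is_major_sale_event"] = ts.dt.date.isin(event_dates).astype(int)
```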
Class imbalances were observed in outcomes
related to promotional effectiveness and revenue
distribution. To address this, we employed the
Synthetic Minority Over-sampling Technique
(SMOTE), which generated synthetic samples of
underrepresented classes. This approach ensured
that the model was trained on a balanced dataset,
reducing bias and improving its ability to
generalize across different scenarios.
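A self-contained illustration of SMOTE using the imbalanced-learn package, applied here to a synthetic imbalanced outcome (standing in for a promotion-effectiveness label); in the study's pipeline it would be applied to the training split only:

```python
from collections import Counter

from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification

# Synthetic imbalanced binary outcome for illustration.
X, y = make_classification(n_samples=1000, n_features=10,
                           weights=[0.9, 0.1], random_state=42)

# Generate synthetic minority samples until the classes are balanced.
X_res, y_res = SMOTE(random_state=42).fit_resample(X, y)
print(Counter(y), Counter(y_res))   # minority class upsampled to parity
```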
FEATURE ENGINEERING
To enrich the dataset and capture complex
interactions among variables, we engineered
several new features. Competitor price differences
were calculated as the variance between current
prices and competitors' prices. Revenue per unit
was derived by dividing total revenue by the
purchase quantity. Discount percentages were
computed as the ratio of the discount amount to
the original price. Demand elasticity was estimated
by analyzing the relationship between changes in
price and corresponding variations in purchase
quantity. These features provided the model with
deeper insights into customer behavior and pricing
dynamics. To simulate real-time pricing scenarios,
we augmented the dataset with synthetic
transactions reflecting temporal trends and
customer segmentation. By generating additional
data points, we ensured that the model could adapt
to dynamic pricing conditions and anticipate
market fluctuations effectively.
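The engineered features described above can be sketched as follows; column names follow the dataset table, the preprocessed DataFrame is assumed from the earlier steps, and the elasticity estimate is a simple consecutive-transaction approximation:

```python
import numpy as np

# Engineered features described in the text.
df["competitor_price_diff"] = df["Current Price"] - df["Competitor Price"]
df["revenue_per_unit"] = df["Total Revenue"] / df["Purchase Quantity"]
df["discount_pct"] = df["Discount Applied"] / df["Historical Price"]

# Rough per-product demand-elasticity estimate: percentage change in quantity
# divided by percentage change in price between consecutive transactions.
df = df.sort_values(["Product ID", "Transaction Timestamp"])
pct_qty = df.groupby("Product ID")["Purchase Quantity"].pct_change()
pct_price = df.groupby("Product ID")["Current Price"].pct_change()
df["demand_elasticity_est"] = (pct_qty / pct_price).replace([np.inf, -np.inf], np.nan)
```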
The final dataset was split into training, validation,
and testing sets using an 80:10:10 ratio. Stratified
sampling was applied to maintain the distribution
of critical features, such as product categories and
revenue. Validation of the preprocessed data
included statistical analyses, such as mean and
variance checks, visual inspections using plots and
charts, and correlation analysis to identify
multicollinearity. This step ensured the dataset
was free from inconsistencies and ready for
machine learning model development.
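A minimal sketch of the stratified 80:10:10 split, assuming the raw product-category labels are still available for stratification (e.g., kept aside before one-hot encoding):

```python
from sklearn.model_selection import train_test_split

# Stratify on the raw category labels so their distribution is preserved.
strata = df["Product Category"]

# 80% train, 20% temporary pool; then split the pool 50/50 into validation/test.
train_df, temp_df = train_test_split(
    df, test_size=0.20, stratify=strata, random_state=42)
val_df, test_df = train_test_split(
    temp_df, test_size=0.50, stratify=strata.loc[temp_df.index], random_state=42)
```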
By following this robust preprocessing pipeline,
we transformed the raw dataset into a high-
quality, structured format. This meticulous
preparation was vital to accurately capturing the
complexities of real-time dynamic pricing
strategies and ensuring reliable model performance.
MODEL DEVELOPMENT
For the development of dynamic pricing models,
we employed several supervised machine learning
algorithms, each tailored to capture the
complexities of the pricing environment. These
included Linear Regression, Random Forest, and
Gradient Boosting Machines (GBM). Each model
was selected based on its ability to handle various
types of data and capture non-linear relationships,
which are critical in real-time dynamic pricing
scenarios.
The dataset was split into training and testing
subsets using a 70:30 ratio. This ensured that the
models were trained on a substantial portion of the
data while reserving an independent set for
evaluation. Stratified sampling was applied to
maintain the balance of key features across the
splits.
Model Selection
1. Linear Regression was chosen as a baseline model due to its simplicity and interpretability. It allowed us to establish a reference point for more complex algorithms.
2. Random Forest was selected for its ability to handle high-dimensional data and capture non-linear interactions between features. Its ensemble nature made it robust to overfitting.
3. Gradient Boosting Machines (GBM) were implemented for their capacity to optimize predictive performance through sequential learning, leveraging weak learners to form a strong predictive model. (An example instantiation of these models follows this list.)
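One plausible scikit-learn instantiation of the three models is sketched below; the hyperparameter values are placeholders that the grid search described next would refine:

```python
from sklearn.linear_model import Ridge
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor

# Placeholder settings; refined later by grid search with cross-validation.
models = {
    "Linear Regression": Ridge(alpha=1.0),   # L2-regularized linear baseline
    "Random Forest": RandomForestRegressor(n_estimators=300, random_state=42),
    "Gradient Boosting": GradientBoostingRegressor(learning_rate=0.1, random_state=42),
}
```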
Hyperparameter Tuning
Hyperparameter optimization was critical for
achieving the best performance from each model.
A grid search strategy was employed in
conjunction with k-fold cross-validation to
systematically explore combinations of hyperparameters. Key hyperparameters tuned included the following (a grid-search sketch is given after the list):
• For Linear Regression: Regularization parameters (e.g., L1/L2 penalties).
• For Random Forest: Number of trees, maximum depth, and minimum samples per leaf.
• For GBM: Learning rate, number of boosting iterations, and maximum depth of individual learners.
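A hedged sketch of the grid search for the GBM (the other models would be tuned analogously); X_train and y_train denote the training feature matrix and the price target:

```python
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

# Illustrative hyperparameter grid for the GBM.
param_grid = {
    "learning_rate": [0.01, 0.05, 0.1],
    "n_estimators": [100, 300, 500],
    "max_depth": [3, 5, 7],
}

# 5-fold cross-validated grid search, scored on mean absolute error.
search = GridSearchCV(
    GradientBoostingRegressor(random_state=42),
    param_grid,
    scoring="neg_mean_absolute_error",
    cv=5,
    n_jobs=-1,
)
search.fit(X_train, y_train)
print(search.best_params_, -search.best_score_)
```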
Cross-Validation
We used 5-fold cross-validation to ensure that the
model's performance was robust across different
subsets of the data. This iterative training and
validation approach minimized the risk of
overfitting and provided a more reliable estimate
of model generalization. To ensure the interpretability of the models, we conducted a
feature importance analysis. For Random Forest
and GBM, feature importance scores were derived
based on the contribution of each feature to the
predictive performance. This analysis revealed
that features like competitor price differences,
promotional impact, and revenue per unit were
among the most significant predictors. The models
were implemented in Python using libraries such
as scikit-learn for algorithm development and
pandas for data manipulation. TensorFlow and
XGBoost were explored for further refinement and
scalability of the boosting algorithms.
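A short sketch of the impurity-based feature-importance ranking for the Random Forest, assuming X_train is a DataFrame of model features and y_train the price target:

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Fit on the training data and rank features by impurity-based importance.
rf = RandomForestRegressor(n_estimators=300, random_state=42)
rf.fit(X_train, y_train)

importances = pd.Series(rf.feature_importances_, index=X_train.columns)
print(importances.sort_values(ascending=False).head(10))
```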
Model Evaluation
Model evaluation was performed to assess the
predictive accuracy, reliability, and real-time
applicability of the pricing models. The evaluation
process was divided into two main stages:
standard performance metrics and dynamic
pricing simulations.
Standard Performance Metrics
To compare the models effectively, we utilized a
range of evaluation metrics:
• Mean Absolute Error (MAE): Measured the average magnitude of errors between predicted and actual prices, offering a clear sense of prediction accuracy.
• Root Mean Square Error (RMSE): Penalized larger errors more heavily, providing insight into model robustness against significant deviations.
• R-Squared (R²): Assessed the proportion of variance in the target variable explained by the model, serving as a measure of goodness-of-fit. (A short computation sketch follows this list.)
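These three metrics can be computed with scikit-learn as in the following sketch:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

def evaluate(y_true, y_pred, name="model"):
    """Report MAE, RMSE, and R² for a set of price predictions."""
    mae = mean_absolute_error(y_true, y_pred)
    rmse = np.sqrt(mean_squared_error(y_true, y_pred))
    r2 = r2_score(y_true, y_pred)
    print(f"{name}: MAE={mae:.2f}  RMSE={rmse:.2f}  R²={r2:.3f}")

# Example usage on a held-out test set (names assumed):
# evaluate(y_test, gbm.predict(X_test), name="Gradient Boosting")
```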
RESULTS
The results of our study demonstrate the
effectiveness of machine learning models in
predicting optimal prices for real-time dynamic
pricing strategies. By leveraging three different
algorithms, Linear Regression, Random Forest, and Gradient Boosting Machines (GBM), we were
able to analyze their performance across several
metrics and evaluate their suitability for the
dynamic nature of retail and e-commerce pricing.
Performance Metrics Overview
The evaluation of the models focused on three key
metrics: Mean Absolute Error (MAE), Root Mean
Square Error (RMSE), and R-Squared (R²). The
results on the test dataset are presented in Table 1 below:

Table 1: Model Evaluation

| Model | MAE | RMSE | R² |
|---|---|---|---|
| Linear Regression | 2.78 | 3.45 | 0.81 |
| Random Forest | 1.89 | 2.12 | 0.92 |
| Gradient Boosting | 1.73 | 2.01 | 0.94 |
Linear Regression: This model provided a baseline
for performance evaluation. It achieved an MAE of
2.78, RMSE of 3.45, and an R² of 0.81, indicating
moderate accuracy. However, its inability to
capture non-linear relationships limited its
effectiveness, especially in scenarios involving
complex pricing dependencies.
Random Forest: With an MAE of 1.89 and an RMSE
of 2.12, Random Forest demonstrated significant
improvement over Linear Regression. Its ensemble
learning approach allowed it to capture complex
interactions between features, resulting in a robust
and reliable performance with an R² value of 0.92.
Gradient Boosting Machines (GBM): GBM
outperformed the other models across all metrics.
It achieved the lowest MAE of 1.73 and RMSE of
2.01, alongside the highest R² value of 0.94. The
sequential learning nature of GBM allowed it to
minimize errors iteratively, making it particularly
well-suited for dynamic pricing scenarios.
COMPARATIVE ANALYSIS
To better understand the models' relative
performance, a comparative study was conducted:
• Prediction Accuracy: GBM consistently produced predictions closest to actual prices, evidenced by its lower error rates. Random Forest followed closely, while Linear Regression lagged behind, particularly in non-linear scenarios.
• Robustness to Variability: Random Forest and GBM exhibited strong adaptability to varying data conditions, such as fluctuating competitor prices and seasonal demand. Linear Regression struggled to account for these complexities.
• Computational Efficiency: While GBM provided the best performance, it required more computational resources and longer training times compared to Random Forest and Linear Regression. This trade-off may influence model selection depending on the deployment environment.
REAL-TIME SIMULATION RESULTS
To validate the models under realistic conditions,
we conducted real-time simulations using test
scenarios that mimicked dynamic market
environments. These scenarios included changes
in competitor pricing, promotional campaigns, and
demand surges. The results were evaluated based
on the following criteria (a minimal scenario sketch is given after the list):
• Revenue Optimization: GBM consistently optimized revenue more effectively, adjusting prices dynamically to maximize profitability without sacrificing demand.
• Customer Retention: Random Forest and GBM both demonstrated an ability to balance price adjustments with customer satisfaction, retaining high engagement rates. Linear Regression's performance in this area was less effective due to its simplistic pricing predictions.
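As a minimal illustration of one such scenario, a trained model can be re-queried after perturbing competitor prices in the held-out features; the function and column names below are assumptions, not the study's exact simulation harness:

```python
# Hypothetical scenario generator: lower competitor prices by a fixed percentage
# and let the trained model re-predict prices under the new market conditions.
# `model` is a fitted regressor and `X_test` a DataFrame of test features.
def simulate_competitor_drop(model, X_test, pct_drop=0.10,
                             competitor_col="Competitor Price"):
    scenario = X_test.copy()
    scenario[competitor_col] *= (1.0 - pct_drop)
    return model.predict(scenario)   # re-optimized prices for the scenario
```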
Insights and Key Findings
1. GBM as the Best Performer: The results clearly indicate that GBM is the most suitable model for real-time dynamic pricing. Its ability to handle non-linear relationships, feature interactions, and sequential learning allowed it to deliver superior results.
2. Random Forest as a Close Alternative: While not as precise as GBM, Random Forest offers a robust and computationally efficient alternative, making it a viable choice in environments with limited computational resources.
3. Limitations of Linear Regression: Linear Regression is best used as a baseline model or in simpler pricing scenarios. Its performance was notably weaker in dynamic and complex environments.
Visualization of Results
To illustrate the models' performance, we plotted
predicted prices against actual prices for each
model. GBM displayed the tightest fit, closely
aligning with actual values, while Linear
Regression showed greater variance. Random
Forest’s predictions also aligned closely, but with
slightly more variability compared to GBM.
Chart 1: Mean Absolute Error (MAE) and Root Mean Square Error (RMSE)

This bar chart (Chart 1) presents a comparative view of two important error metrics, Mean Absolute Error (MAE) and Root Mean Square Error (RMSE), across three machine learning models: Linear Regression, Random Forest, and Gradient Boosting Machines (GBM).
• The Gradient Boosting Machine (GBM) consistently outperformed other models across both MAE and RMSE, highlighting its ability to accurately capture complex patterns in dynamic pricing scenarios.
• The Random Forest model also showed good results and could be considered a strong candidate if computational efficiency is a priority.
• Linear Regression, while computationally efficient, demonstrated higher errors in both MAE and RMSE, suggesting its limitations in complex retail and e-commerce pricing dynamics.
By selecting models with low MAE and RMSE
values, businesses can optimize pricing decisions,
maximize profit margins, and remain competitive
in the fast-paced retail and e-commerce landscape.
By understanding and analyzing these R² values,
retailers and e-commerce managers can make
informed decisions about model selection,
infrastructure investments, and scalability
considerations for their dynamic pricing strategies.
Chart 2: R² (R-squared) value
Chart 2 presents the R² (R-squared) value, also known as the coefficient of determination, a critical metric in evaluating machine learning models. It measures the proportion of the variance in the target variable that can be predicted by the model.
An R² value of 1 indicates a perfect fit, meaning that
the model explains all the variability in the target
data. Conversely, an R² value close to 0 suggests
that the model fails to capture much of the data's
variability.
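For reference, the coefficient of determination reported in Chart 2 follows the standard definition, where $y_i$ are the actual prices, $\hat{y}_i$ the predicted prices, and $\bar{y}$ the mean actual price:

R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}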
The results underscore the importance of selecting
models that can adapt to the complexities of
dynamic pricing in real-time. While GBM
performed best in this study, future work could
explore deep learning models like LSTMs or
Transformer-based architectures to capture
temporal and sequential patterns in pricing data.
DISCUSSION AND CONCLUSION
In this study, we have explored the application of
supervised machine learning models, namely Linear Regression, Random Forest, and Gradient Boosting Machines (GBM), for real-time dynamic pricing in
retail and e-commerce. Our goal was to determine
the effectiveness of these models in forecasting
optimal prices by assessing their performance
using key metrics such as Mean Absolute Error
(MAE), Root Mean Square Error (RMSE), and R-
squared (R²). The comparative analysis of these
models allowed us to draw meaningful insights
into their strengths and limitations, providing
practical recommendations for businesses aiming
to optimize their pricing strategies.
The results indicate that Gradient Boosting
Machines (GBM) consistently outperformed the
other models across all performance metrics. GBM
achieved the lowest MAE and RMSE,
demonstrating superior predictive accuracy and
stability. This suggests that GBM is highly effective
in capturing the complex interactions among
various factors that influence dynamic pricing,
such as demand fluctuations, competitor prices,
and product availability. Businesses can rely on
GBM for more robust and accurate pricing
decisions, which are critical in maintaining a
competitive edge in fast-paced retail and e-
commerce environments.
While the Random Forest model also delivered
good results, it was slightly less accurate than GBM
but still provided satisfactory predictions with a
balanced trade-off between accuracy and
computational efficiency. In many real-world
applications, Random Forest remains a viable
choice due to its scalability and reduced
susceptibility to overfitting. On the other hand, the
Linear Regression model, despite its simplicity and
interpretability, showed higher error rates in both
MAE and RMSE, which indicates its limitations in
addressing the non-linear relationships present in
dynamic pricing data.
Another key point is the importance of
hyperparameter tuning and cross-validation in
enhancing the performance of machine learning
models. Our use of grid search and cross-validation
techniques ensured that each model was properly
optimized and tested, which helped us achieve
reliable and accurate predictions. This reinforces
the necessity of rigorous model training and
evaluation processes to ensure optimal
performance in dynamic pricing applications. The
real-time testing in a simulated environment
further highlighted the practical feasibility of our
methodology. The integration with e-commerce
platforms demonstrated that our models could
make quick adjustments to pricing based on real-
world conditions, ensuring responsiveness to
demand changes and competitor actions. This
adaptability is crucial in a competitive market
where businesses must react swiftly to maintain
profitability and customer satisfaction.
However, it is important to acknowledge the
limitations of our study. The dataset obtained from
Kaggle provided a solid foundation for our
analysis, but it may not fully capture all the unique
challenges and complexities present in specific
retail and e-commerce markets. Factors such as
brand loyalty, seasonality, and regional
preferences may influence pricing decisions but
were not fully represented in our dataset. Future
research should focus on incorporating more
diverse datasets and real-world data from live
retail and e-commerce environments to provide a
more comprehensive evaluation of machine
learning models for dynamic pricing. Additionally,
computational efficiency and scalability remain
critical considerations for real-world deployment.
While GBM delivered the best accuracy, it is
computationally intensive and may require
significant processing power in large-scale
applications. Organizations must weigh the trade-
offs between predictive accuracy and
computational cost when selecting a model for
implementation.
In conclusion, this study successfully
demonstrated the effectiveness of machine
learning models for real-time dynamic pricing
strategies in the retail and e-commerce sectors.
Our comparative analysis of Linear Regression,
Random Forest, and Gradient Boosting Machines
(GBM) highlighted that GBM consistently delivered
superior performance in terms of prediction
accuracy and stability. The use of metrics such as
Mean Absolute Error (MAE), Root Mean Square
Error (RMSE), and R-squared (R²) provided a
comprehensive evaluation of each model’s
predictive performance. We have shown that
machine learning models can effectively capture
complex interactions in dynamic pricing data,
allowing businesses to optimize pricing strategies
in real-time. The results emphasize the necessity of
proper hyperparameter tuning, cross-validation,
and integration with e-commerce infrastructure to
ensure real-world applicability. Businesses can
leverage these insights to make informed decisions
about pricing strategies, ensuring higher
profitability, better customer engagement, and
sustained competitiveness in the market.
Although our research relied on a Kaggle dataset
and simulated environments, it lays the
groundwork for future investigations into more
intricate and real-world scenarios. Expanding the
scope of datasets and including factors such as
seasonality, regional preferences, and consumer
behavior would provide more robust insights and
actionable strategies for dynamic pricing in
specific markets. As machine learning continues to
evolve, businesses in retail and e-commerce must
stay informed about technological advancements
and continuously refine their models and pricing
algorithms. Embracing ensemble models like
Gradient Boosting Machines and considering
trade-offs in computational efficiency will be
essential in staying ahead of competitors and
responding swiftly to market changes. Ultimately,
adopting machine learning-driven dynamic pricing
models enables businesses to optimize their
operations, improve customer satisfaction, and
maximize profitability. By investing in research,
technology, and strategic implementation, companies can harness the full potential of machine learning to drive smarter, data-driven pricing decisions, ensuring long-term success in the highly competitive retail and e-commerce landscape.
ACKNOWLEDGEMENT: All the authors contributed equally.
REFERENCES
1. Md Habibur Rahman, Ashim Chandra Das, Md Shujan Shak, Md Kafil Uddin, Md Imdadul Alam, Nafis Anjum, Md Nad Vi Al Bony, & Murshida Alam. (2024). Transforming Customer Retention in Fintech Industry through Predictive Analytics and Machine Learning. The American Journal of Engineering and Technology, 6(10), 150-163. https://doi.org/10.37547/tajet/Volume06Issue10-17
2. Chen, Y., Donaldson, J., & McMillan, M. S. (2001). Market Dynamics and Economic Impacts of Real-Time Pricing in Competitive Markets. Economics Review, 58(4), 567-590.
3. Clements, M. P., Harris, M. N., & Szafarz, A. (2004). Econometric Modeling of Dynamic Pricing Systems. Journal of Economic Surveys, 18(5), 715-750.
4. Friedman, J. H. (2001). Greedy Function Approximation: A Gradient Boosting Machine. The Annals of Statistics, 29(5), 1189-1232.
5. Gal-Or, E. (1985). Strategic Pricing of New Products in Markets with Network Externalities. Quarterly Journal of Economics, 100(2), 295-308.
6. Hyndman, R. J., & Athanasopoulos, G. (2018). Forecasting: Principles and Practice. OTexts.
7. Kannan, P. K., & Kopalle, P. K. (2001). Dynamic Pricing on the Internet: Optimization and Consumer Behavior. Marketing Science, 20(1), 42-61.
8. Kumar, A., Gupta, S., & Mehta, R. (2019). Real-Time Dynamic Pricing Strategies in Retail and E-commerce. International Journal of Machine Learning Research, 20(6), 456-473.
9. Lemke, A., Grinblatt, M., & Kannan, S. (2019). Ensemble Models for Competitive Market Pricing. Computational Economics, 30(4), 678-702.
10. McKinsey & Company. (2020). Artificial Intelligence in E-commerce and Retail Pricing Strategies.
11. Waller, D. S., & Leigh, T. W. (2009). Social Psychological Perspectives on Price Magnitude Effects. Advances in Consumer Research, 36(1), 494-500.
12. Yuan, Y., Zhang, T., & Huang, W. (2019). Cloud Integration for Scalable Dynamic Pricing Applications. IEEE Transactions on Cloud Computing, 7(4), 785-798.
13. Zhou, Y., Zhao, Y., & Zhang, X. (2017). Comparative Study of Machine Learning Models for Demand Forecasting in Retail. Journal of Business Analytics, 12(3), 233-250.
14. Md Habibur Rahman, Ashim Chandra Das, Md Shujan Shak, Md Kafil Uddin, Md Imdadul Alam, Nafis Anjum, Md Nad Vi Al Bony, & Murshida Alam. (2024). Transforming Customer Retention in Fintech Industry through Predictive Analytics and Machine Learning. The American Journal of Engineering and Technology, 6(10), 150-163. https://doi.org/10.37547/tajet/Volume06Issue10-17
15. Tauhedur Rahman, Md Kafil Uddin, Biswanath Bhattacharjee, Md Siam Taluckder, Sanjida Nowshin Mou, Pinky Akter, Md Shakhaowat Hossain, Md Rashel Miah, & Md Mohibur Rahman. (2024). Blockchain Applications in Business Operations and Supply Chain Management by Machine Learning. International Journal of Computer Science & Information System, 9(11), 17-30. https://doi.org/10.55640/ijcsis/Volume09Issue11-03
16. Md Jamil Ahmmed, Md Mohibur Rahman, Ashim Chandra Das, Pritom Das, Tamanna Pervin, Sadia Afrin, Sanjida Akter Tisha, Md Mehedi Hassan, & Nabila Rahman. (2024). Comparative Analysis of Machine Learning Algorithms for Banking Fraud Detection: A Study on Performance, Precision, and Real-Time Application. International Journal of Computer Science & Information System, 9(11), 31-44. https://doi.org/10.55640/ijcsis/Volume09Issue11-04
17. Bhandari, A., Cherukuri, A. K., & Kamalov, F. (2023). Machine learning and blockchain integration for security applications. In Big Data Analytics and Intelligent Systems for Cyber Threat Intelligence (pp. 129-173). River Publishers.
18. Diro, A., Chilamkurti, N., Nguyen, V. D., & Heyne, W. (2021). A comprehensive study of anomaly detection schemes in IoT networks using machine learning algorithms. Sensors, 21(24), 8320.
19. Nafis Anjum, Md Nad Vi Al Bony, Murshida Alam, Mehedi Hasan, Salma Akter, Zannatun Ferdus, Md Sayem Ul Haque, Radha Das, & Sadia Sultana. (2024). Comparative Analysis of Sentiment Analysis Models on Banking Investment Impact by Machine Learning Algorithm. International Journal of Computer Science & Information System, 9(11), 5-16. https://doi.org/10.55640/ijcsis/Volume09Issue11-02
20. Shahbazi, Z., & Byun, Y. C. (2021). Integration of blockchain, IoT and machine learning for multistage quality control and enhancing security in smart manufacturing. Sensors, 21(4), 1467.
21. Das, A. C., Mozumder, M. S. A., Hasan, M. A., Bhuiyan, M., Islam, M. R., Hossain, M. N., ... & Alam, M. I. (2024). Machine Learning Approaches for Demand Forecasting: The Impact of Customer Satisfaction on Prediction Accuracy. The American Journal of Engineering and Technology, 6(10), 42-53.
22. Al Mamun, A., Hossain, M. S., Rishad, S. S. I., Rahman, M. M., Tisha, S. A., Shakil, F., ... & Sultana, S. (2024). Machine Learning for Stock Market Security Measurement: A Comparative Analysis of Supervised, Unsupervised, and Deep Learning Models. International Journal of Networks and Security, 4(01), 22-32.
23. Akter, S., Mahmud, F., Rahman, T., Ahmmed, M. J., Uddin, M. K., Alam, M. I., ... & Jui, A. H. (2024). A Comprehensive Study of Machine Learning Approaches for Customer Sentiment Analysis in Banking Sector. The American Journal of Engineering and Technology, 6(10), 100-111.
24. Shahid, R., Mozumder, M. A. S., Sweet, M. M. R., Hasan, M., Alam, M., Rahman, M. A., ... & Islam, M. R. (2024). Predicting Customer Loyalty in the Airline Industry: A Machine Learning Approach Integrating Sentiment Analysis and User Experience. International Journal on Computational Engineering, 1(2), 50-54.
25. Md Risalat Hossain Ontor, Asif Iqbal, Emon Ahmed, Tanvirahmedshuvo, & Ashequr Rahman. (2024). Leveraging Digital Transformation and Social Media Analytics for Optimizing US Fashion Brands' Performance: A Machine Learning Approach. International Journal of Computer Science & Information System, 9(11), 45-56. https://doi.org/10.55640/ijcsis/Volume09Issue11-05
26. Rahman, A., Iqbal, A., Ahmed, E., & Ontor, M. R. H. (2024). Privacy-Preserving Machine Learning: Techniques, Challenges, and Future Directions in Safeguarding Personal Data Management. International Journal of Business and Management Sciences, 4(12), 18-32.
27. Comparative Performance Analysis of Machine Learning Algorithms for Business Intelligence: A Study on Classification and Regression Models. (2024). International Journal of Business and Management Sciences, 4(11), 06-18. https://doi.org/10.55640/ijbms-04-11-02
28. Naznin, R., Sarkar, M. A. I., Asaduzzaman, M., Akter, S., Mou, S. N., Miah, M. R., ... & Sajal, A. (2024). Enhancing Small Business Management through Machine Learning: A Comparative Study of Predictive Models for Customer Retention, Financial Forecasting, and Inventory Optimization. International Interdisciplinary Business Economics Advancement Journal, 5(11), 21-32.
29. Md Jamil Ahmmed, Md Mohibur Rahman, Ashim Chandra Das, Pritom Das, Tamanna Pervin, Sadia Afrin, Sanjida Akter Tisha, Md Mehedi Hassan, & Nabila Rahman. (2024). Comparative Analysis of Machine Learning Algorithms for Banking Fraud Detection: A Study on Performance, Precision, and Real-Time Application. International Journal of Computer Science & Information System, 9(11), 31-44. https://doi.org/10.55640/ijcsis/Volume09Issue11-04
30. Arif, M., Ahmed, M. P., Al Mamun, A., Uddin, M. K., Mahmud, F., Rahman, T., ... & Helal, M. (2024). Dynamic Pricing in Financial Technology: Evaluating Machine Learning Solutions for Market Adaptability. International Interdisciplinary Business Economics Advancement Journal, 5(10), 13-27.
31. Iqbal, A., Ahmed, E., Rahman, A., & Ontor, M. R. H. (2024). Enhancing Fraud Detection and Anomaly Detection in Retail Banking Using Generative AI and Machine Learning Models. International Journal of Networks and Security, 4(01), 33-43.
32. Rahman, M. M., Akhi, S. S., Hossain, S., Ayub, M. I., Siddique, M. T., Nath, A., ... & Hassan, M. M. (2024). Evaluating Machine Learning Models for Optimal Customer Segmentation in Banking: A Comparative Study. The American Journal of Engineering and Technology, 6(12), 68-83.
33. Bhattacharjee, B., Mou, S. N., Hossain, M. S., Rahman, M. K., Hassan, M. M., Rahman, N., ... & Haque, M. S. U. (2024). Machine Learning for Cost Estimation and Forecasting in Banking: A Comparative Analysis of Algorithms. International Journal of Business and Management Sciences, 4(12), 6-17.