THE AMERICAN JOURNAL OF ENGINEERING AND TECHNOLOGY (ISSN 2689-0984)
VOLUME 06, ISSUE 12
https://www.theamericanjournals.com/index.php/tajet
PUBLISHED DATE: 25-12-2024
https://doi.org/10.37547/tajet/Volume06Issue12-15
PAGE NO.: 163-177
OPTIMIZING REAL-TIME DYNAMIC PRICING
STRATEGIES IN RETAIL AND E-COMMERCE
USING MACHINE LEARNING MODELS
Pritom Das
College of Computer Science, Pacific States University, Los Angeles, CA, USA
Tamanna Pervin
Department of Business Administration, International American University, Los Angeles, California, USA
Biswanath Bhattacharjee
Department of Management Science and Quantitative Methods, Gannon University, USA
Md Razaul Karim
Department of Information Technology & Computer Science, University of the Potomac, USA
Nasrin Sultana
Department of Strategic Communication, Gannon University, USA
Md. Sayham Khan
Department of Information Technology & Computer Science, University of the Potomac, USA
Md Afjal Hosien
School of Information Technology, Washington University of Science & Technology, USA
FNU Kamruzzaman
Department of Information Technology Project Management & Business Analytics, St. Francis College, USA
RESEARCH ARTICLE
Open Access
INTRODUCTION
Dynamic pricing has become a crucial strategy in
retail and e-commerce, where businesses aim to
optimize prices in real-time to maximize profits,
improve customer satisfaction, and maintain
competitiveness. The ability to adjust prices
dynamically depends on factors such as market
demand, competitor pricing, product availability,
and customer preferences (McKinsey & Company,
2020). Traditional static pricing models often fail
to capture these dynamic interactions, leading to
suboptimal business outcomes (Zhang et al., 2018).
In recent years, machine learning (ML) techniques
have emerged as powerful tools for dynamic
pricing strategies, offering the ability to analyze
large datasets, detect patterns, and make accurate
predictions. Machine learning models such as
Linear Regression, Random Forest, and Gradient
Boosting Machines (GBM) have been increasingly
applied in e-commerce and retail to forecast
optimal pricing strategies (Choi et al., 2019).
However, selecting the most suitable model that
balances computational efficiency and prediction
accuracy remains a challenge.
This study explores the application of supervised machine learning models (Linear Regression, Random Forest, and Gradient Boosting Machines) to real-time dynamic pricing strategies in the retail and e-commerce sectors. The primary objective is to evaluate the models based on metrics such as R-squared (R²), Mean Absolute Error (MAE), and Root Mean Square Error (RMSE) to determine their effectiveness in real-time dynamic pricing optimization. The study also simulates a controlled environment to test these models' real-world applicability, demonstrating their integration with e-commerce platforms.
The Concept of Dynamic Pricing
Dynamic pricing is the practice of adjusting prices
in real-time based on market conditions, consumer
behavior, and competitive factors (Kannan &
Kopalle, 2001). Dynamic pricing strategies have
been extensively studied in the context of e-
commerce and retail, with research highlighting its
importance in maximizing revenue and optimizing
customer satisfaction (Gal-Or, 1985). According to
Chen et al. (2001), dynamic pricing algorithms
incorporate demand elasticity, competitor prices,
and inventory levels to adjust prices effectively.
Recent studies have also emphasized the
significance of real-time adjustments based on
machine learning predictions to address demand
fluctuations and competitor actions (Kumar et al.,
2019).
Machine Learning in Dynamic Pricing
Strategies
Machine learning has transformed dynamic
pricing strategies by enabling businesses to
process and analyze large volumes of data
efficiently. Supervised learning models, particularly regression-based and ensemble
techniques, are often employed to forecast optimal
prices (Waller & Leigh, 2009). Linear Regression
remains one of the foundational techniques due to
its simplicity and interpretability, but it often
struggles to capture non-linear patterns in
complex data (Clements et al., 2004).
Ensemble methods like Random Forest and
Gradient Boosting Machines have proven more
robust in handling non-linear relationships.
Random Forest, a popular ensemble method,
reduces overfitting by aggregating multiple
decision trees (Lemke et al., 2019). Meanwhile,
Gradient Boosting Machines (GBM) offer high
predictive accuracy by iteratively fitting weak
learners (Friedman, 2001).
Research by Zhou et al. (2017) and Choi et al.
(2019) demonstrated that ensemble models
outperform simpler linear approaches in dynamic
pricing applications. Their studies showed that
models like GBM and Random Forest could capture
complex interactions among product features,
demand patterns, and competitor pricing more
effectively.
Challenges in Model Selection for Dynamic Pricing
Despite these advantages, selecting the right
machine learning model for dynamic pricing
remains challenging. Factors such as scalability,
computational efficiency, and interpretability play
a significant role in decision-making (McKinsey &
Company, 2020). Real-time deployment of these
models requires integration with e-commerce
platforms and cloud infrastructure, which adds
complexity to system architecture and data
processing (Yuan et al., 2019).
Furthermore, the performance of machine
learning models can be evaluated through various
metrics. The R-squared (R²) value measures the
proportion of variance explained by the model,
while Mean Absolute Error (MAE) and Root Mean
Square Error (RMSE) are standard error metrics
(Hyndman & Athanasopoulos, 2018). Accurate
assessment of these metrics is essential to
determine the practical viability of machine
learning models in real-world dynamic pricing
environments.
METHODOLOGY
To study the application of machine learning for
real-time dynamic pricing strategies in retail and e-
commerce, we adopted a comprehensive and
systematic approach encompassing dataset
acquisition, preprocessing, exploratory data
analysis, feature engineering, model selection, and
evaluation. Each stage was designed to ensure the
robustness and validity of our results in addressing
the complexities of dynamic pricing.
We utilized a publicly available dataset from
Kaggle, titled "Retail and E-Commerce Transactions Dataset," which contains detailed
transactional and product-related data. The
dataset encompasses historical transactions from
multiple e-commerce platforms and retail chains
worldwide, providing a rich resource for analyzing
pricing strategies.
This dataset includes 1.5 million rows and 20
features, covering three years of transactional data
(2020–2023). Key attributes include product
details, pricing history, customer information,
competitor pricing, inventory levels, and temporal
indicators such as seasonal events. A detailed
summary of the dataset is provided below:
| Feature Name | Description | Data Type | Example Values |
|---|---|---|---|
| Transaction ID | Unique identifier for each transaction | Categorical | T987654 |
| Product ID | Unique identifier for each product | Categorical | P876543 |
| Product Name | The name or description of the product | Categorical | Wireless Headphones |
| Product Category | The category of the product | Categorical | Electronics, Apparel |
| Historical Price | Previous product prices | Numeric | 59.99, 89.99 |
| Current Price | Product price at the time of the transaction | Numeric | 54.99, 84.99 |
| Competitor Price | Price of a similar product on competing platforms | Numeric | 55.49, 85.99 |
| Inventory Level | Stock availability of the product | Numeric | 150, 500 |
| Promotion Status | Indicates if a product is on promotion | Boolean | 0, 1 |
| Customer Demographics | Age, gender, income group of the customer | Categorical | Female, 35-44, $70K+ |
| Customer Region | Geographic region of the customer | Categorical | North America, Asia |
| Transaction Timestamp | Timestamp of the transaction | Timestamp | 2023-12-15 14:30:00 |
| Purchase Quantity | Number of units purchased | Numeric | 1, 3 |
| Total Revenue | Revenue generated from the transaction | Numeric | 54.99, 269.97 |
| Discount Applied | Amount or percentage of discount provided | Numeric | 5.00, 10% |
| Competitor Popularity | Average sales ranking of competing products | Numeric | 1, 2, 3 |
| Seasonal Indicator | Flags seasonal peaks like holidays or special events | Boolean | 0, 1 |
| Price Elasticity | Product demand sensitivity to price changes | Numeric | 0.7, 1.2 |
| Customer Loyalty | Flags if the customer is part of a loyalty program | Boolean | 0, 1 |
| Market Segment | Target segment of the product | Categorical | Premium, Economy |
DATA PREPROCESSING
The data preprocessing phase was a critical
component of our study, as it directly impacted the
accuracy and reliability of the machine learning
models. The dataset, sourced from Kaggle,
comprised raw transactional records, which
required extensive cleaning and transformation to
ensure its suitability for analysis. This section
details the comprehensive steps taken to prepare
the data.
The dataset contained missing values in several
key features, including competitor pricing,
inventory levels, and customer demographics.
These gaps were addressed using context-
appropriate imputation techniques. For numerical
features, such as competitor pricing and inventory
levels, we used median imputation to preserve the
central tendency of the data. In the case of
categorical variables, like customer regions and
product categories, missing values were filled
using mode imputation. For time-related gaps,
particularly in timestamps, interpolation methods
were applied by referencing adjacent records to
ensure continuity and consistency in transaction
timelines.
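The following is a minimal sketch of these imputation steps in pandas, assuming the data have been loaded into a DataFrame with the column names from the summary table above (the file name is hypothetical); the timestamp step uses a simple forward fill as a stand-in for the interpolation described.

```python
import pandas as pd

# Hypothetical file name; column names follow the dataset summary table.
df = pd.read_csv("retail_ecommerce_transactions.csv",
                 parse_dates=["Transaction Timestamp"])

# Median imputation for numeric gaps.
for col in ["Competitor Price", "Inventory Level"]:
    df[col] = df[col].fillna(df[col].median())

# Mode imputation for categorical gaps.
for col in ["Customer Region", "Product Category"]:
    df[col] = df[col].fillna(df[col].mode().iloc[0])

# Simple stand-in for the timestamp interpolation described above:
# sort by product and fill time gaps from adjacent records.
df = df.sort_values(["Product ID", "Transaction Timestamp"])
df["Transaction Timestamp"] = df["Transaction Timestamp"].ffill()
```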
Duplicate records were identified using unique
transaction identifiers and other distinguishing
features. These duplicates were removed to
prevent data redundancy and potential bias in the
model. Invalid entries, such as transactions with
zero or negative purchase quantities and revenues,
were systematically filtered out. This step was
essential to maintain the integrity of the dataset
and eliminate anomalies that could distort
analytical outcomes.
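Continuing the same sketch, the de-duplication and validity filtering could look like this, under the same column-name assumptions:

```python
# Drop exact duplicates, then keep a single record per transaction identifier.
df = df.drop_duplicates()
df = df.drop_duplicates(subset=["Transaction ID"], keep="first")

# Filter out invalid entries with non-positive quantities or revenues.
df = df[(df["Purchase Quantity"] > 0) & (df["Total Revenue"] > 0)].copy()
```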
Outliers were a prominent issue in numerical
features like pricing and inventory levels. We
employed the interquartile range (IQR) method to
identify and address these anomalies.
Observations falling beyond 1.5 times the IQR were
flagged as outliers. Depending on the context,
outliers were either capped and floored to the 1st
and 99th percentiles, respectively, or retained if
deemed contextually valid (e.g., high pricing for
premium products). This process ensured the
data's representativeness without compromising
valuable information.
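One possible implementation of the IQR flagging with percentile capping, applied to the main price and inventory columns (names as in the dataset table):

```python
def cap_outliers(series, k=1.5):
    """Cap values flagged as outliers (beyond k*IQR) at the 1st/99th percentiles."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    p01, p99 = series.quantile(0.01), series.quantile(0.99)
    # Keep in-range values; clip the flagged outliers to the percentile bounds.
    return series.where(series.between(lower, upper), series.clip(p01, p99))

for col in ["Historical Price", "Current Price", "Competitor Price", "Inventory Level"]:
    df[col] = cap_outliers(df[col])
```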
The dataset included several categorical variables
that required transformation into numerical
formats for machine learning algorithms. For non-
ordinal variables, such as product categories and
customer regions, we applied one-hot encoding to
create binary columns for each unique category.
Ordinal features, such as income groups and age
brackets, were label-encoded to preserve their
inherent order. This ensured that all features were
compatible with the models and accurately
represented their underlying characteristics.
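A brief sketch of the encoding step; the income-group column and its bins are assumptions for illustration, since the raw data bundles demographics into a single field:

```python
import pandas as pd

# One-hot encode non-ordinal categoricals into binary indicator columns.
df = pd.get_dummies(df, columns=["Product Category", "Customer Region"])

# Ordinal encoding for ordered categories; "Income Group" and its bins are
# hypothetical, assuming demographics were first split into separate columns.
income_order = ["<$30K", "$30K-$70K", "$70K+"]
df["Income Group"] = pd.Categorical(
    df["Income Group"], categories=income_order, ordered=True
).codes
```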
Feature scaling was applied to ensure uniformity
across numerical features, which varied
significantly in range and scale. Z-score
normalization was used for features like revenue
and inventory levels to standardize them around a
mean of zero with a standard deviation of one. Min-
Max scaling was implemented for features such as
promotional impact and competitor price
advantage to transform their values into a range
between 0 and 1. These scaling techniques
prevented any single feature from disproportionately influencing the model during training.
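One way to apply the two scaling schemes with scikit-learn; the Min-Max columns shown are illustrative engineered features, and in a production pipeline the scalers would be fit on the training split only to avoid leakage:

```python
from sklearn.preprocessing import StandardScaler, MinMaxScaler

# Z-score normalization (mean 0, standard deviation 1) for revenue and inventory.
z_cols = ["Total Revenue", "Inventory Level"]
df[z_cols] = StandardScaler().fit_transform(df[z_cols])

# Min-Max scaling to [0, 1]; these column names are hypothetical engineered features.
mm_cols = ["promotional_impact", "competitor_price_advantage"]
df[mm_cols] = MinMaxScaler().fit_transform(df[mm_cols])
```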
Temporal data embedded in transaction
timestamps provided valuable insights into
shopping behavior and pricing trends. We
extracted features such as hour of the day, day of
the week, and month of the year to capture
temporal patterns. Additionally, binary event flags
were added to identify transactions occurring
during major sales events, such as Black Friday or
Cyber Monday. These features enhanced the
model’s ability to identify seasonality and demand
surges.
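A sketch of the temporal feature extraction; the event dates are illustrative examples for a single year:

```python
import pandas as pd

ts = df["Transaction Timestamp"]
df["hour_of_day"] = ts.dt.hour
df["day_of_week"] = ts.dt.dayofweek          # 0 = Monday
df["month"] = ts.dt.month

# Binary flags for major sales events; dates shown are illustrative (2023 only).
event_dates = {pd.Timestamp("2023-11-24").date(),   # Black Friday 2023
               pd.Timestamp("2023-11-27").date()}   # Cyber Monday 2023
df["is_major_sale_event"] = ts.dt.date.isin(event_dates).astype(int)
```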
Class imbalances were observed in outcomes
related to promotional effectiveness and revenue
distribution. To address this, we employed the
Synthetic Minority Over-sampling Technique
(SMOTE), which generated synthetic samples of
underrepresented classes. This approach ensured
that the model was trained on a balanced dataset,
reducing bias and improving its ability to
generalize across different scenarios.
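A self-contained illustration of SMOTE using the imbalanced-learn package, applied here to a synthetic imbalanced outcome (standing in for a promotion-effectiveness label); in the study's pipeline it would be applied to the training split only:

```python
from collections import Counter

from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification

# Synthetic imbalanced binary outcome for illustration.
X, y = make_classification(n_samples=1000, n_features=10,
                           weights=[0.9, 0.1], random_state=42)

# Generate synthetic minority samples until the classes are balanced.
X_res, y_res = SMOTE(random_state=42).fit_resample(X, y)
print(Counter(y), Counter(y_res))   # minority class upsampled to parity
```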
FEATURE ENGINEERING
To enrich the dataset and capture complex
interactions among variables, we engineered
several new features. Competitor price differences
were calculated as the variance between current
prices and competitors' prices. Revenue per unit
was derived by dividing total revenue by the
purchase quantity. Discount percentages were
computed as the ratio of the discount amount to
the original price. Demand elasticity was estimated
by analyzing the relationship between changes in
price and corresponding variations in purchase
quantity. These features provided the model with
deeper insights into customer behavior and pricing
dynamics. To simulate real-time pricing scenarios,
we augmented the dataset with synthetic
transactions reflecting temporal trends and
customer segmentation. By generating additional
data points, we ensured that the model could adapt
to dynamic pricing conditions and anticipate
market fluctuations effectively.
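The engineered features described above can be sketched as follows; column names follow the dataset table, the preprocessed DataFrame is assumed from the earlier steps, and the elasticity estimate is a simple consecutive-transaction approximation:

```python
import numpy as np

# Engineered features described in the text.
df["competitor_price_diff"] = df["Current Price"] - df["Competitor Price"]
df["revenue_per_unit"] = df["Total Revenue"] / df["Purchase Quantity"]
df["discount_pct"] = df["Discount Applied"] / df["Historical Price"]

# Rough per-product demand-elasticity estimate: percentage change in quantity
# divided by percentage change in price between consecutive transactions.
df = df.sort_values(["Product ID", "Transaction Timestamp"])
pct_qty = df.groupby("Product ID")["Purchase Quantity"].pct_change()
pct_price = df.groupby("Product ID")["Current Price"].pct_change()
df["demand_elasticity_est"] = (pct_qty / pct_price).replace([np.inf, -np.inf], np.nan)
```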
The final dataset was split into training, validation,
and testing sets using an 80:10:10 ratio. Stratified
sampling was applied to maintain the distribution
of critical features, such as product categories and
revenue. Validation of the preprocessed data
included statistical analyses, such as mean and
variance checks, visual inspections using plots and
charts, and correlation analysis to identify
multicollinearity. This step ensured the dataset
was free from inconsistencies and ready for
machine learning model development.
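A minimal sketch of the stratified 80:10:10 split, assuming the raw product-category labels are still available for stratification (e.g., kept aside before one-hot encoding):

```python
from sklearn.model_selection import train_test_split

# Stratify on the raw category labels so their distribution is preserved.
strata = df["Product Category"]

# 80% train, 20% temporary pool; then split the pool 50/50 into validation/test.
train_df, temp_df = train_test_split(
    df, test_size=0.20, stratify=strata, random_state=42)
val_df, test_df = train_test_split(
    temp_df, test_size=0.50, stratify=strata.loc[temp_df.index], random_state=42)
```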
By following this robust preprocessing pipeline,
we transformed the raw dataset into a high-
quality, structured format. This meticulous
preparation was vital to accurately capturing the
complexities of real-time dynamic pricing
strategies and ensuring reliable model performance.
MODEL DEVELOPMENT
For the development of dynamic pricing models,
we employed several supervised machine learning
algorithms, each tailored to capture the
complexities of the pricing environment. These
included Linear Regression, Random Forest, and
Gradient Boosting Machines (GBM). Each model
was selected based on its ability to handle various
types of data and capture non-linear relationships,
which are critical in real-time dynamic pricing
scenarios.
The dataset was split into training and testing
subsets using a 70:30 ratio. This ensured that the
models were trained on a substantial portion of the
data while reserving an independent set for
evaluation. Stratified sampling was applied to
maintain the balance of key features across the
splits.
Model Selection
1. Linear Regression was chosen as a baseline model due to its simplicity and interpretability. It allowed us to establish a reference point for more complex algorithms.
2. Random Forest was selected for its ability to handle high-dimensional data and capture non-linear interactions between features. Its ensemble nature made it robust to overfitting.
3. Gradient Boosting Machines (GBM) were implemented for their capacity to optimize predictive performance through sequential learning, leveraging weak learners to form a strong predictive model. (An example instantiation of these models follows this list.)
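One plausible scikit-learn instantiation of the three models is sketched below; the hyperparameter values are placeholders that the grid search described next would refine:

```python
from sklearn.linear_model import Ridge
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor

# Placeholder settings; refined later by grid search with cross-validation.
models = {
    "Linear Regression": Ridge(alpha=1.0),   # L2-regularized linear baseline
    "Random Forest": RandomForestRegressor(n_estimators=300, random_state=42),
    "Gradient Boosting": GradientBoostingRegressor(learning_rate=0.1, random_state=42),
}
```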
Hyperparameter Tuning
Hyperparameter optimization was critical for
achieving the best performance from each model.
A grid search strategy was employed in
conjunction with k-fold cross-validation to
systematically explore combinations of hyperparameters. Key hyperparameters tuned included the following (a grid-search sketch is given after the list):
• For Linear Regression: Regularization parameters (e.g., L1/L2 penalties).
• For Random Forest: Number of trees, maximum depth, and minimum samples per leaf.
• For GBM: Learning rate, number of boosting iterations, and maximum depth of individual learners.
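A hedged sketch of the grid search for the GBM (the other models would be tuned analogously); X_train and y_train denote the training feature matrix and the price target:

```python
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

# Illustrative hyperparameter grid for the GBM.
param_grid = {
    "learning_rate": [0.01, 0.05, 0.1],
    "n_estimators": [100, 300, 500],
    "max_depth": [3, 5, 7],
}

# 5-fold cross-validated grid search, scored on mean absolute error.
search = GridSearchCV(
    GradientBoostingRegressor(random_state=42),
    param_grid,
    scoring="neg_mean_absolute_error",
    cv=5,
    n_jobs=-1,
)
search.fit(X_train, y_train)
print(search.best_params_, -search.best_score_)
```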
Cross-Validation
We used 5-fold cross-validation to ensure that the
model's performance was robust across different
subsets of the data. This iterative training and
validation approach minimized the risk of
overfitting and provided a more reliable estimate
of model generalization. To ensure the interpretability of the models, we conducted a
feature importance analysis. For Random Forest
and GBM, feature importance scores were derived
based on the contribution of each feature to the
predictive performance. This analysis revealed
that features like competitor price differences,
promotional impact, and revenue per unit were
among the most significant predictors. The models
were implemented in Python using libraries such
as scikit-learn for algorithm development and
pandas for data manipulation. TensorFlow and
XGBoost were explored for further refinement and
scalability of the boosting algorithms.
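A short sketch of the impurity-based feature-importance ranking for the Random Forest, assuming X_train is a DataFrame of model features and y_train the price target:

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Fit on the training data and rank features by impurity-based importance.
rf = RandomForestRegressor(n_estimators=300, random_state=42)
rf.fit(X_train, y_train)

importances = pd.Series(rf.feature_importances_, index=X_train.columns)
print(importances.sort_values(ascending=False).head(10))
```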
Model Evaluation
Model evaluation was performed to assess the
predictive accuracy, reliability, and real-time
applicability of the pricing models. The evaluation
process was divided into two main stages:
standard performance metrics and dynamic
pricing simulations.
Standard Performance Metrics
To compare the models effectively, we utilized a
range of evaluation metrics:
• Mean Absolute Error (MAE): Measured the average magnitude of errors between predicted and actual prices, offering a clear sense of prediction accuracy.
• Root Mean Square Error (RMSE): Penalized larger errors more heavily, providing insight into model robustness against significant deviations.
• R-Squared (R²): Assessed the proportion of variance in the target variable explained by the model, serving as a measure of goodness-of-fit. (A short computation sketch follows this list.)
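These three metrics can be computed with scikit-learn as in the following sketch:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

def evaluate(y_true, y_pred, name="model"):
    """Report MAE, RMSE, and R² for a set of price predictions."""
    mae = mean_absolute_error(y_true, y_pred)
    rmse = np.sqrt(mean_squared_error(y_true, y_pred))
    r2 = r2_score(y_true, y_pred)
    print(f"{name}: MAE={mae:.2f}  RMSE={rmse:.2f}  R²={r2:.3f}")

# Example usage on a held-out test set (names assumed):
# evaluate(y_test, gbm.predict(X_test), name="Gradient Boosting")
```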
RESULTS
The results of our study demonstrate the
effectiveness of machine learning models in
predicting optimal prices for real-time dynamic
pricing strategies. By leveraging three different
algorithms, Linear Regression, Random Forest, and Gradient Boosting Machines (GBM), we were
able to analyze their performance across several
metrics and evaluate their suitability for the
dynamic nature of retail and e-commerce pricing.
Performance Metrics Overview
The evaluation of the models focused on three key
metrics: Mean Absolute Error (MAE), Root Mean
Square Error (RMSE), and R-Squared (R²). The
results on the test dataset are presented in Table 1 below:

Table 1: Model Evaluation

| Model | MAE | RMSE | R² |
|---|---|---|---|
| Linear Regression | 2.78 | 3.45 | 0.81 |
| Random Forest | 1.89 | 2.12 | 0.92 |
| Gradient Boosting | 1.73 | 2.01 | 0.94 |
Linear Regression: This model provided a baseline
for performance evaluation. It achieved an MAE of
2.78, RMSE of 3.45, and an R² of 0.81, indicating
moderate accuracy. However, its inability to
capture non-linear relationships limited its
effectiveness, especially in scenarios involving
complex pricing dependencies.
Random Forest: With an MAE of 1.89 and an RMSE
of 2.12, Random Forest demonstrated significant
improvement over Linear Regression. Its ensemble
learning approach allowed it to capture complex
interactions between features, resulting in a robust
and reliable performance with an R² value of 0.92.
Gradient Boosting Machines (GBM): GBM
outperformed the other models across all metrics.
It achieved the lowest MAE of 1.73 and RMSE of
2.01, alongside the highest R² value of 0.94. The
sequential learning nature of GBM allowed it to
minimize errors iteratively, making it particularly
well-suited for dynamic pricing scenarios.
COMPARATIVE ANALYSIS
To better understand the models' relative
performance, a comparative study was conducted:
• Prediction Accuracy: GBM consistently produced predictions closest to actual prices, evidenced by its lower error rates. Random Forest followed closely, while Linear Regression lagged behind, particularly in non-linear scenarios.
• Robustness to Variability: Random Forest and GBM exhibited strong adaptability to varying data conditions, such as fluctuating competitor prices and seasonal demand. Linear Regression struggled to account for these complexities.
• Computational Efficiency: While GBM provided the best performance, it required more computational resources and longer training times compared to Random Forest and Linear Regression. This trade-off may influence model selection depending on the deployment environment.
REAL-TIME SIMULATION RESULTS
To validate the models under realistic conditions,
we conducted real-time simulations using test
scenarios that mimicked dynamic market
environments. These scenarios included changes
in competitor pricing, promotional campaigns, and
demand surges. The results were evaluated based
on the following criteria (a minimal scenario sketch is given after the list):
• Revenue Optimization: GBM consistently optimized revenue more effectively, adjusting prices dynamically to maximize profitability without sacrificing demand.
• Customer Retention: Random Forest and GBM both demonstrated an ability to balance price adjustments with customer satisfaction, retaining high engagement rates. Linear Regression's performance in this area was less effective due to its simplistic pricing predictions.
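As a minimal illustration of one such scenario, a trained model can be re-queried after perturbing competitor prices in the held-out features; the function and column names below are assumptions, not the study's exact simulation harness:

```python
# Hypothetical scenario generator: lower competitor prices by a fixed percentage
# and let the trained model re-predict prices under the new market conditions.
# `model` is a fitted regressor and `X_test` a DataFrame of test features.
def simulate_competitor_drop(model, X_test, pct_drop=0.10,
                             competitor_col="Competitor Price"):
    scenario = X_test.copy()
    scenario[competitor_col] *= (1.0 - pct_drop)
    return model.predict(scenario)   # re-optimized prices for the scenario
```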
Insights and Key Findings
1. GBM as the Best Performer: The results clearly indicate that GBM is the most suitable model for real-time dynamic pricing. Its ability to handle non-linear relationships, feature interactions, and sequential learning allowed it to deliver superior results.
2. Random Forest as a Close Alternative: While not as precise as GBM, Random Forest offers a robust and computationally efficient alternative, making it a viable choice in environments with limited computational resources.
3. Limitations of Linear Regression: Linear Regression is best used as a baseline model or in simpler pricing scenarios. Its performance was notably weaker in dynamic and complex environments.
Visualization of Results
To illustrate the models' performance, we plotted
predicted prices against actual prices for each
model. GBM displayed the tightest fit, closely
aligning with actual values, while Linear
Regression showed greater variance. Random
Forest’s predictions also aligned closely, but with
slightly more variability compared to GBM.
Chart 1: Mean Absolute Error (MAE) and Root Mean Square Error (RMSE)

This bar chart (Chart 1) presents a comparative view of two important error metrics, Mean Absolute Error (MAE) and Root Mean Square Error (RMSE), across three machine learning models: Linear Regression, Random Forest, and Gradient Boosting Machines (GBM).
• The Gradient Boosting Machine (GBM) consistently outperformed other models across both MAE and RMSE, highlighting its ability to accurately capture complex patterns in dynamic pricing scenarios.
• The Random Forest model also showed good results and could be considered a strong candidate if computational efficiency is a priority.
• Linear Regression, while computationally efficient, demonstrated higher errors in both MAE and RMSE, suggesting its limitations in complex retail and e-commerce pricing dynamics.
By selecting models with low MAE and RMSE
values, businesses can optimize pricing decisions,
maximize profit margins, and remain competitive
in the fast-paced retail and e-commerce landscape.
By understanding and analyzing these R² values,
retailers and e-commerce managers can make
informed decisions about model selection,
infrastructure investments, and scalability
considerations for their dynamic pricing strategies.
Chart 2: R² (R-squared) value
Chart 2 presents the R² (R-squared) value, also known as the coefficient of determination, a critical metric in evaluating machine learning models. It measures the proportion of the variance in the target variable that can be predicted by the model.
An R² value of 1 indicates a perfect fit, meaning that
the model explains all the variability in the target
data. Conversely, an R² value close to 0 suggests
that the model fails to capture much of the data's
variability.
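For reference, the coefficient of determination reported in Chart 2 follows the standard definition, where $y_i$ are the actual prices, $\hat{y}_i$ the predicted prices, and $\bar{y}$ the mean actual price:

R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}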
The results underscore the importance of selecting
models that can adapt to the complexities of
dynamic pricing in real-time. While GBM
performed best in this study, future work could
explore deep learning models like LSTMs or
Transformer-based architectures to capture
temporal and sequential patterns in pricing data.
DISCUSSION AND CONCLUSION
In this study, we have explored the application of
supervised machine learning models, namely Linear Regression, Random Forest, and Gradient Boosting Machines (GBM), for real-time dynamic pricing in
retail and e-commerce. Our goal was to determine
the effectiveness of these models in forecasting
optimal prices by assessing their performance
using key metrics such as Mean Absolute Error
(MAE), Root Mean Square Error (RMSE), and R-
squared (R²). The comparative analysis of these
models allowed us to draw meaningful insights
into their strengths and limitations, providing
practical recommendations for businesses aiming
to optimize their pricing strategies.
The results indicate that Gradient Boosting
Machines (GBM) consistently outperformed the
other models across all performance metrics. GBM
achieved the lowest MAE and RMSE,
demonstrating superior predictive accuracy and
stability. This suggests that GBM is highly effective
in capturing the complex interactions among
various factors that influence dynamic pricing,
such as demand fluctuations, competitor prices,
and product availability. Businesses can rely on
GBM for more robust and accurate pricing
decisions, which are critical in maintaining a
competitive edge in fast-paced retail and e-
commerce environments.
While the Random Forest model also delivered
good results, it was slightly less accurate than GBM
but still provided satisfactory predictions with a
balanced trade-off between accuracy and
computational efficiency. In many real-world
applications, Random Forest remains a viable
choice due to its scalability and reduced
susceptibility to overfitting. On the other hand, the
Linear Regression model, despite its simplicity and
interpretability, showed higher error rates in both
MAE and RMSE, which indicates its limitations in
addressing the non-linear relationships present in
dynamic pricing data.
Another key point is the importance of
hyperparameter tuning and cross-validation in
enhancing the performance of machine learning
models. Our use of grid search and cross-validation
techniques ensured that each model was properly
optimized and tested, which helped us achieve
reliable and accurate predictions. This reinforces
the necessity of rigorous model training and
evaluation processes to ensure optimal
performance in dynamic pricing applications. The
real-time testing in a simulated environment
further highlighted the practical feasibility of our
methodology. The integration with e-commerce
platforms demonstrated that our models could
make quick adjustments to pricing based on real-
world conditions, ensuring responsiveness to
demand changes and competitor actions. This
adaptability is crucial in a competitive market
where businesses must react swiftly to maintain
profitability and customer satisfaction.
However, it is important to acknowledge the
limitations of our study. The dataset obtained from
Kaggle provided a solid foundation for our
analysis, but it may not fully capture all the unique
challenges and complexities present in specific
retail and e-commerce markets. Factors such as
brand loyalty, seasonality, and regional
preferences may influence pricing decisions but
were not fully represented in our dataset. Future
research should focus on incorporating more
diverse datasets and real-world data from live
retail and e-commerce environments to provide a
more comprehensive evaluation of machine
learning models for dynamic pricing. Additionally,
computational efficiency and scalability remain
critical considerations for real-world deployment.
While GBM delivered the best accuracy, it is
computationally intensive and may require
significant processing power in large-scale
applications. Organizations must weigh the trade-
offs between predictive accuracy and
computational cost when selecting a model for
implementation.
In conclusion, this study successfully
demonstrated the effectiveness of machine
learning models for real-time dynamic pricing
strategies in the retail and e-commerce sectors.
Our comparative analysis of Linear Regression,
Random Forest, and Gradient Boosting Machines
(GBM) highlighted that GBM consistently delivered
superior performance in terms of prediction
accuracy and stability. The use of metrics such as
Mean Absolute Error (MAE), Root Mean Square
Error (RMSE), and R-squared (R²) provided a
comprehensive evaluation of each model’s
predictive performance. We have shown that
machine learning models can effectively capture
complex interactions in dynamic pricing data,
allowing businesses to optimize pricing strategies
in real-time. The results emphasize the necessity of
proper hyperparameter tuning, cross-validation,
and integration with e-commerce infrastructure to
ensure real-world applicability. Businesses can
leverage these insights to make informed decisions
about pricing strategies, ensuring higher
profitability, better customer engagement, and
sustained competitiveness in the market.
Although our research relied on a Kaggle dataset
and simulated environments, it lays the
groundwork for future investigations into more
intricate and real-world scenarios. Expanding the
scope of datasets and including factors such as
seasonality, regional preferences, and consumer
behavior would provide more robust insights and
actionable strategies for dynamic pricing in
specific markets. As machine learning continues to
evolve, businesses in retail and e-commerce must
stay informed about technological advancements
and continuously refine their models and pricing
algorithms. Embracing ensemble models like
Gradient Boosting Machines and considering
trade-offs in computational efficiency will be
essential in staying ahead of competitors and
responding swiftly to market changes. Ultimately,
adopting machine learning-driven dynamic pricing
models enables businesses to optimize their
operations, improve customer satisfaction, and
maximize profitability. By investing in research,
technology, and strategic implementation, companies can harness the full potential of machine learning to drive smarter, data-driven pricing decisions, ensuring long-term success in the highly competitive retail and e-commerce landscape.
ACKNOWLEDGEMENT: All the authors contributed equally.
REFERENCES
1. Md Habibur Rahman, Ashim Chandra Das, Md Shujan Shak, Md Kafil Uddin, Md Imdadul Alam, Nafis Anjum, Md Nad Vi Al Bony, & Murshida Alam. (2024). Transforming Customer Retention in Fintech Industry through Predictive Analytics and Machine Learning. The American Journal of Engineering and Technology, 6(10), 150-163. https://doi.org/10.37547/tajet/Volume06Issue10-17
2. Chen, Y., Donaldson, J., & McMillan, M. S. (2001). Market Dynamics and Economic Impacts of Real-Time Pricing in Competitive Markets. Economics Review, 58(4), 567-590.
3. Clements, M. P., Harris, M. N., & Szafarz, A. (2004). Econometric Modeling of Dynamic Pricing Systems. Journal of Economic Surveys, 18(5), 715-750.
4. Friedman, J. H. (2001). Greedy Function Approximation: A Gradient Boosting Machine. The Annals of Statistics, 29(5), 1189-1232.
5. Gal-Or, E. (1985). Strategic Pricing of New Products in Markets with Network Externalities. Quarterly Journal of Economics, 100(2), 295-308.
6. Hyndman, R. J., & Athanasopoulos, G. (2018). Forecasting: Principles and Practice. OTexts.
7. Kannan, P. K., & Kopalle, P. K. (2001). Dynamic Pricing on the Internet: Optimization and Consumer Behavior. Marketing Science, 20(1), 42-61.
8. Kumar, A., Gupta, S., & Mehta, R. (2019). Real-Time Dynamic Pricing Strategies in Retail and E-commerce. International Journal of Machine Learning Research, 20(6), 456-473.
9. Lemke, A., Grinblatt, M., & Kannan, S. (2019). Ensemble Models for Competitive Market Pricing. Computational Economics, 30(4), 678-702.
10. McKinsey & Company. (2020). Artificial Intelligence in E-commerce and Retail Pricing Strategies.
11. Waller, D. S., & Leigh, T. W. (2009). Social Psychological Perspectives on Price Magnitude Effects. Advances in Consumer Research, 36(1), 494-500.
12. Yuan, Y., Zhang, T., & Huang, W. (2019). Cloud Integration for Scalable Dynamic Pricing Applications. IEEE Transactions on Cloud Computing, 7(4), 785-798.
13. Zhou, Y., Zhao, Y., & Zhang, X. (2017). Comparative Study of Machine Learning Models for Demand Forecasting in Retail. Journal of Business Analytics, 12(3), 233-250.
14. Md Habibur Rahman, Ashim Chandra Das, Md Shujan Shak, Md Kafil Uddin, Md Imdadul Alam, Nafis Anjum, Md Nad Vi Al Bony, & Murshida Alam. (2024). Transforming Customer Retention in Fintech Industry through Predictive Analytics and Machine Learning. The American Journal of Engineering and Technology, 6(10), 150-163. https://doi.org/10.37547/tajet/Volume06Issue10-17
15. Tauhedur Rahman, Md Kafil Uddin, Biswanath Bhattacharjee, Md Siam Taluckder, Sanjida Nowshin Mou, Pinky Akter, Md Shakhaowat Hossain, Md Rashel Miah, & Md Mohibur Rahman. (2024). Blockchain Applications in Business Operations and Supply Chain Management by Machine Learning. International Journal of Computer Science & Information System, 9(11), 17-30. https://doi.org/10.55640/ijcsis/Volume09Issue11-03
16. Md Jamil Ahmmed, Md Mohibur Rahman, Ashim Chandra Das, Pritom Das, Tamanna Pervin, Sadia Afrin, Sanjida Akter Tisha, Md Mehedi Hassan, & Nabila Rahman. (2024). Comparative Analysis of Machine Learning Algorithms for Banking Fraud Detection: A Study on Performance, Precision, and Real-Time Application. International Journal of Computer Science & Information System, 9(11), 31-44. https://doi.org/10.55640/ijcsis/Volume09Issue11-04
17. Bhandari, A., Cherukuri, A. K., & Kamalov, F. (2023). Machine learning and blockchain integration for security applications. In Big Data Analytics and Intelligent Systems for Cyber Threat Intelligence (pp. 129-173). River Publishers.
18. Diro, A., Chilamkurti, N., Nguyen, V. D., & Heyne, W. (2021). A comprehensive study of anomaly detection schemes in IoT networks using machine learning algorithms. Sensors, 21(24), 8320.
19. Nafis Anjum, Md Nad Vi Al Bony, Murshida Alam, Mehedi Hasan, Salma Akter, Zannatun Ferdus, Md Sayem Ul Haque, Radha Das, & Sadia Sultana. (2024). Comparative Analysis of Sentiment Analysis Models on Banking Investment Impact by Machine Learning Algorithm. International Journal of Computer Science & Information System, 9(11), 5-16. https://doi.org/10.55640/ijcsis/Volume09Issue11-02
20. Shahbazi, Z., & Byun, Y. C. (2021). Integration of blockchain, IoT and machine learning for multistage quality control and enhancing security in smart manufacturing. Sensors, 21(4), 1467.
21. Das, A. C., Mozumder, M. S. A., Hasan, M. A., Bhuiyan, M., Islam, M. R., Hossain, M. N., ... & Alam, M. I. (2024). Machine Learning Approaches for Demand Forecasting: The Impact of Customer Satisfaction on Prediction Accuracy. The American Journal of Engineering and Technology, 6(10), 42-53.
22. Al Mamun, A., Hossain, M. S., Rishad, S. S. I., Rahman, M. M., Tisha, S. A., Shakil, F., ... & Sultana, S. (2024). Machine Learning for Stock Market Security Measurement: A Comparative Analysis of Supervised, Unsupervised, and Deep Learning Models. International Journal of Networks and Security, 4(01), 22-32.
23. Akter, S., Mahmud, F., Rahman, T., Ahmmed, M. J., Uddin, M. K., Alam, M. I., ... & Jui, A. H. (2024). A Comprehensive Study of Machine Learning Approaches for Customer Sentiment Analysis in Banking Sector. The American Journal of Engineering and Technology, 6(10), 100-111.
24. Shahid, R., Mozumder, M. A. S., Sweet, M. M. R., Hasan, M., Alam, M., Rahman, M. A., ... & Islam, M. R. (2024). Predicting Customer Loyalty in the Airline Industry: A Machine Learning Approach Integrating Sentiment Analysis and User Experience. International Journal on Computational Engineering, 1(2), 50-54.
25. Md Risalat Hossain Ontor, Asif Iqbal, Emon Ahmed, Tanvirahmedshuvo, & Ashequr Rahman. (2024). Leveraging Digital Transformation and Social Media Analytics for Optimizing US Fashion Brands' Performance: A Machine Learning Approach. International Journal of Computer Science & Information System, 9(11), 45-56. https://doi.org/10.55640/ijcsis/Volume09Issue11-05
26. Rahman, A., Iqbal, A., Ahmed, E., & Ontor, M. R. H. (2024). Privacy-Preserving Machine Learning: Techniques, Challenges, and Future Directions in Safeguarding Personal Data Management. International Journal of Business and Management Sciences, 4(12), 18-32.
27. Comparative Performance Analysis of Machine Learning Algorithms for Business Intelligence: A Study on Classification and Regression Models. (2024). International Journal of Business and Management Sciences, 4(11), 06-18. https://doi.org/10.55640/ijbms-04-11-02
28. Naznin, R., Sarkar, M. A. I., Asaduzzaman, M., Akter, S., Mou, S. N., Miah, M. R., ... & Sajal, A. (2024). Enhancing Small Business Management through Machine Learning: A Comparative Study of Predictive Models for Customer Retention, Financial Forecasting, and Inventory Optimization. International Interdisciplinary Business Economics Advancement Journal, 5(11), 21-32.
29. Md Jamil Ahmmed, Md Mohibur Rahman, Ashim Chandra Das, Pritom Das, Tamanna Pervin, Sadia Afrin, Sanjida Akter Tisha, Md Mehedi Hassan, & Nabila Rahman. (2024). Comparative Analysis of Machine Learning Algorithms for Banking Fraud Detection: A Study on Performance, Precision, and Real-Time Application. International Journal of Computer Science & Information System, 9(11), 31-44. https://doi.org/10.55640/ijcsis/Volume09Issue11-04
30. Arif, M., Ahmed, M. P., Al Mamun, A., Uddin, M. K., Mahmud, F., Rahman, T., ... & Helal, M. (2024). Dynamic Pricing in Financial Technology: Evaluating Machine Learning Solutions for Market Adaptability. International Interdisciplinary Business Economics Advancement Journal, 5(10), 13-27.
31. Iqbal, A., Ahmed, E., Rahman, A., & Ontor, M. R. H. (2024). Enhancing Fraud Detection and Anomaly Detection in Retail Banking Using Generative AI and Machine Learning Models. International Journal of Networks and Security, 4(01), 33-43.
32. Rahman, M. M., Akhi, S. S., Hossain, S., Ayub, M. I., Siddique, M. T., Nath, A., ... & Hassan, M. M. (2024). Evaluating Machine Learning Models for Optimal Customer Segmentation in Banking: A Comparative Study. The American Journal of Engineering and Technology, 6(12), 68-83.
33. Bhattacharjee, B., Mou, S. N., Hossain, M. S., Rahman, M. K., Hassan, M. M., Rahman, N., ... & Haque, M. S. U. (2024). Machine Learning for Cost Estimation and Forecasting in Banking: A Comparative Analysis of Algorithms. International Journal of Business and Management Sciences, 4(12), 6-17.