THE USA JOURNALS
THE AMERICAN JOURNAL OF ENGINEERING AND TECHNOLOGY (ISSN 2689-0984)
VOLUME 06 ISSUE 10
https://www.theamericanjournals.com/index.php/tajet
PUBLISHED DATE: 28-10-2024
DOI: https://doi.org/10.37547/tajet/Volume06Issue10-17
PAGE NO.: 150-163
TRANSFORMING CUSTOMER RETENTION IN
FINTECH INDUSTRY THROUGH PREDICTIVE
ANALYTICS AND MACHINE LEARNING
Md Habibur Rahman
Department of Business Administration, International American University, Los
Angeles, California, USA
Ashim Chandra Das
Master of Science in Information Technology, Washington University of Science
and Technology, USA
Md Shujan Shak
Master of Science in Information Technology, Washington University of Science
and Technology, USA
Md Kafil Uddin
Dahlkemper School of Business, Gannon University, USA
Md Imdadul Alam
Master of Science in Financial Analysis, Fox School of Business, Temple University,
USA
Nafis Anjum
College of Technology and Engineering, Westcliff University, Irvine, USA
Md Nad Vi Al Bony
Department of Business Administration, International American University, Los
Angeles, USA
Murshida Alam
Department of Business Administration, Westcliff University, Irvine, California,
USA
Md Mehedi Hassan
Master of Science in Information Technology, Washington University of Science
and Technology, USA
RESEARCH ARTICLE
Open Access
INTRODUCTION
In recent years, the fintech industry has
experienced rapid growth, driven by technological
advancements and evolving consumer
expectations. Fintech companies offer innovative
financial services, such as digital banking,
investment platforms, and payment solutions,
catering to the needs of a tech-savvy customer
base. However, as competition intensifies,
customer retention has emerged as a critical
challenge for these companies. According to a
study by Ransom (2021), acquiring a new
customer can cost five times more than retaining
an existing one, making it imperative for fintech
organizations to focus on strategies that enhance
customer loyalty.
The sector's growth has fundamentally transformed
how individuals and businesses access and manage
financial services.
Characterized by the integration of technology
with financial services, fintech encompasses a
wide array of offerings, including digital banking,
peer-to-peer lending, robo-advisory services, and
payment processing. As of 2023, the global fintech
market was valued at approximately $309 billion
and is projected to reach around $1.5 trillion by
2030, according to a report by Fortune Business
Insights. This remarkable growth is largely
attributed to advancements in digital technology,
increasing smartphone penetration, and a growing
consumer preference for online financial solutions.
Moreover, the COVID-19 pandemic accelerated the
adoption of digital financial services, as consumers
sought contactless transactions and remote
banking options.
Customer churn, defined as the rate at which
customers discontinue their relationship with a
company, poses significant threats to profitability
and growth in the fintech sector. The annual churn
rate for fintech companies can be as high as 20-
30%, according to the 2022 report by Accenture.
Understanding the factors influencing customer
retention is essential for developing effective
strategies to mitigate churn and foster long-term
relationships with clients. In this context, machine
learning has emerged as a powerful tool for
analyzing customer behavior and predicting churn,
enabling organizations to implement proactive
measures to retain valuable customers (Bashir et
al., 2021).
Despite the promising landscape, fintech
companies face significant challenges, particularly
concerning customer retention. With the
burgeoning number of competitors in the market,
retaining customers has become a daunting task
for fintech firms. Customers have numerous
alternatives at their disposal, which drives the
churn described above. At the annual churn rates of
20-30% reported by Accenture, and with acquisition
costing roughly five times as much as retention
(Ransom, 2021), effective retention strategies are
critical. As such, understanding the
factors that drive customer loyalty and
implementing strategies to enhance customer
retention is essential for the long-term success of
fintech companies.
Customer retention is influenced by various
factors, including customer satisfaction,
engagement, perceived value, and service quality.
Studies indicate that a high level of customer
satisfaction correlates strongly with increased
loyalty, making it essential for fintech companies
to prioritize customer experience in their service
offerings. Dewan et al. (2020) found that effective
engagement strategies, such as personalized
communication and tailored product offerings,
significantly enhance customer satisfaction and
retention rates. Additionally, the importance of
understanding customer behavior cannot be
overstated. Insights derived from customer
interactions and preferences allow fintech
companies to customize their services and respond
proactively to customer needs.
In this context, machine learning has emerged as a
powerful tool for predicting customer behavior
and enhancing retention strategies. By analyzing
vast amounts of customer data, machine learning
algorithms can identify patterns that signal
potential churn, enabling organizations to
intervene before customers decide to leave.
Numerous studies have highlighted the
effectiveness of machine learning techniques in
predicting churn across various industries,
including banking and telecommunications. For
instance, Bashir et al. (2021) demonstrated the
utility of random forest models in predicting
customer churn in a mobile banking app, reporting
an accuracy of 85%. These findings suggest
that leveraging machine learning can provide
fintech companies with actionable insights to
refine their retention strategies.
The application of machine learning in churn
prediction goes beyond merely identifying at-risk
customers; it also helps in understanding the
underlying factors that contribute to churn.
Feature importance analysis can reveal which
customer attributes, such as transaction
frequency, account balance, and engagement
metrics, are most indicative of churn risk. By
identifying these key factors, fintech companies
can develop targeted retention strategies tailored
to specific customer segments. For example,
customers identified as at high risk of churn could
be offered personalized promotions, enhanced
customer support, or loyalty rewards to
incentivize continued engagement.
The significance of this research lies in its potential
to provide fintech companies with a structured
methodology for utilizing machine learning to
address the challenge of customer churn. By
systematically analyzing customer data and
deploying predictive models, fintech organizations
can proactively engage with customers, enhance
their experience, and ultimately foster loyalty. This
research aims to bridge the gap between
theoretical knowledge and practical application,
offering a comprehensive framework for
developing customer retention strategies in the
fintech sector.
LITERATURE REVIEW
The literature on customer retention in fintech is
expanding, highlighting various approaches to
understanding and addressing churn. Studies have
shown that a range of factors influences customer
retention in financial services, including customer
satisfaction, service quality, and engagement
(Dewan et al., 2020). For instance, Chen et al.
(2021) found that customer satisfaction is a strong
predictor of retention, emphasizing the need for
fintech companies to prioritize customer
experience.
Machine learning has gained traction in recent
years as an effective method for predicting
customer behavior and identifying churn patterns.
Several studies have demonstrated the efficacy of
machine learning models in predicting churn in
various industries, including fintech. For example,
Bashir et al. (2021) utilized a combination of
logistic regression and random forest models to
predict customer churn in a mobile banking app,
achieving an accuracy of 85%. Their findings
suggest that engagement metrics and transaction
history are critical indicators of churn risk.
Furthermore, Wang et al. (2022) explored the use
of gradient boosting machines for churn prediction
in digital financial services. Their research
revealed that gradient boosting outperformed
traditional methods, such as logistic regression, in
terms of accuracy and interpretability. The authors
highlighted the importance of feature selection in
enhancing model performance, indicating that
factors like transaction frequency and customer
demographics significantly impact churn rates.
Understanding the key factors that influence
customer retention is crucial for developing
targeted retention strategies. Research indicates
that customers with low engagement levels are
more likely to churn. For instance, Li et al. (2021)
found that reduced usage of mobile banking
applications, coupled with negative customer
feedback, was a strong predictor of churn. Their
study suggests that proactive engagement
strategies, such as personalized offers and timely
support, can mitigate churn risk.
Another study by Weng et al. (2023) examined the
role of customer demographics in predicting
churn. They found that younger customers,
particularly those aged 18-30, were more likely to
leave fintech platforms due to perceived lack of
value and engagement. The authors argue that
tailored marketing strategies targeting this
demographic can help retain young customers.
The existing literature underscores the importance
of understanding customer behavior and
leveraging machine learning techniques to predict
churn in the fintech industry. As competition
intensifies, fintech companies must prioritize
customer retention through data-driven strategies
that address the factors influencing churn. The
integration of machine learning in analyzing
customer data offers promising avenues for
developing actionable retention strategies,
ultimately enhancing customer loyalty and driving
growth in the sector.
METHODOLOGY
Our methodology for developing customer
retention strategies in fintech using machine
learning is structured into a comprehensive, multi-
phased process. It encompasses several key stages,
ranging from data collection to model deployment,
with a focus on churn prediction and the
identification of factors influencing customer
retention. This section outlines our systematic
approach to data acquisition, preparation, model
development, validation, and the design of
actionable retention strategies.
1. Data Collection and Sources
The first critical step in our research involves
collecting a diverse and extensive dataset to
capture the full spectrum of customer behavior.
Given the data-driven nature of machine learning
models, we focus on acquiring comprehensive and
high-quality customer data from various fintech
platforms.
We collect a combination of structured and
unstructured data. The structured data includes
transactional records (e.g., deposits, withdrawals,
and purchases), customer demographics (age,
location, income), and subscription status. The
unstructured data includes customer reviews,
complaints, and engagement metrics (app usage
patterns, clickstreams, login frequency). Our data
comes from multiple reliable sources, including
internal fintech databases, customer relationship
management (CRM) systems, app analytics
platforms, and surveys. We also integrate data
from third-party services that provide market
behavior insights.
Where applicable, we collaborate with fintech
organizations to access anonymized customer
datasets. To ensure our data captures both short-
term and long-term trends, we collect customer
information spanning 12 to 24 months. This time
period allows us to account for seasonal variations,
such as peak periods of usage or common churn
intervals. We focus on collecting data at regular
intervals (daily, weekly, and monthly) to observe
behavior changes over time.
Given the sensitive
nature of financial data, we adhere to strict data
privacy protocols. All collected data complies with
regulatory requirements, such as the General Data
Protection Regulation (GDPR) and the California
Consumer Privacy Act (CCPA). We ensure
customer anonymity by removing personally
identifiable information (PII) and encrypting data
to secure it during storage and transmission.
2. Data Preprocessing and Transformation
Data preprocessing is a critical stage where raw
data is transformed into a clean, usable format
suitable for machine learning analysis. This phase
includes cleaning, normalizing, encoding, and
engineering new features.
A. Data Cleaning: We perform comprehensive
cleaning to address missing, inconsistent, or
incorrect entries in the dataset. Missing values are
managed using imputation techniques, such as
mean, median, or mode imputation for numerical
data, or forward/backward filling for time-series
data. Outliers are handled by identifying and
capping extreme values, or, where appropriate,
removing them entirely from the dataset to avoid
skewing model performance.
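As an illustration of the cleaning step, the following is a minimal sketch of median imputation and outlier capping in plain Python. It is not the code used in this study: the function names, the sample amounts, and the fixed capping bounds are assumptions made for the example (in practice the bounds might instead be derived from percentiles of the training data).

```python
from statistics import median


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def cap_outliers(values, lower, upper):
    """Clip every value into the closed interval [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]


# Hypothetical monthly transaction amounts; None marks a missing entry.
amounts = [120.0, None, 95.0, 110.0, 5000.0, None, 105.0]

imputed = impute_median(amounts)                      # missing -> 110.0
capped = cap_outliers(imputed, lower=0.0, upper=500.0)  # 5000.0 -> 500.0
```

The median is used here rather than the mean because a single extreme value (5000.0 in the sample) would otherwise pull the imputed value upward.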
B. Data Normalization and Scaling: To ensure that
machine learning algorithms perform optimally,
we normalize or standardize numerical features to
eliminate any biases caused by the scale of the
data. For instance, variables such as transaction
amounts or time spent on the app are scaled to fit
within the same range, ensuring that no single
feature disproportionately affects the model
outcomes.
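The two scaling options mentioned above can be sketched as follows. This is an illustrative example under stated assumptions, not the study's implementation; the helper names and the sample feature values are hypothetical.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature carries no signal
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical "minutes spent in the app per week" for five customers.
minutes_in_app = [5.0, 15.0, 25.0, 35.0, 45.0]

scaled = min_max_scale(minutes_in_app)
z_scores = standardize(minutes_in_app)
```

When a model is evaluated on held-out data, the minimum, maximum, mean, and standard deviation should be computed on the training split only and then reused for the test split, so that no information leaks from the test data.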
C. Encoding Categorical Variables: Categorical
features, such as customer location or subscription
type, are encoded using techniques like one-hot
encoding or label encoding. This transformation
enables machine learning algorithms to interpret
categorical data appropriately.
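A minimal sketch of one-hot encoding is shown below for illustration. The function name and the sample subscription tiers are assumptions introduced for the example and are not taken from the study's dataset.

```python
def one_hot_encode(values):
    """Return (categories, rows); each row has a single 1 in its category's column."""
    categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        rows.append(row)
    return categories, rows


# Hypothetical subscription tiers for four customers.
subscription_type = ["free", "premium", "free", "basic"]
categories, encoded = one_hot_encode(subscription_type)
```

One-hot encoding avoids implying an order between categories, which label encoding does implicitly; label encoding is therefore better suited to ordinal features or to tree-based models.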
D. Feature Engineering: We introduce additional,
derived features to enhance the predictive power
of our models. These features include:
• Customer Lifetime Value (CLV): A measure of the
total revenue a customer is expected to generate
over their time with the fintech platform.
• Engagement Metrics: Features such as daily app
usage, frequency of financial transactions, and time
intervals between logins are calculated to measure
customer engagement.
• Churn Indicators: Metrics like customer
inactivity (number of days since the last login or
transaction) or reduced engagement (lower
transaction frequency) serve as early warning
signals for churn.
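The inactivity and login-interval features described above can be derived from raw activity dates as in the following sketch. The function names, the sample dates, and the 30-day flag are illustrative assumptions, not the study's feature definitions.

```python
from datetime import date


def days_since_last_activity(activity_dates, as_of):
    """Number of days between the most recent activity and the reference date."""
    return (as_of - max(activity_dates)).days


def mean_gap_between_logins(login_dates):
    """Average number of days between consecutive logins (None if fewer than two)."""
    ordered = sorted(login_dates)
    gaps = [(b - a).days for a, b in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps) if gaps else None


# Hypothetical login history for one customer.
logins = [date(2024, 1, 2), date(2024, 1, 9), date(2024, 1, 30)]
as_of = date(2024, 3, 1)

inactivity_days = days_since_last_activity(logins, as_of)
avg_gap = mean_gap_between_logins(logins)
inactive_flag = inactivity_days > 30  # assumed threshold for an early-warning flag
```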
E. Dimensionality Reduction: In cases where we
are working with high-dimensional datasets, we
employ dimensionality reduction techniques like
Principal Component Analysis (PCA) or t-SNE to
simplify the data while retaining the most
important information. This step helps improve
the efficiency and performance of our machine
learning models.
3. Model Development and Selection
After data preprocessing, we begin the model
development phase, where we apply various
machine learning algorithms to predict customer
churn and identify factors affecting retention. We
take a multi-model approach to find the best-
performing predictive model.
A. Churn Prediction Models: We experiment with
several machine learning algorithms to model
customer churn:
• Logistic Regression: This interpretable, baseline
classification model provides initial insights into
the likelihood of customer churn. It offers clear
coefficients that help identify key factors
contributing to churn.
• Decision Trees and Random Forest: We employ
these tree-based models for their ability to handle
non-linear relationships and capture feature
importance. Random Forest, as an ensemble
method, aggregates multiple decision trees to
increase accuracy and reduce overfitting.
• Gradient Boosting Machines (GBM): GBM
models, including XGBoost and LightGBM, are used
to boost the performance of weak learners through
iterative training, producing high-accuracy
predictions.
• Neural Networks and Deep Learning: For more
complex data with deep interrelationships, we
apply artificial neural networks (ANNs) to model
non-linear patterns. Convolutional Neural
Networks (CNNs) and Long Short-Term Memory
(LSTM) networks may be employed depending on
the data structure (e.g., temporal patterns).
• Support Vector Machines (SVM): SVM models are
used for cases where the dataset is highly
imbalanced or when margin maximization
between churned and non-churned customers is
critical.
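To make the baseline model concrete, the following is a minimal sketch of logistic regression fitted by batch gradient descent, written in plain Python so that it is self-contained. It is not the implementation used in this study; the two features, the toy data, and the learning rate and epoch count are assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the average log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this sample: p(churn) - label
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Estimated churn probability for one feature vector."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Toy features scaled to [0, 1]: [inactivity, transaction frequency]; label 1 = churned.
X = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.1], [0.2, 0.9], [0.1, 0.8], [0.3, 0.7]]
y = [1, 1, 1, 0, 0, 0]

w, b = fit_logistic(X, y)
p_inactive = predict_proba(w, b, [0.85, 0.15])
p_active = predict_proba(w, b, [0.15, 0.85])
```

The sign of each fitted coefficient is what makes this model interpretable: on the toy data, the inactivity weight is positive (more inactivity raises churn probability) and the transaction-frequency weight is negative.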
B. Cross-Validation and Hyperparameter Tuning:
We use techniques like K-fold cross-validation to
ensure our models are generalizable and not
overfitted to the training data. To optimize model
performance, we perform hyperparameter tuning
using grid search or random search techniques,
refining parameters such as learning rates,
regularization strengths, and tree depths.
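The splitting logic behind K-fold cross-validation can be sketched as follows. This is an illustrative example only: the function name is hypothetical, and the folds are contiguous for simplicity, whereas a practical setup would typically shuffle the data first and stratify by the churn label.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous, near-equal folds."""
    # The first (n_samples % k) folds receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, test_idx
        start = stop


folds = list(k_fold_indices(n_samples=10, k=3))
for train_idx, test_idx in folds:
    # In a real run, a model would be fitted on train_idx and scored on test_idx here.
    pass
```

Each sample appears in exactly one test fold, so averaging the k test scores gives an estimate of out-of-sample performance; hyperparameter candidates from a grid or random search would be compared on that average.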
C. Feature Importance and Selection: In tree-based
models like Random Forest and Gradient Boosting,
we leverage feature importance metrics to rank
the most influential variables. Recursive Feature
Elimination (RFE) is used to iteratively remove
less important features and refine the model’s
focus on key drivers of churn.
4. Model Evaluation and Performance Metrics
Model evaluation is critical for assessing the
effectiveness of our churn prediction models. We
employ several metrics and validation techniques
to ensure the accuracy and robustness of our
models.
A. Accuracy, Precision, and Recall: Accuracy
measures overall performance, while precision
(true positives / all predicted positives) and recall
(true positives / actual positives) are crucial for
balancing false positives and false negatives in
churn prediction.
B. F1 Score: The F1 score, the harmonic mean of
precision and recall, provides a balanced
measure, which is particularly important when
dealing with imbalanced datasets, where one class
(churned or retained customers) is
underrepresented.
C. ROC Curve and AUC: The Receiver Operating
Characteristic (ROC) curve plots the true positive
rate against the false positive rate. The Area Under
the ROC Curve (AUC) quantifies the model's ability
to distinguish between churned and non-churned
customers. A higher AUC score indicates better
performance.
D. Confusion Matrix: The confusion matrix
provides a detailed breakdown of the model’s
predictions, indicating true positives, false
positives, true negatives, and false negatives. This
allows us to fine-tune the model to minimize
misclassification errors.
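The metrics defined above follow directly from the four confusion-matrix counts, as the sketch below illustrates. The function names and the sample labels are assumptions made for the example; churned customers are encoded as 1 and retained customers as 0.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) with churn (1) as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels for ten customers: 4 churned, 6 retained.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

counts = confusion_counts(y_true, y_pred)
metrics = classification_metrics(y_true, y_pred)
```

In this sample, one churner is missed (a false negative) and one retained customer is flagged (a false positive), so accuracy is 0.8 while precision and recall are both 0.75.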
5. Key Factor Identification and Insights
Once the churn prediction model is developed and
validated, we focus on understanding the key
drivers of customer churn and retention. This
stage involves both quantitative and qualitative
analysis to derive actionable insights:
• Feature Importance Ranking: Using models such
as Random Forest and GBM, we rank features
based on their relative contribution to churn
prediction. Factors like customer engagement,
frequency of transactions, and subscription type
are identified as crucial predictors of churn.
• Correlation and Regression Analysis: To further
explore relationships between features, we
conduct correlation analysis to examine how
strongly different variables (e.g., customer
satisfaction scores, transaction volumes) correlate
with churn. We also perform regression analysis to
model the linear relationships between key factors
and customer retention rates.
• Customer Segmentation: We apply clustering
techniques (e.g., K-means clustering) to segment
customers based on behavior patterns and risk
profiles. This segmentation allows us to identify
different customer types, such as highly engaged
users versus those at high risk of churn, and tailor
retention strategies accordingly.
• Survival Analysis: In addition to churn prediction,
we perform survival analysis to estimate the
expected time a customer will remain active before
churning. Techniques such as Kaplan-Meier
survival curves help us understand churn
probabilities over time and inform retention
strategies based on customer longevity.
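For the survival analysis step, a minimal sketch of the Kaplan-Meier estimator is given below. It is an illustration under stated assumptions rather than the study's code: the function name and the sample tenures are hypothetical, and customers who were still active at the end of the observation window are treated as censored.

```python
def kaplan_meier(durations, churned):
    """Return [(time, survival_probability)] at each time with at least one churn.

    durations: time each customer was observed (e.g., months active).
    churned:   1 if churn was observed at that time, 0 if censored.
    """
    events = sorted(zip(durations, churned))
    at_risk = len(events)
    survival = 1.0
    curve = []
    i = 0
    while i < len(events):
        t = events[i][0]
        churn_count = 0
        leaving = 0
        # Group all customers whose observation ends at time t.
        while i < len(events) and events[i][0] == t:
            churn_count += events[i][1]
            leaving += 1
            i += 1
        if churn_count:
            survival *= 1 - churn_count / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # churned and censored customers both leave the risk set
    return curve


# Hypothetical tenures in months for eight customers.
durations = [2, 3, 3, 5, 8, 8, 12, 12]
churned = [1, 1, 0, 1, 1, 0, 0, 0]

km_curve = kaplan_meier(durations, churned)
```

On this sample the estimated probability of remaining active falls to 0.45 by month 8; censored customers reduce the number at risk without counting as churn events.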
6. Deployment, Monitoring, and Optimization
Once our models are finalized, we move into the
deployment phase, where we integrate predictive
models into fintech platforms for real-time churn
detection and customer engagement.
• Real-Time Integration: Our models are deployed
into the fintech platform’s infrastructure, where
they operate in real-time to analyze customer
behavior and predict churn risks. This involves
building automated pipelines that flag customers
at risk of churning, triggering immediate retention
interventions.
• Model Retraining and Adaptation: Customer
behaviors evolve over time, requiring periodic
model retraining to maintain predictive accuracy.
We set up automated processes for model updates,
ensuring that new data feeds into the model and
retrains it at regular intervals.
• A/B Testing of Retention Strategies: We validate
our retention interventions through A/B testing.
Customers identified as high-risk are divided into
control and experimental groups, where different
retention strategies (e.g., personalized offers,
enhanced support) are tested. We analyze the
effectiveness of these strategies by comparing
churn rates between the groups.
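One common way to compare churn rates between a control and an experimental group is a two-proportion z-test, sketched below. The choice of this particular test is an assumption made for illustration (the study does not name a specific test), and the group sizes and churn counts are hypothetical.

```python
import math


def two_proportion_z_test(churn_a, n_a, churn_b, n_b):
    """Return (z, two-sided p-value) for the difference in churn rates a - b."""
    p_a, p_b = churn_a / n_a, churn_b / n_b
    pooled = (churn_a + churn_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    p_value = 2 * (1 - cdf)
    return z, p_value


# Hypothetical outcome: 120 of 1000 control customers churned,
# versus 90 of 1000 customers who received a personalized offer.
z, p_value = two_proportion_z_test(120, 1000, 90, 1000)
```

With these sample counts the churn rate drops from 12% to 9%, and the two-sided p-value is below 0.05, which would conventionally be read as evidence that the intervention reduced churn.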
7. Development of Retention Strategies
Based on the insights gained from the churn
prediction models and key factor identification, we
design targeted retention strategies to enhance
customer loyalty and reduce churn. These
strategies include:
• Personalized Engagement: Using churn
predictions, we personalize outreach efforts, such
as sending tailored offers, discounts, or
personalized product recommendations to
customers at risk of churning.
• Loyalty Programs and Incentives: We design
loyalty programs that reward frequent app usage,
high transaction volumes, or long-term
engagement. Offering tiered rewards based on
customer lifetime value can encourage users to
remain active on the platform.
• Enhanced Customer Support: Customers
identified as high-risk are given priority access to
customer support, ensuring their concerns are
addressed promptly. Proactive communication
strategies, such as follow-up calls or satisfaction
surveys, can prevent dissatisfaction from leading
to churn.
8. Ethical Considerations
Our research acknowledges the ethical
implications of using customer data for predictive
modeling.
Privacy and Data Security: We prioritize customer
privacy by ensuring all data collection and
processing adheres to strict ethical standards and
legal regulations, such as GDPR and CCPA.
Anonymization techniques are applied to protect
customer identities.
Fairness and Bias Mitigation: Machine learning
models are prone to biases, especially when data
reflects underlying societal inequalities. We
actively monitor our models for potential biases
against demographic groups and adjust feature
selection and modeling techniques to ensure
fairness and inclusivity.
Our methodology combines a comprehensive
approach to data analysis, predictive modeling,
and the development of customer retention
strategies in the fintech sector. Through advanced
machine learning techniques, we aim to accurately
predict customer churn, identify key factors
driving retention, and implement actionable
strategies that enhance user experience and foster
long-term customer loyalty.
RESULTS
Our results section presents a comprehensive
analysis of the performance of various machine
learning models applied to customer churn
prediction and retention in fintech apps. The
analysis covers model accuracy, feature
importance, and the identification of key factors
influencing churn. We also provide insights into
which algorithm performed best based on a set of
performance metrics, including precision, recall,
F1 score, AUC, and ROC curve.
1. Data Summary and Exploration
Before diving into the performance of the machine
learning models, we begin by summarizing the key
aspects of our dataset. The customer data included
structured information such as:
A. Demographics (age, location, income)
B. Transaction records (deposits, withdrawals,
purchases)
C. Engagement metrics (frequency of app usage,
time spent on the app)
D. Churn indicators (days since last login,
frequency of interactions)
Unstructured data such as customer reviews,
complaints, and surveys were also transformed
into meaningful variables through natural
language processing techniques.
A preliminary exploration of the data revealed
significant churn patterns tied to engagement
metrics, subscription status, and customer
inactivity. Customers who had lower transaction
volumes or reduced login frequency over a three-
month period were more likely to churn. Seasonal
trends in the data indicated increased churn
during low-transaction months.
2. Model Performance Evaluation
We trained and evaluated multiple machine
learning models to determine which one was most
effective in predicting customer churn. The models
included Logistic Regression, Decision Trees,
Random Forest, Gradient Boosting Machines
(GBM), Support Vector Machines (SVM), and
Neural Networks. Each model was evaluated based
on a combination of performance metrics,
including:
I. Accuracy: The proportion of correct predictions
out of all predictions made.
II. Precision: The proportion of true positive
predictions (correct churn predictions) relative to
all positive predictions.
III. Recall: The proportion of actual positive
instances (churned customers) that were correctly
identified.
IV. F1 Score: The harmonic mean of precision and
recall, particularly useful for imbalanced datasets.
V. AUC-ROC: The Area Under the ROC Curve, which
indicates the model's ability to distinguish
between churned and non-churned customers.
The table below summarizes the performance of the models.
Model                        Accuracy  Precision  Recall  F1 Score  AUC-ROC
Logistic Regression          0.78      0.74       0.69    0.71      0.82
Decision Tree                0.81      0.76       0.75    0.75      0.83
Random Forest                0.85      0.81       0.78    0.79      0.88
Gradient Boosting (XGBoost)  0.87      0.83       0.80    0.81      0.90
Support Vector Machine       0.83      0.79       0.77    0.78      0.85
Neural Networks              0.84      0.80       0.79    0.79      0.86
Chart 1: Result Visualization
3. Best Performing Algorithm: Gradient
Boosting (XGBoost)
Based on the evaluation, Gradient Boosting
Machines (XGBoost) emerged as the best-
performing algorithm for customer churn
prediction. It outperformed other models in terms
of overall accuracy (87%), precision (83%), recall
(80%), F1 score (81%), and AUC-ROC (0.90). These
metrics highlight the model's strong predictive
capabilities, particularly in identifying customers
at risk of churn.
The success of XGBoost can be attributed to its
ability to capture non-linear relationships in the
data, handle imbalanced datasets, and model
complex interactions between features. The
iterative boosting process strengthens the model's
ability to make more accurate predictions by
focusing on difficult-to-classify instances.
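As a consistency check on the reported results, the F1 score of each model can be recomputed from its reported precision and recall using F1 = 2PR / (P + R). The sketch below does this with the values from the performance table; the variable and function names are hypothetical.

```python
# (precision, recall, reported F1) taken from the model performance table.
reported = {
    "Logistic Regression": (0.74, 0.69, 0.71),
    "Decision Tree": (0.76, 0.75, 0.75),
    "Random Forest": (0.81, 0.78, 0.79),
    "Gradient Boosting (XGBoost)": (0.83, 0.80, 0.81),
    "Support Vector Machine": (0.79, 0.77, 0.78),
    "Neural Networks": (0.80, 0.79, 0.79),
}


def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


recomputed = {name: round(f1_from(p, r), 2) for name, (p, r, _) in reported.items()}

# Models whose recomputed F1 differs from the reported value after rounding.
mismatches = [name for name, (_, _, f1) in reported.items()
              if abs(recomputed[name] - f1) > 1e-9]
```

For XGBoost, for example, 2 × 0.83 × 0.80 / (0.83 + 0.80) ≈ 0.815, which agrees with the reported 0.81 at two decimal places; the same holds for the other five models.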
4. Feature Importance Analysis
A key advantage of using tree-based models like
Random Forest and Gradient Boosting is their
ability to rank the importance of features
contributing to customer churn. In the XGBoost
model, the following features were identified as
the most influential in predicting churn:
1. Customer Inactivity: The number of days since
the last transaction or login was the strongest
predictor of churn. Customers who had not
interacted with the fintech platform for over 30
days were more likely to churn.
2. Engagement Metrics: Features like daily app
usage, frequency of financial transactions, and time
spent on the app had a significant impact on churn
prediction. Lower engagement levels were highly
correlated with churn.
3. Customer Lifetime Value (CLV): Customers with
a lower predicted lifetime value were more likely
to churn, indicating that retention efforts should
focus on high-CLV customers.
4. Subscription Type: Subscription status or tier
(e.g., free vs. premium) also played a critical role.
Premium customers, while less likely to churn,
displayed early signs of churn through reduced
usage before discontinuing their subscriptions.
5. Demographics: Age and income levels were also
found to influence churn, with younger customers
and those with lower incomes being more likely to
leave the platform.
These insights are critical for developing
personalized retention strategies. For instance,
targeting high-value customers with low
engagement through tailored offers or
personalized product recommendations could
significantly reduce churn rates.
5. Comparison of Algorithms
While XGBoost performed best overall, other
models demonstrated specific strengths that may
be useful depending on the application:
I. Logistic Regression: Although it had lower
accuracy and recall, logistic regression's
interpretability makes it useful for identifying
straightforward relationships between features
and churn.
II. Decision Trees: These models provided an
intuitive way to visualize customer behavior
patterns and feature interactions, though they
tended to overfit the training data when not
controlled.
III. Random Forest: Slightly less accurate than
XGBoost, Random Forest still performed well
(85% accuracy) and offered valuable insights into
feature importance, making it a solid alternative
for use in less complex deployments.
IV. Support Vector Machines: SVM handled
imbalanced data reasonably well but struggled
with large feature sets, which reduced its overall
performance in this context.
V. Neural Networks: While neural networks
captured complex relationships, their
interpretability was limited. They were also
computationally expensive compared to tree-
based models like XGBoost and Random Forest.
6. AUC-ROC and Churn Probability Calibration
The ROC curves and AUC values provided
additional insights into the performance of our
models. XGBoost's AUC-ROC score of 0.90
demonstrated its superior ability to distinguish
between churned and non-churned customers.
This was particularly important in fintech
applications, where false positives (misclassifying
a retained customer as at risk of churn) can lead to
unnecessary retention efforts and costs.
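To make the AUC-ROC figure concrete, the sketch below computes AUC directly from its pairwise definition: the probability that a randomly chosen churner receives a higher score than a randomly chosen non-churner, with ties counted as one half. It also computes the false positive rate at a given decision threshold. The scores and the 0.5 threshold are illustrative assumptions rather than outputs of our models, and a library routine would normally be used instead of this quadratic-time version.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (churner, non-churner) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


def false_positive_rate(y_true, scores, threshold):
    """Share of retained customers flagged as churners at this threshold."""
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not negatives:
        return 0.0
    return sum(1 for s in negatives if s >= threshold) / len(negatives)


# Toy example: 2 churners, 3 retained customers.
y_true = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.6, 0.2, 0.1]

print(round(roc_auc(y_true, scores), 3))                   # 0.833
print(round(false_positive_rate(y_true, scores, 0.5), 3))  # 0.333
```

Raising the threshold lowers the false positive rate, and with it the cost of unnecessary retention offers, at the expense of missing more true churners.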
7. Key Insights for Retention Strategies
The results from our churn prediction models
directly inform our customer retention strategies.
By identifying key churn drivers, we can now
segment customers based on their churn risk and
engagement patterns. For example:
• High-Risk Customers: Customers identified as
having high churn probabilities can be targeted
with personalized retention campaigns, such as
special offers or priority customer support.
• Medium-Risk Customers: For customers showing
early signs of churn (e.g., reduced app usage), we
can deploy re-engagement strategies, such as
personalized notifications or loyalty rewards.
• Low-Risk Customers: Retained customers with
high engagement levels can be rewarded through
loyalty programs to encourage continued usage.
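The three-tier segmentation above can be sketched as a mapping from predicted churn probability to a retention action. The probability cut-offs (0.7 and 0.4) and the customer identifiers are hypothetical values chosen for illustration; this study does not prescribe specific thresholds.

```python
def risk_tier(churn_probability, high=0.7, medium=0.4):
    """Assign a risk tier from a churn probability in [0, 1].

    The default cut-offs are illustrative assumptions only.
    """
    if not 0.0 <= churn_probability <= 1.0:
        raise ValueError("churn_probability must be between 0 and 1")
    if churn_probability >= high:
        return "high"
    if churn_probability >= medium:
        return "medium"
    return "low"


# Actions paraphrase the strategies listed above.
ACTIONS = {
    "high": "personalized retention campaign (special offer or priority support)",
    "medium": "re-engagement (personalized notification or loyalty reward)",
    "low": "loyalty program reward",
}

# Hypothetical model outputs keyed by customer id.
predicted = {"cust_1": 0.85, "cust_2": 0.55, "cust_3": 0.10}
plan = {cid: ACTIONS[risk_tier(p)] for cid, p in predicted.items()}

for cid, action in plan.items():
    print(cid, "->", action)
```

Keeping the thresholds as parameters allows them to be adjusted to the campaign budget, since a lower `high` cut-off enlarges the most expensive segment.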
CONCLUSION AND DISCUSSION
In this study, we explored the development and
implementation of machine learning-driven
customer retention strategies within the fintech
sector, specifically focusing on churn prediction.
By employing various algorithms, including
Gradient Boosting Machines (XGBoost), we
identified critical factors influencing customer
retention and developed targeted strategies to
mitigate churn risks. Our research underscores the
potential of advanced analytics in transforming
customer engagement practices, leading to
improved customer loyalty and enhanced business
outcomes in fintech.
Our findings revealed that customer inactivity,
engagement metrics, customer lifetime value
(CLV), subscription type, and demographic factors
are paramount in predicting churn. Specifically,
customers with lower engagement levels or longer
periods of inactivity are significantly more likely to
discontinue their services. This insight enables
fintech organizations to prioritize their retention
efforts effectively, focusing on high-value
customers showing signs of disengagement.
The performance evaluation of different machine
learning models demonstrated that XGBoost
outperformed its counterparts across multiple
metrics, including accuracy, precision, recall, and
AUC-ROC. This highlights not only the importance
of selecting robust algorithms but also the
necessity of feature importance analysis to
understand customer behavior in detail. Such
insights can drive personalized retention
strategies, offering tailored solutions that cater to
individual customer needs and preferences.
The practical implications of our findings are
manifold. Fintech companies can leverage the
predictive capabilities of machine learning models
to create real-time customer engagement
strategies. By implementing automated systems
that trigger interventions based on churn
predictions, organizations can enhance customer
experiences and prevent potential losses. For
example, targeted retention campaigns for high-
risk customers can help maintain their
engagement, while incentives for medium-risk
customers can serve as re-engagement tools.
Additionally, our results suggest the need for an
adaptive approach to customer retention, where
models are routinely updated based on new data
and changing customer behaviors. By integrating
feedback mechanisms and employing adaptive
learning models, fintech companies can remain
responsive to evolving market conditions and
customer preferences, further refining their
retention strategies.
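One simple way to operationalize routine model updates is to monitor recall on recently labeled customers and flag the model for retraining when it falls below the level measured at deployment. The sketch below is an assumption about how such a check might look, not a procedure used in this study; the baseline recall, the tolerance, and the sample data are hypothetical.

```python
def recall(y_true, y_pred):
    """Share of actual churners (label 1) that the model flagged."""
    churners = [p for t, p in zip(y_true, y_pred) if t == 1]
    if not churners:
        return 0.0
    return sum(1 for p in churners if p == 1) / len(churners)


def needs_retraining(baseline_recall, y_true_recent, y_pred_recent,
                     tolerance=0.05):
    """Flag retraining when recent recall drops more than `tolerance`
    below the baseline. Both values are illustrative assumptions."""
    return recall(y_true_recent, y_pred_recent) < baseline_recall - tolerance


# Hypothetical recent outcomes: 4 churners, of which the model caught 2.
recent_true = [1, 1, 1, 1, 0, 0]
recent_pred = [1, 0, 0, 1, 0, 1]

print(recall(recent_true, recent_pred))                  # 0.5
print(needs_retraining(0.80, recent_true, recent_pred))  # True
```

A check of this kind only detects degradation after churn labels become available, so it complements rather than replaces scheduled retraining.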
Limitations and Future Research
Despite the strengths of our study, it is essential to
acknowledge its limitations. The reliance on
historical data can constrain the model's ability to
adapt to rapidly changing customer behaviors and
market dynamics. Moreover, while our analysis
highlighted key factors affecting churn, it may not
encompass all possible influences, such as
macroeconomic factors or changes in regulatory
landscapes.
Future research should explore the integration of
real-time data analytics and more nuanced
customer insights, such as sentiment analysis
derived from customer interactions. By
incorporating diverse data sources and refining
modeling techniques, subsequent studies can
enhance the accuracy of churn predictions and
develop more comprehensive retention strategies.
In conclusion, our research demonstrates that
machine learning offers powerful tools for
predicting customer churn and developing
actionable retention strategies in the fintech
industry. By understanding the critical factors
influencing customer behavior, fintech
organizations can adopt a proactive stance toward
customer engagement, ultimately fostering loyalty
and driving profitability. As the fintech landscape
continues to evolve, the adoption of advanced
analytics will play a pivotal role in shaping the
future of customer relationship management,
ensuring that businesses remain competitive and
responsive to their customers' needs.
Acknowledgement: All the authors contributed
equally.
REFERENCES
1. Modak, C., Ghosh, S. K., Sarkar, M. A. I., Sharif, M.
K., Arif, M., Bhuiyan, M., ... & Devi, S. (2024).
Machine Learning Model in Digital Marketing
Strategies for Customer Behavior: Harnessing
CNNs for Enhanced Customer Satisfaction and
Strategic Decision-Making. Journal of
Economics, Finance and Accounting Studies,
6(3), 178-186.
2. Shahid, R., Mozumder, M. A. S., Sweet, M. M. R.,
Hasan, M., Alam, M., Rahman, M. A., ... & Islam,
M. R. (2024). Predicting Customer Loyalty in
the Airline Industry: A Machine Learning
Approach Integrating Sentiment Analysis and
User Experience. International Journal on
Computational Engineering, 1(2), 50-54.
3. Chowdhury, M. S., Shak, M. S., Devi, S., Miah, M.
R., Al Mamun, A., Ahmed, E., ... & Mozumder, M.
S. A. (2024). Optimizing E-Commerce Pricing
Strategies: A Comparative Analysis of Machine
Learning Models for Predicting Customer
Satisfaction. The American Journal of
Engineering and Technology, 6(09), 6-17.
4. Md Abu Sayed, Badruddowza, Md Shohail
Uddin Sarker, Abdullah Al Mamun, Norun Nabi,
Fuad Mahmud, Md Khorshed Alam, Md Tarek
Hasan, Md Rashed Buiya, & Mashaeikh Zaman
Md. Eftakhar Choudhury. (2024).
COMPARATIVE ANALYSIS OF MACHINE
LEARNING ALGORITHMS FOR PREDICTING
CYBERSECURITY ATTACK SUCCESS: A
PERFORMANCE EVALUATION. The American
Journal of Engineering and Technology, 6(09),
81–91.
https://doi.org/10.37547/tajet/Volume06Issue09-10
5. Md Al-Imran, Salma Akter, Md Abu Sufian
Mozumder, Rowsan Jahan Bhuiyan, Tauhedur
Rahman, Md Jamil Ahmmed, Md Nazmul
Hossain Mir, Md Amit Hasan, Ashim Chandra
Das, & Md. Emran Hossen. (2024).
EVALUATING MACHINE LEARNING
ALGORITHMS FOR BREAST CANCER
DETECTION: A STUDY ON ACCURACY AND
PREDICTIVE PERFORMANCE. The American
Journal of Engineering and Technology, 6(09),
22–33.
https://doi.org/10.37547/tajet/Volume06Issue09-04
6. Md Murshid Reja Sweet, Md Parvez Ahmed, Md
Abu Sufian Mozumder, Md Arif, Md Salim
Chowdhury, Rowsan Jahan Bhuiyan, Tauhedur
Rahman, Md Jamil Ahmmed, Estak Ahmed, &
Md Atikul Islam Mamun. (2024).
COMPARATIVE ANALYSIS OF MACHINE
LEARNING TECHNIQUES FOR ACCURATE
LUNG CANCER PREDICTION. The American
Journal of Engineering and Technology, 6(09),
92–103.
https://doi.org/10.37547/tajet/Volume06Issue09-11
7. Bahl, S., Kumar, P., & Agarwal, A. (2021).
Sentiment analysis in banking services: A
review of techniques and challenges.
International Journal of Information
Management, 57, 102317.
8. Ashim Chandra Das, Md Shahin Alam
Mozumder, Md Amit Hasan, Maniruzzaman
Bhuiyan, Md Rasibul Islam, Md Nur Hossain,
Salma Akter, & Md Imdadul Alam. (2024).
MACHINE LEARNING APPROACHES FOR
DEMAND FORECASTING: THE IMPACT OF
CUSTOMER SATISFACTION ON PREDICTION
ACCURACY. The American Journal of
Engineering and Technology, 6(10), 42–53.
https://doi.org/10.37547/tajet/Volume06Issue10-06
9. Rowsan Jahan Bhuiyan, Salma Akter, Aftab
Uddin, Md Shujan Shak, Md Rasibul Islam, S M
Shadul Islam Rishad, Farzana Sultana, & Md.
Hasan-Or-Rashid. (2024). SENTIMENT
ANALYSIS OF CUSTOMER FEEDBACK IN THE
BANKING SECTOR: A COMPARATIVE STUDY
OF MACHINE LEARNING MODELS. The
American Journal of Engineering and
Technology, 6(10), 54–66.
https://doi.org/10.37547/tajet/Volume06Issue10-07
10. Awoyemi, J. O., Adetunmbi, A. O., & Oluwadare,
S. A. (2017). Credit card fraud detection using
machine learning techniques: A comparative
analysis. Journal of Applied Security Research,
12(4), 1–14.
https://doi.org/10.1080/19361610.2017.1315696
11. Bhowmik, D. (2019). Detecting financial fraud
using machine learning techniques.
International Journal of Data Science, 6(2),
102-121.
https://doi.org/10.1080/25775327.2019.1123126
12. Dal Pozzolo, A., Boracchi, G., Caelen, O., Alippi,
C., & Bontempi, G. (2015). Credit card fraud
detection: A realistic modeling and a novel
13. Bashir, A., Yousaf, M., & Awan, M. U. (2021).
Customer churn prediction in mobile banking:
A machine learning approach. International
Journal of Information Management, 58,
102303.
https://doi.org/10.1016/j.ijinfomgt.2021.102303
14. Chen, Y., Chen, H., & Liu, Y. (2021). The impact
of customer satisfaction on customer retention
in the fintech industry: A case study. Journal of
Financial Services Marketing, 26(2), 85-97.
https://doi.org/10.1057/s41264-021-00110-7
15. Dewan, S., Wu, D. J., & Trivedi, R. (2020).
Customer engagement in fintech: A study of
factors affecting retention. Journal of Banking
and Finance, 118, 105888.
https://doi.org/10.1016/j.jbankfin.2020.105888
16. Li, J., Liu, X., & Zhang, Y. (2021). Predicting
customer churn in mobile banking: A case
study of a Chinese fintech firm. Journal of
Retailing and Consumer Services, 59, 102393.
https://doi.org/10.1016/j.jretconser.2021.102393
17. Ransom, C. (2021). The cost of customer
acquisition versus retention in fintech.
Harvard Business Review. Retrieved from
https://hbr.org/2021/03/the-cost-of-customer-acquisition-versus-retention-in-fintech
18. Wang, Q., Zhang, H., & Liu, S. (2022). A gradient
boosting approach for customer churn
prediction in digital financial services. Expert
Systems with Applications, 208, 118267.
https://doi.org/10.1016/j.eswa.2022.118267