TRANSFORMING CUSTOMER RETENTION IN FINTECH INDUSTRY THROUGH PREDICTIVE ANALYTICS AND MACHINE LEARNING

Md Habibur Rahman; Ashim Chandra Das; Md Shujan Shak; Md Kafil Uddin; Md Imdadul Alam; Nafis Anjum; Md Nad Vi Al Bony; Murshida Alam; Md Mehedi Hassan

doi:10.37547/tajet/Volume06Issue10-17

Authors

Md Habibur Rahman
Department of Business Administration, International American University, Los Angeles, California, USA
Ashim Chandra Das
Master of Science in Information Technology, Washington University of Science and Technology, USA
Md Shujan Shak
Master of Science in Information Technology, Washington University of Science and Technology, USA
Md Kafil Uddin
Dahlkemper School of Business, Gannon University, USA
Md Imdadul Alam
Master of Science in Financial Analysis, Fox School of Business, Temple University, USA
Nafis Anjum
College of Technology and Engineering, Westcliff University, Irvine, USA
Md Nad Vi Al Bony
Department of Business Administration, International American University, Los Angeles, USA
Murshida Alam
Department of Business Administration, Westcliff University, Irvine, California, USA
Md Mehedi Hassan
Master of Science in Information Technology, Washington University of Science and Technology, USA

DOI:

https://doi.org/10.37547/tajet/Volume06Issue10-17

Keywords:

Customer Retention; Predictive Analytics; Machine Learning

Abstract

In recent years, the fintech industry has experienced rapid growth, driven by technological advancements and evolving consumer expectations. Fintech companies offer innovative financial services, such as digital banking, investment platforms, and payment solutions, catering to the needs of a tech-savvy customer base. However, as competition intensifies, customer retention has emerged as a critical challenge for these companies. According to a study by Ransom (2021), acquiring a new customer can cost five times more than retaining an existing one, making it imperative for fintech organizations to focus on strategies that enhance customer loyalty. The financial technology (fintech) sector has experienced unprecedented growth in recent years, fundamentally transforming how individuals and businesses access and manage financial services. Characterized by the integration of technology with financial services, fintech encompasses a wide array of offerings, including digital banking, peer-to-peer lending, robo-advisory services, and payment processing. As of 2023, the global fintech market was valued at approximately $309 billion and is projected to reach around $1.5 trillion by 2030, according to a report by Fortune Business Insights. This remarkable growth is largely attributed to advancements in digital technology, increasing smartphone penetration, and a growing consumer preference for online financial solutions. Moreover, the COVID-19 pandemic accelerated the adoption of digital financial services, as consumers sought contactless transactions and remote banking options.

ZENODO DOI:- https://doi.org/10.5281/zenodo.14008362

THE USA JOURNALS

THE AMERICAN JOURNAL OF ENGINEERING AND TECHNOLOGY (ISSN – 2689-0984)

VOLUME 06 ISSUE 10

150

https://www.theamericanjournals.com/index.php/tajet

PUBLISHED DATE: 28-10-2024
DOI: https://doi.org/10.37547/tajet/Volume06Issue10-17
PAGE NO.: 150-163


RESEARCH ARTICLE

Open Access


INTRODUCTION

In recent years, the fintech industry has experienced rapid growth, driven by technological advancements and evolving consumer expectations. Fintech companies offer innovative financial services, such as digital banking, investment platforms, and payment solutions, catering to the needs of a tech-savvy customer base. However, as competition intensifies, customer retention has emerged as a critical challenge for these companies. According to a study by Ransom (2021), acquiring a new customer can cost five times more than retaining an existing one, making it imperative for fintech organizations to focus on strategies that enhance customer loyalty. The financial technology (fintech) sector has experienced unprecedented growth in recent years, fundamentally transforming how individuals and businesses access and manage financial services. Characterized by the integration of technology with financial services, fintech encompasses a wide array of offerings, including digital banking, peer-to-peer lending, robo-advisory services, and payment processing. As of 2023, the global fintech market was valued at approximately $309 billion and is projected to reach around $1.5 trillion by 2030, according to a report by Fortune Business Insights. This remarkable growth is largely attributed to advancements in digital technology, increasing smartphone penetration, and a growing consumer preference for online financial solutions. Moreover, the COVID-19 pandemic accelerated the adoption of digital financial services, as consumers sought contactless transactions and remote banking options.

Customer churn, defined as the rate at which customers discontinue their relationship with a company, poses significant threats to profitability and growth in the fintech sector. The annual churn rate for fintech companies can be as high as 20-30%, according to the 2022 report by Accenture. Understanding the factors influencing customer retention is essential for developing effective strategies to mitigate churn and foster long-term relationships with clients. In this context, machine learning has emerged as a powerful tool for analyzing customer behavior and predicting churn, enabling organizations to implement proactive measures to retain valuable customers (Bashir et al., 2021).

Despite the promising landscape, fintech
companies face significant challenges, particularly
concerning customer retention. With the
burgeoning number of competitors in the market,
retaining customers has become a daunting task
for fintech firms. The competitive nature of the
industry means that customers have numerous
alternatives at their disposal, leading to a
phenomenon known as customer churn, which
refers to the loss of clients or customers over a
specified period. According to research by
Accenture, the average churn rate in the fintech
sector can be as high as 20-30% annually, a
statistic that underscores the critical need for
effective retention strategies. Furthermore, the
cost of acquiring a new customer can be five times
greater than that of retaining an existing one
(Ransom, 2021). As such, understanding the
factors that drive customer loyalty and
implementing strategies to enhance customer
retention is essential for the long-term success of
fintech companies.

Customer retention is influenced by various factors, including customer satisfaction, engagement, perceived value, and service quality. Studies indicate that a high level of customer satisfaction correlates strongly with increased loyalty, making it essential for fintech companies to prioritize customer experience in their service offerings. Dewan et al. (2020) found that effective engagement strategies, such as personalized communication and tailored product offerings, significantly enhance customer satisfaction and retention rates. Additionally, the importance of understanding customer behavior cannot be overstated. Insights derived from customer interactions and preferences allow fintech companies to customize their services and respond proactively to customer needs.

In this context, machine learning has emerged as a powerful tool for predicting customer behavior and enhancing retention strategies. By analyzing vast amounts of customer data, machine learning algorithms can identify patterns that signal potential churn, enabling organizations to intervene before customers decide to leave. Numerous studies have highlighted the effectiveness of machine learning techniques in predicting churn across various industries, including banking and telecommunications. For instance, Bashir et al. (2021) demonstrated the utility of random forest models in predicting customer churn in a mobile banking app, achieving impressive accuracy rates. These findings suggest that leveraging machine learning can provide fintech companies with actionable insights to refine their retention strategies.

The application of machine learning in churn prediction goes beyond merely identifying at-risk customers; it also helps in understanding the underlying factors that contribute to churn. Feature importance analysis can reveal which customer attributes, such as transaction frequency, account balance, and engagement metrics, are most indicative of churn risk. By identifying these key factors, fintech companies can develop targeted retention strategies tailored to specific customer segments. For example, customers identified as at high risk of churn could be offered personalized promotions, enhanced customer support, or loyalty rewards to incentivize continued engagement.


The significance of this research lies in its potential
to provide fintech companies with a structured
methodology for utilizing machine learning to
address the challenge of customer churn. By
systematically analyzing customer data and
deploying predictive models, fintech organizations
can proactively engage with customers, enhance
their experience, and ultimately foster loyalty. This
research aims to bridge the gap between
theoretical knowledge and practical application,
offering a comprehensive framework for
developing customer retention strategies in the
fintech sector.

LITERATURE REVIEW

The literature on customer retention in fintech is
expanding, highlighting various approaches to
understanding and addressing churn. Studies have
shown that a range of factors influences customer
retention in financial services, including customer
satisfaction, service quality, and engagement
(Dewan et al., 2020). For instance, Chen et al.
(2021) found that customer satisfaction is a strong
predictor of retention, emphasizing the need for
fintech companies to prioritize customer
experience.

Machine learning has gained traction in recent
years as an effective method for predicting
customer behavior and identifying churn patterns.
Several studies have demonstrated the efficacy of
machine learning models in predicting churn in
various industries, including fintech. For example,
Bashir et al. (2021) utilized a combination of
logistic regression and random forest models to
predict customer churn in a mobile banking app,
achieving an accuracy of 85%. Their findings
suggest that engagement metrics and transaction
history are critical indicators of churn risk.

Furthermore, Wang et al. (2022) explored the use of gradient boosting machines for churn prediction in digital financial services. Their research revealed that gradient boosting outperformed traditional methods, such as logistic regression, in terms of accuracy and interpretability. The authors highlighted the importance of feature selection in enhancing model performance, indicating that factors like transaction frequency and customer demographics significantly impact churn rates.

Understanding the key factors that influence
customer retention is crucial for developing
targeted retention strategies. Research indicates
that customers with low engagement levels are
more likely to churn. For instance, Li et al. (2021)
found that reduced usage of mobile banking
applications, coupled with negative customer
feedback, was a strong predictor of churn. Their
study suggests that proactive engagement
strategies, such as personalized offers and timely
support, can mitigate churn risk.

Another study by Weng et al. (2023) examined the
role of customer demographics in predicting
churn. They found that younger customers,
particularly those aged 18-30, were more likely to
leave fintech platforms due to perceived lack of
value and engagement. The authors argue that
tailored marketing strategies targeting this
demographic can help retain young customers.

The existing literature underscores the importance
of understanding customer behavior and
leveraging machine learning techniques to predict
churn in the fintech industry. As competition
intensifies, fintech companies must prioritize
customer retention through data-driven strategies
that address the factors influencing churn. The
integration of machine learning in analyzing
customer data offers promising avenues for
developing actionable retention strategies,
ultimately enhancing customer loyalty and driving
growth in the sector.

METHODOLOGY

Our methodology for developing customer retention strategies in fintech using machine learning is structured into a comprehensive, multi-phased process. It encompasses several key stages, ranging from data collection to model deployment, with a focus on churn prediction and the identification of factors influencing customer retention. This section outlines our systematic approach to data acquisition, preparation, model development, validation, and the design of actionable retention strategies.

1. Data Collection and Sources

The first critical step in our research involves
collecting a diverse and extensive dataset to
capture the full spectrum of customer behavior.
Given the data-driven nature of machine learning
models, we focus on acquiring comprehensive and
high-quality customer data from various fintech
platforms.

We collect a combination of structured and
unstructured data. The structured data includes
transactional records (e.g., deposits, withdrawals,
and purchases), customer demographics (age,
location, income), and subscription status. The
unstructured data includes customer reviews,
complaints, and engagement metrics (app usage
patterns, clickstreams, login frequency). Our data
comes from multiple reliable sources, including
internal fintech databases, customer relationship
management (CRM) systems, app analytics
platforms, and surveys. We also integrate data
from third-party services that provide market
behavior insights.

Where applicable, we collaborate with fintech organizations to access anonymized customer datasets. To ensure our data captures both short-term and long-term trends, we collect customer information spanning 12 to 24 months. This time period allows us to account for seasonal variations, such as peak periods of usage or common churn intervals. We focus on collecting data at regular intervals (daily, weekly, and monthly) to observe behavior changes over time.

Given the sensitive nature of financial data, we adhere to strict data privacy protocols. All collected data complies with regulatory requirements, such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA). We ensure customer anonymity by removing personally identifiable information (PII) and encrypting data to secure it during storage and transmission.
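As an illustration of the PII-removal step, the following is a minimal sketch and not the pipeline used in this study. The field names (`name`, `email`, `phone`, `customer_id`) and the secret key are hypothetical. It drops direct identifiers and replaces the customer ID with a keyed hash so that records can still be linked over time. Note that keyed hashing is pseudonymization rather than full anonymization, so the key would need to be stored separately from the data.

```python
import hashlib
import hmac

# Hypothetical direct identifiers; the paper does not list its schema.
PII_FIELDS = {"name", "email", "phone"}


def pseudonymize(record, secret_key):
    """Drop direct identifiers and replace the customer ID with a keyed hash."""
    cleaned = {k: v for k, v in record.items() if k not in PII_FIELDS}
    token = hmac.new(
        secret_key, str(record["customer_id"]).encode("utf-8"), hashlib.sha256
    ).hexdigest()
    cleaned["customer_id"] = token
    return cleaned


# Assumption: the key is managed outside the dataset (e.g., in a secrets store).
SECRET_KEY = b"replace-with-a-managed-secret"

raw = {
    "customer_id": 1042,
    "name": "Jane Doe",
    "email": "jane@example.com",
    "phone": "555-0100",
    "age": 29,
    "monthly_transactions": 14,
}
anon = pseudonymize(raw, SECRET_KEY)
print(anon)
```

The same customer always maps to the same token under a given key, which keeps longitudinal features (such as login frequency over 12 to 24 months) computable after identifiers are removed.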

2. Data Preprocessing and Transformation

Data preprocessing is a critical stage where raw
data is transformed into a clean, usable format
suitable for machine learning analysis. This phase
includes cleaning, normalizing, encoding, and
engineering new features.

A. Data Cleaning: We perform comprehensive
cleaning to address missing, inconsistent, or
incorrect entries in the dataset. Missing values are
managed using imputation techniques, such as
mean, median, or mode imputation for numerical
data, or forward/backward filling for time-series
data. Outliers are handled by identifying and
capping extreme values, or, where appropriate,
removing them entirely from the dataset to avoid
skewing model performance.

B. Data Normalization and Scaling: To ensure that
machine learning algorithms perform optimally,
we normalize or standardize numerical features to
eliminate any biases caused by the scale of the
data. For instance, variables such as transaction
amounts or time spent on the app are scaled to fit
within the same range, ensuring that no single
feature disproportionately affects the model
outcomes.

C. Encoding Categorical Variables: Categorical
features, such as customer location or subscription
type, are encoded using techniques like one-hot
encoding or label encoding. This transformation
enables machine learning algorithms to interpret
categorical data appropriately.

D. Feature Engineering: We introduce additional, derived features to enhance the predictive power of our models. These features include:

• Customer Lifetime Value (CLV): A measure of the total revenue a customer is expected to generate over their time with the fintech platform.

• Engagement Metrics: Features such as daily app usage, frequency of financial transactions, and time intervals between logins are calculated to measure customer engagement.

• Churn Indicators: Metrics like customer inactivity (number of days since the last login or transaction) or reduced engagement (lower transaction frequency) serve as early warning signals for churn.

E. Dimensionality Reduction: In cases where we
are working with high-dimensional datasets, we
employ dimensionality reduction techniques like
Principal Component Analysis (PCA) or t-SNE to
simplify the data while retaining the most
important information. This step helps improve
the efficiency and performance of our machine
learning models.
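The cleaning, scaling, encoding, and feature-engineering steps described in this section can be sketched on a toy dataset as follows. This is an illustrative sketch only: the records, field names, snapshot date, and outlier cap are all assumptions, and a production pipeline would more likely rely on a dedicated data-processing library than on plain Python.

```python
from datetime import date
from statistics import median

# Hypothetical toy records; field names are illustrative only.
customers = [
    {"monthly_spend": 120.0, "plan": "free", "last_login": date(2024, 5, 30)},
    {"monthly_spend": None, "plan": "premium", "last_login": date(2024, 3, 1)},
    {"monthly_spend": 80.0, "plan": "free", "last_login": date(2024, 5, 1)},
    {"monthly_spend": 5000.0, "plan": "premium", "last_login": date(2024, 5, 28)},
]
SNAPSHOT = date(2024, 6, 1)  # assumed observation date
CAP = 1000.0  # assumed upper cap for extreme values

# A. Cleaning: median imputation for missing values, then capping of outliers.
observed = [c["monthly_spend"] for c in customers if c["monthly_spend"] is not None]
fill_value = median(observed)
spend = [
    min(c["monthly_spend"] if c["monthly_spend"] is not None else fill_value, CAP)
    for c in customers
]

# B. Scaling: min-max normalization to the range [0, 1].
low, high = min(spend), max(spend)
scaled = [(s - low) / (high - low) for s in spend]

# C. Encoding: one-hot encoding of the categorical plan type.
plans = sorted({c["plan"] for c in customers})
one_hot = [[1 if c["plan"] == p else 0 for p in plans] for c in customers]

# D. Feature engineering: days since last login as a churn indicator.
days_inactive = [(SNAPSHOT - c["last_login"]).days for c in customers]

# Final feature rows: [scaled_spend, plan_free, plan_premium, days_inactive]
features = [[s, *oh, d] for s, oh, d in zip(scaled, one_hot, days_inactive)]
for row in features:
    print(row)
```

Capping is applied before scaling so that a single extreme value (5000.0 here) does not compress every other customer into a narrow band near zero.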

3. Model Development and Selection

After data preprocessing, we begin the model
development phase, where we apply various
machine learning algorithms to predict customer
churn and identify factors affecting retention. We
take a multi-model approach to find the best-
performing predictive model.

A. Churn Prediction Models: We experiment with several machine learning algorithms to model customer churn:

• Logistic Regression: This interpretable, baseline classification model provides initial insights into the likelihood of customer churn. It offers clear coefficients that help identify key factors contributing to churn.

• Decision Trees and Random Forest: We employ these tree-based models for their ability to handle non-linear relationships and capture feature importance. Random Forest, as an ensemble method, aggregates multiple decision trees to increase accuracy and reduce overfitting.

• Gradient Boosting Machines (GBM): GBM models, including XGBoost and LightGBM, are used to boost the performance of weak learners through iterative training, producing high-accuracy predictions.

• Neural Networks and Deep Learning: For more complex data with deep interrelationships, we apply artificial neural networks (ANNs) to model non-linear patterns. Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks may be employed depending on the data structure (e.g., temporal patterns).

• Support Vector Machines (SVM): SVM models are used for cases where the dataset is highly imbalanced or when margin maximization between churned and non-churned customers is critical.

B. Cross-Validation and Hyperparameter Tuning:
We use techniques like K-fold cross-validation to
ensure our models are generalizable and not
overfitted to the training data. To optimize model
performance, we perform hyperparameter tuning
using grid search or random search techniques,
refining parameters such as learning rates,
regularization strengths, and tree depths.

C. Feature Importance and Selection: In tree-based models like Random Forest and Gradient Boosting, we leverage feature importance metrics to rank the most influential variables. Recursive Feature Elimination (RFE) is used to iteratively remove less important features and refine the model's focus on key drivers of churn.
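The combination of K-fold cross-validation and grid search described above can be illustrated with a small, self-contained sketch. To keep it dependency-free, a one-feature k-nearest-neighbors classifier stands in for the models used in this study; the data, the function names, and the hyperparameter grid are hypothetical. The sketch assumes the rows are already shuffled, since the folds are taken as contiguous blocks.

```python
from statistics import mean


def k_fold_indices(n, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds over n rows."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, val
        start += size


def knn_predict(train_x, train_y, x, n_neighbors):
    """Majority vote among the n_neighbors closest training points (1-D feature)."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    votes = sum(train_y[i] for i in order[:n_neighbors])
    return 1 if votes * 2 > n_neighbors else 0


def cv_accuracy(xs, ys, n_neighbors, k=5):
    """Mean validation accuracy across k folds for one hyperparameter value."""
    scores = []
    for train, val in k_fold_indices(len(xs), k):
        train_x = [xs[i] for i in train]
        train_y = [ys[i] for i in train]
        correct = sum(
            knn_predict(train_x, train_y, xs[i], n_neighbors) == ys[i] for i in val
        )
        scores.append(correct / len(val))
    return mean(scores)


# Hypothetical toy data: days since last login and churn label (1 = churned).
days_inactive = [2, 45, 5, 60, 1, 38, 9, 52, 3, 41]
churned = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]

# Grid search: evaluate every candidate value with 5-fold cross-validation.
grid = [1, 3, 5]
scores = {n: cv_accuracy(days_inactive, churned, n, k=5) for n in grid}
best_n = max(grid, key=lambda n: scores[n])
print(scores, "best n_neighbors:", best_n)
```

Each candidate is scored only on rows it was not fitted on, which is what makes the comparison across hyperparameter values a fair estimate of generalization rather than of training fit.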

4. Model Evaluation and Performance Metrics

Model evaluation is critical for assessing the effectiveness of our churn prediction models. We employ several metrics and validation techniques to ensure the accuracy and robustness of our models.

A. Accuracy, Precision, and Recall: Accuracy
measures overall performance, while precision
(true positives / all predicted positives) and recall
(true positives / actual positives) are crucial for
balancing false positives and false negatives in
churn prediction.

B. F1 Score: The F1 score, the harmonic mean of precision and recall, is used to provide a balanced measure, particularly important when dealing with imbalanced datasets, where one class (churned or retained customers) is underrepresented.

C. ROC Curve and AUC: The Receiver Operating
Characteristic (ROC) curve plots the true positive
rate against the false positive rate. The Area Under
the ROC Curve (AUC) quantifies the model's ability
to distinguish between churned and non-churned
customers. A higher AUC score indicates better
performance.

D. Confusion Matrix: The confusion matrix provides a detailed breakdown of the model's predictions, indicating true positives, false positives, true negatives, and false negatives. This allows us to fine-tune the model to minimize misclassification errors.
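The metrics defined in this section can be computed directly from labels and model scores. The sketch below uses hypothetical labels and scores (1 = churned) and a decision threshold of 0.5, which is an assumption. AUC is computed from its rank interpretation: the probability that a randomly chosen churner receives a higher score than a randomly chosen non-churner.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 = churned."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true, scores):
    """AUC as the share of (churner, non-churner) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted churn probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, "AUC:", round(auc, 3))
```

Accuracy, precision, recall, and F1 all depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two kinds of metric are reported side by side.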

5. Key Factor Identification and Insights

Once the churn prediction model is developed and
validated, we focus on understanding the key
drivers of customer churn and retention. This
stage involves both quantitative and qualitative
analysis to derive actionable insights:

• Feature Importance Ranking: Using models such as Random Forest and GBM, we rank features based on their relative contribution to churn prediction. Factors like customer engagement, frequency of transactions, and subscription type are identified as crucial predictors of churn.

• Correlation and Regression Analysis: To further explore relationships between features, we conduct correlation analysis to examine how strongly different variables (e.g., customer satisfaction scores, transaction volumes) correlate with churn. We also perform regression analysis to model the linear relationships between key factors and customer retention rates.

• Customer Segmentation: We apply clustering techniques (e.g., K-means clustering) to segment customers based on behavior patterns and risk profiles. This segmentation allows us to identify different customer types, such as highly engaged users versus those at high risk of churn, and tailor retention strategies accordingly.

• Survival Analysis: In addition to churn prediction, we perform survival analysis to estimate the expected time a customer will remain active before churning. Techniques such as Kaplan-Meier survival curves help us understand churn probabilities over time and inform retention strategies based on customer longevity.
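To make the survival-analysis step concrete, the following is a minimal sketch of the Kaplan-Meier estimator on hypothetical data. The tenures (in months) and churn flags are invented for illustration. Customers who had not churned by the end of observation are treated as censored: they count toward the at-risk population up to their last observed month but never as churn events.

```python
def kaplan_meier(durations, churned):
    """Return [(t, S(t))] at each time t where at least one churn event occurred."""
    event_times = sorted({d for d, e in zip(durations, churned) if e})
    survival, curve = 1.0, []
    for t in event_times:
        at_risk = sum(d >= t for d in durations)  # still active just before t
        events = sum(d == t and e for d, e in zip(durations, churned))
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical tenures in months; False means the customer was still active
# when observation ended (censored).
durations = [3, 5, 5, 8, 12, 12]
churned = [True, True, False, True, False, False]

curve = kaplan_meier(durations, churned)
for month, prob in curve:
    print(f"month {month}: estimated retention probability {prob:.3f}")
```

In this toy example the estimated probability of remaining active falls to 5/6 at month 3, 2/3 at month 5, and 4/9 at month 8. Simply dropping the censored customers would overstate churn, which is the main reason to use a survival estimator instead of a raw churn rate.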

6. Deployment, Monitoring, and Optimization

Once our models are finalized, we move into the
deployment phase, where we integrate predictive
models into fintech platforms for real-time churn
detection and customer engagement.

• Real-Time Integration: Our models are deployed into the fintech platform's infrastructure, where they operate in real time to analyze customer behavior and predict churn risks. This involves building automated pipelines that flag customers at risk of churning, triggering immediate retention interventions.

• Model Retraining and Adaptation: Customer behaviors evolve over time, requiring periodic model retraining to maintain predictive accuracy. We set up automated processes for model updates, ensuring that new data feeds into the model and retrains it at regular intervals.


• A/B Testing of Retention Strategies: We validate our retention interventions through A/B testing. Customers identified as high-risk are divided into control and experimental groups, where different retention strategies (e.g., personalized offers, enhanced support) are tested. We analyze the effectiveness of these strategies by comparing churn rates between the groups.
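One common way to compare churn rates between a control group and an experimental group is a two-proportion z-test. The sketch below is illustrative only: the group sizes and churn counts are hypothetical and are not results from this study, and the paper does not state which statistical test was used.

```python
from math import erf, sqrt


def two_proportion_z_test(churned_a, n_a, churned_b, n_b):
    """Two-sided z-test for a difference in churn rates between groups A and B."""
    p_a, p_b = churned_a / n_a, churned_b / n_b
    pooled = (churned_a + churned_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Standard normal CDF via the error function, then a two-sided p-value.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value


# Hypothetical outcome: 200 of 1000 control customers churned (20%),
# versus 150 of 1000 customers who received a personalized offer (15%).
z, p_value = two_proportion_z_test(200, 1000, 150, 1000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

With these made-up counts the difference would be statistically significant at the 5% level. In a real test, the group sizes should be fixed in advance so that the comparison is not stopped early on a favorable fluctuation.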

7. Development of Retention Strategies

Based on the insights gained from the churn
prediction models and key factor identification, we
design targeted retention strategies to enhance
customer loyalty and reduce churn. These
strategies include:

• Personalized Engagement: Using churn predictions, we personalize outreach efforts, such as sending tailored offers, discounts, or personalized product recommendations to customers at risk of churning.

• Loyalty Programs and Incentives: We design loyalty programs that reward frequent app usage, high transaction volumes, or long-term engagement. Offering tiered rewards based on customer lifetime value can encourage users to remain active on the platform.

• Enhanced Customer Support: Customers identified as high-risk are given priority access to customer support, ensuring their concerns are addressed promptly. Proactive communication strategies, such as follow-up calls or satisfaction surveys, can prevent dissatisfaction from leading to churn.

8. Ethical Considerations

Our research acknowledges the ethical implications of using customer data for predictive modeling.

Privacy and Data Security: We prioritize customer privacy by ensuring all data collection and processing adheres to strict ethical standards and legal regulations, such as GDPR and CCPA. Anonymization techniques are applied to protect customer identities.

Fairness and Bias Mitigation: Machine learning models are prone to biases, especially when data reflects underlying societal inequalities. We actively monitor our models for potential biases against demographic groups and adjust feature selection and modeling techniques to ensure fairness and inclusivity.
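Monitoring for bias can start with simple per-group comparisons of model behavior. The sketch below is one possible check under stated assumptions, not the procedure used in this study: the age bands, labels, and predictions are hypothetical. It compares how often each group is flagged as a churn risk and how often non-churners in each group are flagged incorrectly.

```python
from collections import defaultdict


def group_rates(groups, y_true, y_pred):
    """Per-group flag rate and false positive rate for binary churn predictions."""
    stats = defaultdict(lambda: {"n": 0, "flagged": 0, "negatives": 0, "fp": 0})
    for g, t, p in zip(groups, y_true, y_pred):
        s = stats[g]
        s["n"] += 1
        s["flagged"] += p
        if t == 0:  # customer did not churn
            s["negatives"] += 1
            s["fp"] += p  # flagged despite not churning
    return {
        g: {
            "flag_rate": s["flagged"] / s["n"],
            "false_positive_rate": s["fp"] / s["negatives"] if s["negatives"] else 0.0,
        }
        for g, s in stats.items()
    }


# Hypothetical age bands, true outcomes, and model predictions (1 = churn).
groups = ["18-30", "18-30", "18-30", "18-30", "31+", "31+", "31+", "31+"]
y_true = [1, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

rates = group_rates(groups, y_true, y_pred)
fpr_gap = abs(rates["18-30"]["false_positive_rate"] - rates["31+"]["false_positive_rate"])
print(rates, "FPR gap:", fpr_gap)
```

A gap in flag rates alone may reflect genuine differences in churn behavior between groups, so the false positive rate gap is often the more informative signal: it shows whether one group is wrongly targeted more often than another.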

Our methodology combines a comprehensive
approach to data analysis, predictive modeling,
and the development of customer retention
strategies in the fintech sector. Through advanced
machine learning techniques, we aim to accurately
predict customer churn, identify key factors
driving retention, and implement actionable
strategies that enhance user experience and foster
long-term customer loyalty.

RESULTS

Our results section presents a comprehensive
analysis of the performance of various machine
learning models applied to customer churn
prediction and retention in fintech apps. The
analysis covers model accuracy, feature
importance, and the identification of key factors
influencing churn. We also provide insights into
which algorithm performed best based on a set of
performance metrics, including precision, recall,
F1 score, AUC, and ROC curve.

1. Data Summary and Exploration

Before diving into the performance of the machine
learning models, we begin by summarizing the key
aspects of our dataset. The customer data included
structured information such as:

A. Demographics (age, location, income)

B. Transaction records (deposits, withdrawals, purchases)

C. Engagement metrics (frequency of app usage, time spent on the app)

D. Churn indicators (days since last login, frequency of interactions)

Unstructured data such as customer reviews,
complaints, and surveys were also transformed
into meaningful variables through natural
language processing techniques.

A preliminary exploration of the data revealed
significant churn patterns tied to engagement
metrics, subscription status, and customer
inactivity. Customers who had lower transaction
volumes or reduced login frequency over a three-
month period were more likely to churn. Seasonal
trends in the data indicated increased churn
during low-transaction months.

2. Model Performance Evaluation

We trained and evaluated multiple machine learning models to determine which one was most effective in predicting customer churn. The models included Logistic Regression, Decision Trees, Random Forest, Gradient Boosting Machines (GBM), Support Vector Machines (SVM), and Neural Networks. Each model was evaluated based on a combination of performance metrics, including:

I. Accuracy: The proportion of correct predictions
out of all predictions made.

II. Precision: The proportion of true positive
predictions (correct churn predictions) relative to
all positive predictions.

III. Recall: The proportion of actual positive
instances (churned customers) that were correctly
identified.

IV. F1 Score: A harmonic mean of precision and
recall, particularly useful for imbalanced datasets.

V. AUC-ROC: The Area Under the ROC Curve, which
indicates the model's ability to distinguish
between churned and non-churned customers.

The table below summarizes the performance of the models:

Model                         Accuracy  Precision  Recall  F1 Score  AUC-ROC
Logistic Regression           0.78      0.74       0.69    0.71      0.82
Decision Tree                 0.81      0.76       0.75    0.75      0.83
Random Forest                 0.85      0.81       0.78    0.79      0.88
Gradient Boosting (XGBoost)   0.87      0.83       0.80    0.81      0.90
Support Vector Machine        0.83      0.79       0.77    0.78      0.85
Neural Networks               0.84      0.80       0.79    0.79      0.86
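As a consistency check on the reported figures, the F1 score of each model can be recomputed from its reported precision and recall. The sketch below uses the values given in this section's results (table and chart data labels); the variable and function names are illustrative.

```python
# (precision, recall, reported F1) for each model, as reported in this section.
reported = {
    "Logistic Regression": (0.74, 0.69, 0.71),
    "Decision Tree": (0.76, 0.75, 0.75),
    "Random Forest": (0.81, 0.78, 0.79),
    "Gradient Boosting (XGBoost)": (0.83, 0.80, 0.81),
    "Support Vector Machine": (0.79, 0.77, 0.78),
    "Neural Networks": (0.80, 0.79, 0.79),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


recomputed = {name: f1_score(p, r) for name, (p, r, _) in reported.items()}
max_gap = max(abs(recomputed[name] - f1) for name, (_, _, f1) in reported.items())

for name, value in recomputed.items():
    print(f"{name}: recomputed F1 = {value:.3f}, reported = {reported[name][2]:.2f}")
print(f"largest difference: {max_gap:.4f}")
```

Every recomputed value agrees with the reported F1 score to within rounding, so the precision, recall, and F1 columns are mutually consistent.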


Chart 1: Result Visualization (model evaluation: Accuracy, Precision, Recall, F1 Score, and AUC-ROC for each model)

3. Best Performing Algorithm: Gradient Boosting (XGBoost)

Based on the evaluation, Gradient Boosting
Machines (XGBoost) emerged as the best-
performing algorithm for customer churn
prediction. It outperformed other models in terms
of overall accuracy (87%), precision (83%), recall
(80%), F1 score (81%), and AUC-ROC (0.90). These
metrics highlight the model's strong predictive
capabilities, particularly in identifying customers
at risk of churn.

The success of XGBoost can be attributed to its
ability to capture non-linear relationships in the
data, handle imbalanced datasets, and model
complex interactions between features. The
iterative boosting process strengthens the model's
ability to make more accurate predictions by
focusing on difficult-to-classify instances.
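The iterative boosting idea can be illustrated with a deliberately simplified sketch: plain gradient boosting with squared loss, where each round fits a one-split decision stump to the current residuals. This is not XGBoost itself, which adds regularization and uses second-order gradient information, and the data, function names, and settings below are hypothetical.

```python
from statistics import mean


def fit_stump(xs, residuals):
    """Find the single threshold split that best fits the residuals (least squares)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # keep both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_value, right_value = mean(left), mean(right)
        sse = sum((r - left_value) ** 2 for r in left) + sum(
            (r - right_value) ** 2 for r in right
        )
        if best is None or sse < best[0]:
            best = (sse, threshold, left_value, right_value)
    return best[1:]


def boost(xs, ys, n_rounds=20, learning_rate=0.3):
    """Fit an additive model of stumps; return the model and the loss after each round."""
    base = mean(ys)
    preds = [base] * len(xs)
    stumps, losses = [], []
    for _ in range(n_rounds):
        # For squared loss, the negative gradient is simply the residual,
        # so poorly predicted customers dominate the next stump's fit.
        residuals = [y - p for y, p in zip(ys, preds)]
        threshold, left_value, right_value = fit_stump(xs, residuals)
        stumps.append((threshold, left_value, right_value))
        preds = [
            p + learning_rate * (left_value if x <= threshold else right_value)
            for x, p in zip(xs, preds)
        ]
        losses.append(mean([(y - p) ** 2 for y, p in zip(ys, preds)]))
    model = {"base": base, "stumps": stumps, "learning_rate": learning_rate}
    return model, losses


def predict_score(model, x):
    """Sum the base value and the scaled contribution of every stump."""
    score = model["base"]
    for threshold, left_value, right_value in model["stumps"]:
        score += model["learning_rate"] * (left_value if x <= threshold else right_value)
    return score


# Hypothetical toy data: days since last login and churn label (1 = churned).
days_inactive = [1, 3, 7, 12, 20, 33, 40, 55]
churned = [0, 0, 0, 1, 0, 1, 1, 1]

model, losses = boost(days_inactive, churned)
print("loss after round 1:", round(losses[0], 4))
print("loss after round 20:", round(losses[-1], 4))
```

The training loss never increases from one round to the next, because each stump is the least-squares fit to whatever error remains. The learning rate controls how much of each correction is applied.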

4. Feature Importance Analysis

A key advantage of using tree-based models like
Random Forest and Gradient Boosting is their
ability to rank the importance of features
contributing to customer churn. In the XGBoost
model, the following features were identified as
the most influential in predicting churn:

1. Customer Inactivity: The number of days since
the last transaction or login was the strongest
predictor of churn. Customers who had not
interacted with the fintech platform for over 30
days were more likely to churn.

2. Engagement Metrics: Features like daily app
usage, frequency of financial transactions, and time
spent on the app had a significant impact on churn
prediction. Lower engagement levels were highly
correlated with churn.

3. Customer Lifetime Value (CLV): Customers with
a lower predicted lifetime value were more likely
to churn, indicating that retention efforts should
focus on high-CLV customers.

4. Subscription Type: Subscription status or tier
(e.g., free vs. premium) also played a critical role.
Premium customers, while less likely to churn,
displayed early signs of churn through reduced
usage before discontinuing their subscriptions.

5. Demographics: Age and income levels were also
found to influence churn, with younger customers
and those with lower incomes being more likely to
leave the platform.

These insights are critical for developing
personalized retention strategies. For instance,
targeting high-value customers with low
engagement through tailored offers or
personalized product recommendations could
significantly reduce churn rates.
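A minimal sketch of this targeting rule is given below. The 30-day inactivity window comes from the feature analysis above; the `Customer` fields, the CLV cutoff, and the weekly-session cutoff are hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Customer:
    customer_id: str
    last_activity: date      # last transaction or login
    weekly_sessions: float   # average app sessions per week
    predicted_clv: float     # predicted customer lifetime value


def days_inactive(customer, today):
    """Number of days since the customer's last transaction or login."""
    return (today - customer.last_activity).days


def retention_targets(customers, today, inactivity_days=30,
                      min_clv=1000.0, min_sessions=2.0):
    """Return IDs of high-CLV customers who are inactive or barely engaged."""
    targets = []
    for customer in customers:
        disengaged = (days_inactive(customer, today) > inactivity_days
                      or customer.weekly_sessions < min_sessions)
        if customer.predicted_clv >= min_clv and disengaged:
            targets.append(customer.customer_id)
    return targets


today = date(2024, 10, 1)
customers = [
    Customer("A", date(2024, 9, 28), 6.0, 2500.0),  # active, high value
    Customer("B", date(2024, 8, 15), 0.5, 1800.0),  # inactive, high value
    Customer("C", date(2024, 8, 1), 0.0, 200.0),    # inactive, low value
]
targets = retention_targets(customers, today)
print(targets)
```

In practice the cutoffs would be tuned against campaign cost and observed churn rather than fixed in advance.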

5. Comparison of Algorithms

While XGBoost performed best overall, other
models demonstrated specific strengths that may
be useful depending on the application:

I. Logistic Regression: Although it had lower
accuracy and recall, logistic regression's
interpretability makes it useful for identifying
straightforward relationships between features
and churn.

II. Decision Trees: These models provided an
intuitive way to visualize customer behavior
patterns and feature interactions, though they
tended to overfit the training data when not
controlled.

III. Random Forest: Slightly less accurate than
XGBoost, Random Forest still performed well
(85% accuracy) and offered valuable insights into
feature importance, making it a solid alternative
for use in less complex deployments.

IV. Support Vector Machines: SVM handled
imbalanced data reasonably well but struggled
with large feature sets, which reduced its overall
performance in this context.

V. Neural Networks: While neural networks
captured complex relationships, their
interpretability was limited. They were also
computationally expensive compared to tree-based
models like XGBoost and Random Forest.

6. AUC-ROC and Churn Probability Calibration

The ROC curves and AUC values provided
additional insights into the performance of our
models. XGBoost’s AUC-ROC score of 0.90
demonstrated its superior ability to distinguish
between churned and non-churned customers.
This was particularly important in fintech
applications, where false positives (misclassifying
a retained customer as at risk of churn) can lead to
unnecessary retention efforts and costs.
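AUC-ROC can be read as the probability that a randomly chosen churned customer receives a higher score than a randomly chosen retained one. The sketch below computes it by direct pairwise comparison and counts false positives at a given decision threshold; the scores and thresholds are hypothetical.

```python
def auc_roc(y_true, scores):
    """Share of (churned, retained) pairs ranked correctly; ties count as half."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one churned and one retained customer")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def false_positives(y_true, scores, threshold):
    """Retained customers whose score meets the threshold and who would be contacted."""
    return sum(1 for t, s in zip(y_true, scores) if t == 0 and s >= threshold)


# Hypothetical churn scores, for illustration only (1 = churned).
labels = [1, 1, 1, 0, 0, 0, 0]
churn_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

auc = auc_roc(labels, churn_scores)
print(round(auc, 3))
print(false_positives(labels, churn_scores, 0.5))
print(false_positives(labels, churn_scores, 0.65))
```

Raising the threshold removes the false positive here, but it would also miss the churner scored 0.4 and 0.7 would remain the lowest flagged score, so the threshold trades retention cost against missed churners.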

7. Key Insights for Retention Strategies

The results from our churn prediction models
directly inform our customer retention strategies.
By identifying key churn drivers, we can now
segment customers based on their churn risk and
engagement patterns. For example:

• High-Risk Customers: Customers identified as
having high churn probabilities can be targeted
with personalized retention campaigns, such as
special offers or priority customer support.

• Medium-Risk Customers: For customers showing
early signs of churn (e.g., reduced app usage), we
can deploy re-engagement strategies, such as
personalized notifications or loyalty rewards.

• Low-Risk Customers: Retained customers with
high engagement levels can be rewarded through
loyalty programs to encourage continued usage.
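The three tiers above can be sketched as a simple mapping from predicted churn probability to an intervention. The probability cutoffs (0.7 and 0.4) and the function names are assumptions made for illustration; the study does not report specific thresholds.

```python
# Interventions paraphrased from the three risk tiers described above.
ACTIONS = {
    "high": "personalized retention campaign (special offer or priority support)",
    "medium": "re-engagement (personalized notification or loyalty reward)",
    "low": "loyalty program reward",
}


def risk_segment(churn_probability, high=0.7, medium=0.4):
    """Map a predicted churn probability to a risk tier (cutoffs are assumed)."""
    if not 0.0 <= churn_probability <= 1.0:
        raise ValueError("churn probability must lie in [0, 1]")
    if churn_probability >= high:
        return "high"
    if churn_probability >= medium:
        return "medium"
    return "low"


def plan_interventions(probabilities):
    """Return {customer_id: (tier, action)} for a dict of churn probabilities."""
    plan = {}
    for customer_id, probability in probabilities.items():
        tier = risk_segment(probability)
        plan[customer_id] = (tier, ACTIONS[tier])
    return plan


# Hypothetical model outputs.
plan = plan_interventions({"A": 0.85, "B": 0.50, "C": 0.10})
for customer_id, (tier, action) in plan.items():
    print(customer_id, tier, action)
```

Keeping the cutoffs as parameters lets them be adjusted as campaign budgets or model calibration change.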

CONCLUSION AND DISCUSSION

In this study, we explored the development and
implementation of machine learning-driven
customer retention strategies within the fintech
sector, specifically focusing on churn prediction.
By employing various algorithms, including
Gradient Boosting Machines (XGBoost), we
identified critical factors influencing customer
retention and developed targeted strategies to
mitigate churn risks. Our research underscores the
potential of advanced analytics in transforming
customer engagement practices, leading to
improved customer loyalty and enhanced business
outcomes in fintech.

Our findings revealed that customer inactivity,
engagement metrics, customer lifetime value
(CLV), subscription type, and demographic factors
are paramount in predicting churn. Specifically,
customers with lower engagement levels or longer
periods of inactivity are significantly more likely to
discontinue their services. This insight enables
fintech organizations to prioritize their retention
efforts effectively, focusing on high-value
customers showing signs of disengagement.

The performance evaluation of different machine
learning models demonstrated that XGBoost
outperformed its counterparts on every reported
metric: accuracy, precision, recall, F1 score, and
AUC-ROC. This highlights not only the importance
of selecting robust algorithms but also the
necessity of feature importance analysis to
understand customer behavior in detail. Such
insights can drive personalized retention
strategies, offering tailored solutions that cater to
individual customer needs and preferences.

The practical implications of our findings are
manifold. Fintech companies can leverage the
predictive capabilities of machine learning models
to create real-time customer engagement
strategies. By implementing automated systems
that trigger interventions based on churn
predictions, organizations can enhance customer
experiences and prevent potential losses. For
example, targeted retention campaigns for
high-risk customers can help maintain their
engagement, while incentives for medium-risk
customers can serve as re-engagement tools.

Additionally, our results suggest the need for an
adaptive approach to customer retention, where
models are routinely updated based on new data
and changing customer behaviors. By integrating
feedback mechanisms and employing adaptive
learning models, fintech companies can remain
responsive to evolving market conditions and
customer preferences, further refining their
retention strategies.

Limitations and Future Research

Despite the strengths of our study, it is essential to
acknowledge its limitations. The reliance on
historical data can constrain the model's ability to
adapt to rapidly changing customer behaviors and
market dynamics. Moreover, while our analysis
highlighted key factors affecting churn, it may not
encompass all possible influences, such as
macroeconomic factors or changes in regulatory
landscapes.

Future research should explore the integration of
real-time data analytics and more nuanced
customer insights, such as sentiment analysis
derived from customer interactions. By
incorporating diverse data sources and refining
modeling techniques, subsequent studies can
enhance the accuracy of churn predictions and
develop more comprehensive retention strategies.

In conclusion, our research demonstrates that
machine learning offers powerful tools for
predicting customer churn and developing
actionable retention strategies in the fintech
industry. By understanding the critical factors
influencing customer behavior, fintech
organizations can adopt a proactive stance toward
customer engagement, ultimately fostering loyalty
and driving profitability. As the fintech landscape
continues to evolve, the adoption of advanced
analytics will play a pivotal role in shaping the
future of customer relationship management,
ensuring that businesses remain competitive and
responsive to their customers' needs.

Acknowledgement: All the authors contributed equally.

References

Modak, C., Ghosh, S. K., Sarkar, M. A. I., Sharif, M. K., Arif, M., Bhuiyan, M., ... & Devi, S. (2024). Machine Learning Model in Digital Marketing Strategies for Customer Behavior: Harnessing CNNs for Enhanced Customer Satisfaction and Strategic Decision-Making. Journal of Economics, Finance and Accounting Studies, 6(3), 178-186.

Shahid, R., Mozumder, M. A. S., Sweet, M. M. R., Hasan, M., Alam, M., Rahman, M. A., ... & Islam, M. R. (2024). Predicting Customer Loyalty in the Airline Industry: A Machine Learning Approach Integrating Sentiment Analysis and User Experience. International Journal on Computational Engineering, 1(2), 50-54.

Chowdhury, M. S., Shak, M. S., Devi, S., Miah, M. R., Al Mamun, A., Ahmed, E., ... & Mozumder, M. S. A. (2024). Optimizing E-Commerce Pricing Strategies: A Comparative Analysis of Machine Learning Models for Predicting Customer Satisfaction. The American Journal of Engineering and Technology, 6(09), 6-17.

Md Abu Sayed, Badruddowza, Md Shohail Uddin Sarker, Abdullah Al Mamun, Norun Nabi, Fuad Mahmud, Md Khorshed Alam, Md Tarek Hasan, Md Rashed Buiya, & Mashaeikh Zaman Md. Eftakhar Choudhury. (2024). COMPARATIVE ANALYSIS OF MACHINE LEARNING ALGORITHMS FOR PREDICTING CYBERSECURITY ATTACK SUCCESS: A PERFORMANCE EVALUATION. The American Journal of Engineering and Technology, 6(09), 81–91. https://doi.org/10.37547/tajet/Volume06Issue09-10

Md Al-Imran, Salma Akter, Md Abu Sufian Mozumder, Rowsan Jahan Bhuiyan, Tauhedur Rahman, Md Jamil Ahmmed, Md Nazmul Hossain Mir, Md Amit Hasan, Ashim Chandra Das, & Md. Emran Hossen. (2024). EVALUATING MACHINE LEARNING ALGORITHMS FOR BREAST CANCER DETECTION: A STUDY ON ACCURACY AND PREDICTIVE PERFORMANCE. The American Journal of Engineering and Technology, 6(09), 22–33. https://doi.org/10.37547/tajet/Volume06Issue09-04

Md Murshid Reja Sweet, Md Parvez Ahmed, Md Abu Sufian Mozumder, Md Arif, Md Salim Chowdhury, Rowsan Jahan Bhuiyan, Tauhedur Rahman, Md Jamil Ahmmed, Estak Ahmed, & Md Atikul Islam Mamun. (2024). COMPARATIVE ANALYSIS OF MACHINE LEARNING TECHNIQUES FOR ACCURATE LUNG CANCER PREDICTION. The American Journal of Engineering and Technology, 6(09), 92–103. https://doi.org/10.37547/tajet/Volume06Issue09-11

Bahl, S., Kumar, P., & Agarwal, A. (2021). Sentiment analysis in banking services: A review of techniques and challenges. International Journal of Information Management, 57, 102317.

Ashim Chandra Das, Md Shahin Alam Mozumder, Md Amit Hasan, Maniruzzaman Bhuiyan, Md Rasibul Islam, Md Nur Hossain, Salma Akter, & Md Imdadul Alam. (2024). MACHINE LEARNING APPROACHES FOR DEMAND FORECASTING: THE IMPACT OF CUSTOMER SATISFACTION ON PREDICTION ACCURACY. The American Journal of Engineering and Technology, 6(10), 42–53. https://doi.org/10.37547/tajet/Volume06Issue10-06

Rowsan Jahan Bhuiyan, Salma Akter, Aftab Uddin, Md Shujan Shak, Md Rasibul Islam, S M Shadul Islam Rishad, Farzana Sultana, & Md. Hasan-Or-Rashid. (2024). SENTIMENT ANALYSIS OF CUSTOMER FEEDBACK IN THE BANKING SECTOR: A COMPARATIVE STUDY OF MACHINE LEARNING MODELS. The American Journal of Engineering and Technology, 6(10), 54–66. https://doi.org/10.37547/tajet/Volume06Issue10-07

Awoyemi, J. O., Adetunmbi, A. O., & Oluwadare, S. A. (2017). Credit card fraud detection using machine learning techniques: A comparative analysis. Journal of Applied Security Research, 12(4), 1–14. https://doi.org/10.1080/19361610.2017.1315696

Bhowmik, D. (2019). Detecting financial fraud using machine learning techniques. International Journal of Data Science, 6(2), 102-121. https://doi.org/10.1080/25775327.2019.1123126

Dal Pozzolo, A., Boracchi, G., Caelen, O., Alippi, C., & Bontempi, G. (2015). Credit card fraud detection: A realistic modeling and a novel

Bashir, A., Yousaf, M., & Awan, M. U. (2021). Customer churn prediction in mobile banking: A machine learning approach. International Journal of Information Management, 58, 102303. https://doi.org/10.1016/j.ijinfomgt.2021.102303

Chen, Y., Chen, H., & Liu, Y. (2021). The impact of customer satisfaction on customer retention in the fintech industry: A case study. Journal of Financial Services Marketing, 26(2), 85-97. https://doi.org/10.1057/s41264-021-00110-7

Dewan, S., Wu, D. J., & Trivedi, R. (2020). Customer engagement in fintech: A study of factors affecting retention. Journal of Banking and Finance, 118, 105888. https://doi.org/10.1016/j.jbankfin.2020.105888

Li, J., Liu, X., & Zhang, Y. (2021). Predicting customer churn in mobile banking: A case study of a Chinese fintech firm. Journal of Retailing and Consumer Services, 59, 102393. https://doi.org/10.1016/j.jretconser.2021.102393

Ransom, C. (2021). The cost of customer acquisition versus retention in fintech. Harvard Business Review. Retrieved from https://hbr.org/2021/03/the-cost-of-customer-acquisition-versus-retention-in-fintech

Wang, Q., Zhang, H., & Liu, S. (2022). A gradient boosting approach for customer churn prediction in digital financial services. Expert Systems with Applications, 208, 118267. https://doi.org/10.1016/j.eswa.2022.118267