
INTERNATIONAL JOURNAL OF NETWORKS AND SECURITY (ISSN: 2693-387X)

Volume 05, Issue 02, 2025, pages 30-41

Published Date: 18-07-2025

DOI: https://doi.org/10.55640/ijns-05-02-02


Security and Privacy Testing Automation for LLM-Enhanced Applications in Mobile Devices

Reena Chandra

Tools and Automation Engineer, Amazon, CA, USA

ABSTRACT

The integration of large language models (LLMs) into mobile applications introduces new vectors for security and
privacy vulnerabilities. This study proposes an automated framework for systematically testing LLM-enabled mobile
apps, focusing on identifying potential threats such as prompt injection, data leakage, unauthorized access, and
adversarial manipulation. The approach combines dynamic analysis, static code inspection, and machine learning-
based anomaly detection to evaluate app behaviors in real-time. Our method ensures scalability and efficiency
across diverse mobile platforms and LLM configurations. Results demonstrate significant improvements in
detection rates and response times compared to conventional manual testing. This work aims to bridge the gap
between AI innovation and secure mobile deployment, promoting trust in AI-integrated ecosystems.

KEYWORDS

LLM, security testing, privacy, automation, mobile applications, prompt injection, anomaly detection.

1. INTRODUCTION

Large language models have improved user interactions in mobile applications by providing adaptive assistance, fast language processing, and richer personalization. Yet, as LLM-enhanced apps spread rapidly, they raise major concerns about safeguarding data and protecting user privacy on phones [1]. Because LLM-based applications process a wide range of confidential user data, such as private messages, locations, and identity information, they require strong and flexible security controls.

Despite advances in mobile app development, automated privacy and security testing systems have not kept pace with the problems LLMs introduce. Existing testing approaches provide insufficient coverage of vulnerabilities specific to LLM interaction, for example, malicious prompts, information leakage from the model, and unauthorized inference of a user's intentions [2]. This calls for fast, flexible automated methods that can handle the dynamic behaviour of LLMs deployed on mobile phones. This research examines the growing role of security and privacy testing automation in the development of LLM-assisted mobile applications. It sets out to identify current difficulties, introduce candidate strategies for automated threat discovery, and underline the need to align testing approaches with both evolving AI capabilities and relevant regulations. Addressing these vulnerabilities allows developers and stakeholders to better protect users and deploy intelligent mobile technologies responsibly.

As more mobile apps adopt LLMs to offer voice services, health advice, and financial assistance, the attack surface has grown accordingly. Because these applications rely on cloud processing and third-party APIs, they are more exposed to man-in-the-middle attacks in which private data can be stolen or models can be extracted [3]. Moreover, because LLM behaviour is opaque, it can be difficult to trace how a security or privacy issue originated. To move beyond signature-based approaches, testing tools should integrate model-based strategies, anomaly discovery, and privacy risk measurements, drawing on collaboration between software testing, cybersecurity, and machine learning.

2. Literature Review

Sharma and Kaul's 2018 work, "Static and Dynamic Analysis in Mobile Security" [7], laid the groundwork for this transition in mobile security. Their review separated existing mobile security methods into those that scan application code without executing it and those that observe application activity while the app is running. They found that the traditional static methods, which run fast and require little power, can still miss concealed threats that target the running application, whereas dynamic analysis takes more effort to run at scale. They pointed out that methods incorporating AI are key to faster alerting and to adapting against attacks from different malware strains.

Chen et al. designed an AI-based static code analysis system using Natural Language Processing (NLP) and Deep Learning (DL) to search for security issues in source code [3]. Because their system was trained on large volumes of real mobile application data, it can find security issues that human reviewers commonly miss. The AI system improved both the time and the accuracy of threat identification, adding to the evidence that AI increases speed and precision. It removes much of the manual work from security analysts and helps avoid missed problems in extensive codebases.

Singh and Jain performed real-time anomaly detection on mobile applications using ML [4]. To catch dangerous actions, their study examined how each app interacted with the system, how it accessed the internet, and which permissions it requested. The ML solution was trained on historical attack data and then flagged activities such as intense battery use or access to data an app should not touch, which might point to a malware infection. Their approach reduced misclassifications by 28% over traditional systems; this real-time analysis capability is essential for eliminating threats as they occur. Li and Zhao focused on how insiders can misuse their authorized access to threaten cyber safety, using AI to build behavioural profiles of users in their mobile security research [5]. These profiles were derived from biometrics, app usage, and device interaction data collected over years. Using clustering and anomaly detection, the system caught most unusual behaviour; if an employee tried to access sensitive information at unconventional times or from unusual devices, the system alerted managers. As a result, 87% of potential insider threat cases were detected, giving businesses an active way to keep sensitive data safe on mobile devices.
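As an illustration of the kind of behaviour-based anomaly detection these studies describe (not the cited authors' actual code), the sketch below flags unusual app sessions from a few simple features; the feature names, values, and contamination rate are assumptions.

```python
# Illustrative sketch: flag anomalous app sessions from behaviour features.
import numpy as np
from sklearn.ensemble import IsolationForest

# Hypothetical behaviour log, one row per app session:
# [network_calls, battery_drain_pct, sensitive_permission_count]
sessions = np.array([
    [12, 3.1, 2],
    [15, 2.8, 2],
    [11, 3.4, 1],
    [240, 18.0, 9],   # unusually heavy session
])

model = IsolationForest(contamination=0.1, random_state=0)
labels = model.fit_predict(sessions)      # -1 marks suspected anomalies
anomalies = sessions[labels == -1]
print(f"flagged {len(anomalies)} suspicious session(s)")
```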

Martinez et al. examined AI applications in penetration testing, the simulated hacking used to secure a system by finding its flaws [6]. They focused on creating AI-driven test agents that imitate human attackers but act more quickly and reliably, applying reinforcement learning so the agents could adapt to new app environments and identify issues more rapidly than old-fashioned scripts. Across an investigation of 50 Android applications, their AI testing suite detected 58% more serious risks and ran tests 47% faster than manual testing. This advancement means AI can support security analysis by improving its testing methods over time.

Another study looked into the weak spots of Android apps that use machine learning on mobile devices [7]. Through analysis of 320 applications built with Google's ML Kit, the authors found that 81.56% of them could be attacked through data pre-processing methods. These early-stage attacks edit the data provided to the model in ways that degrade its accuracy. The study made security managers and machine learning developers realize that guarding the models alone is not enough; the data and inputs need to be just as secure. The researchers pointed out that strong AI-aware development methods are essential, since common testing misses these silent attacks.

The Digital.AI Application Security Threat Report gave an overarching view of the mobile application threat landscape. The report found that just under 6 out of every 10 apps studied had been attacked, with most attacks targeting gaming, financial services, and healthcare apps. According to the data, Android applications face greater exposure, are more often run in unsafe software environments, and are more frequently modified than their iOS counterparts. This finding led security experts to recognize that distinct and more active defences are necessary, especially in Android's open ecosystem.

Researchers then turned their attention to AI-based security tools that monitor in real time [8]. In a study published in JAIR, AI scientists created a framework for sharing threat information in real time among mobile users. Using behavioural data and ML-based anomaly detection, the researchers detected threats 25% better than the usual rule-based systems and took 30-40% less time to resolve security incidents. At the same time, a related study in the Cybersecurity and Network Defence Research journal highlighted AI's importance for device optimization and security provisioning. Using predictive analytics and ML models, the researchers built algorithms to predict failures and support better security practices. Deploying these tools on test devices decreased system crashes by 15% and lowered unwanted system access attempts by 36%. The results indicate that AI enhances both performance and security in mobile systems.

According to the 2024 Digital.AI Threat Report, mobile application attacks are on the rise, with an 8% increase over the previous year. More attackers adopted automated hacking tools, thanks in part to AI/ML algorithms embedded in those tools. It is now clear that AI is being used by attackers as well as defenders in cybersecurity, so systems must defend quickly and adapt to new situations.

Another line of work focused on embedding AI directly in the structure of future communication systems [5]. The group proposed what may be the most innovative study: a Zero-Touch Network Security approach for 6G mobile networks. With drift-adaptive online learning and AutoML, the framework updated its security measures from new data automatically, without manual effort. The system supported physical-layer authentication and detection of attacks that cross network layers, with a focus on safeguarding mobile networks that change configuration. Once deployed, the system rarely had to be modified, making it well suited to the large mobile infrastructure challenges of smart cities and Industry 5.0 environments.

A further study looked into using machine learning to find eavesdropping threats in Beyond 5G Industrial IoT (B5G IIoT) networks. Working with DCNN and Random Forest classifiers on large communication signal datasets, the authors reached accuracies of 95% or more. The work is significant because most industrial systems operate in real time and carry many sensitive operational details, so missed signs of eavesdropping could cause large financial and operational losses. The study established that AI could find changes in network activity that would go unseen by both humans and traditional security systems.

Global experts from many backgrounds gathered at the RSA Conference to discuss new developments in AI and cybersecurity. A main emphasis was placed on agentic AI, which makes decisions on its own, learns from its surroundings, and runs security controls. Panel talks and product demos suggested that around 70% of enterprise security firms are adopting agentic AI in their systems. In particular, security teams found that using these tools in SOCs reduced their workload by about 40%, as threats could be triaged, solutions suggested, and actions taken automatically. A clear shift is occurring in which human professionals supervise AI rather than take direct action in security plans.



3. Literature Gaps

3.1. Few Practices Targeting Cross-Platform Security Mechanisms

Papers mainly analyze Android and miss comparisons with iOS and HarmonyOS. Since research is usually focused on single ecosystems, unified security models that can run on several platforms often go unexamined.

3.2. Mobile Security AI Has Not Been Properly Studied for Its Ethical and Privacy Problems

While technical performance is primarily addressed in the research, ethical issues such as user consent, data use, model transparency, and potential abuse of user data are discussed far less. Now that AI is involved in user behaviour profiling, more literature on responsible AI in mobile apps is needed [8].

3.3. Inadequate Studies Showing Practical Usage and Its Limits

A large number of the reviewed studies use simulated or controlled settings [5]. So far, there have been few studies on how well these AI/ML models perform, scale, and adapt when deployed on real mobile networks or large systems. Because of this, it is not clear how these technologies behave when users act differently, devices vary, or networks experience trouble.

3.4. Little Integration of Explainability and Interpretability in Mobile Security Models

Although AI-based models in mobile security (e.g., DCNN and Random Forest) have shown promising results, little work addresses how these models make decisions (i.e., interpretability) and why [5]. The absence of explainability in an AI pipeline leads to a lack of trust among stakeholders, particularly in critical domains like finance and healthcare, where explainable models are also required to adhere to regulations and justify decisions [7].

3.5. Limited Research on Long-Term Learning and Model Drift in Mobile AI Systems

Although a few AI models in mobile security, for instance DCNN and Random Forest, currently produce promising results, very little academic research is dedicated to uncovering how and why these models operate. Explanations provided by explainable AI (XAI) are among the critical factors in developing and sustaining trust. Sectors such as finance and healthcare, where security, compliance, and organizational decision-making are at stake, depend heavily on interpretability.

3.6. Insufficient Research on Long-Term Learning and Model Drift in Mobile AI Systems

Drift-adaptive learning schemes are only touched upon briefly, with little analysis of how AI models hold up over time, especially in mobile settings where user environments, app ecosystems, and threat factors change rapidly [5]. This calls for longitudinal studies that examine model sustainability, along with continual learning and automatic reconfiguration [6].



Figure 1: Evolving Dynamics of Drift-Adaptive AI in Mobile Security Contexts

Fragmented View of Human-AI Collaboration in Security Operations

According to RSA Conference (RSAC) 2025 discussions, a rise of agentic AI supporting Security Operations Centres (SOCs) is expected. However, academic research rarely examines what the interaction between human analysts and AI looks like in real-life settings. There is little knowledge of how cognitive load, trust, and decision-making strategies are shared between AI tools and the cybersecurity professionals who investigate mobile threats.

3.7. Lack of Standardised Datasets for Benchmarking

Many of the studies rely on unstandardized or proprietary datasets, which makes results difficult to compare across studies. The next focus should be on establishing benchmarks and reproducibility standards. Mobile security datasets need to be standardized and should contain attack scenarios, network data, and other behavioural data [11].

3.8.

Neglect of Low-Powered and Legacy Mobile Devices

The majority of the literature now regards smartphones with strong computing capabilities; however, the authors
fail to consider that users from all geographical locations still have low to mid-tier smartphones towards intermediate
devices that are not as computation-intensive. These are the devices that are predominantly used by the global
population. Moreover, the existing research is still limited to better options for lightweight and energy-efficient AI
security models.

3.9. Sparse Research on AI-Driven Prevention (vs. Detection)

AI applications largely focus on detecting threats and responding to incidents, overlooking proactive prevention, for example, identifying where vulnerabilities already exist and pre-empting exploitation. Further studies should therefore explore how AI can shift security in the communication network from reaction to prevention.

4. Objectives of the Study


To design and implement an AI-based cross-platform mobile security framework using TensorFlow Lite and CoreML, ensuring compatibility with Android, iOS, and HarmonyOS. This was achieved by implementing the framework in Flutter for UI consistency and combining TensorFlow Lite and CoreML for platform-specific AI inference. Platform APIs were abstracted through native bridges to maintain compatibility across OSes, ensuring consistent deployment and minimal code redundancy.
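A minimal sketch of the on-device inference step this objective relies on, using TensorFlow Lite's Python interpreter as a stand-in for the native Android/iOS bridges; the model file name and feature vector are hypothetical.

```python
# Sketch: load a converted TFLite model and score one app session.
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="security_model.tflite")  # hypothetical file
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Hypothetical feature vector describing one app session
features = np.zeros(input_details[0]["shape"], dtype=np.float32)
interpreter.set_tensor(input_details[0]["index"], features)
interpreter.invoke()
risk_score = interpreter.get_tensor(output_details[0]["index"])
print("predicted risk score:", risk_score)
```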

To evaluate the real-world scalability and performance of AI-based mobile security systems across diverse user environments. Federated Learning techniques were used to keep user data on-device throughout training, while Differential Privacy methods limited what could be inferred about any individual user. The design complied with the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA) by incorporating encryption-at-rest and consent-driven data access policies within the app infrastructure.
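The following sketch illustrates the privacy idea described above in its simplest form: a locally computed model update is clipped and noised before leaving the device, so raw user data is never uploaded. The clip norm and noise scale are illustrative placeholders, not the study's actual parameters.

```python
# Sketch: clip a local model update and add Gaussian noise (basic DP-style step).
import numpy as np

def privatize_update(update, clip_norm=1.0, noise_std=0.5, rng=np.random.default_rng(0)):
    """Return a clipped, noised copy of a local gradient/update vector."""
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / (norm + 1e-12))
    return clipped + rng.normal(scale=noise_std, size=update.shape)

local_update = np.array([0.8, -1.3, 0.2])   # hypothetical on-device gradient
print(privatize_update(local_update))       # only this noisy vector leaves the device
```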

To incorporate Explainable AI (XAI) techniques to enhance transparency and trust in mobile threat detection. The system was validated across a range of hardware profiles (low-end to high-end devices) using automated test farms. Performance measures such as CPU usage, latency, false positive rate, and energy utilization were recorded to evaluate scalability under varied network and device conditions.

To design adaptive AI models capable of continuous learning and detecting model drift in dynamic mobile environments. Continual learning was implemented using online training buffers and replay memory, while model drift was detected using statistical divergence measures such as KL divergence. This allowed the models to remain accurate in evolving threat landscapes.
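A small sketch of the drift check mentioned above: histograms of a monitored feature from a reference window and a recent window are compared with KL divergence, and a large value flags suspected drift. The bin count and threshold are assumptions, not tuned values.

```python
# Sketch: KL-divergence drift check between a reference window and a recent window.
import numpy as np
from scipy.stats import entropy

def kl_drift(reference, recent, bins=20, threshold=0.5):
    lo = min(reference.min(), recent.min())
    hi = max(reference.max(), recent.max())
    p, _ = np.histogram(reference, bins=bins, range=(lo, hi), density=True)
    q, _ = np.histogram(recent, bins=bins, range=(lo, hi), density=True)
    p, q = p + 1e-9, q + 1e-9            # avoid empty bins
    return entropy(p, q) > threshold     # True -> drift suspected, trigger retraining

ref = np.random.default_rng(0).normal(0, 1, 1000)
new = np.random.default_rng(1).normal(1.5, 1, 1000)   # shifted behaviour
print("drift detected:", kl_drift(ref, new))
```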

To analyze human-AI interaction in mobile security operations and propose effective collaboration workflows. A user study was conducted with selected participants to evaluate how they interpret and respond to AI-generated alerts. Based on the feedback, role-based interaction flows and in-app guidance were designed to support efficient user-AI collaboration in threat resolution.

To create and publish standardized, open-access datasets for benchmarking mobile security AI models. Datasets
were collected from dynamic app scanning, permission logs, and threat emulation environments. All data was
labelled, cleaned, and formatted in JSON and CSV for accessibility. Datasets were hosted on GitHub and Kaggle
under open licenses.

To develop lightweight, energy-efficient AI models tailored for low-powered and legacy mobile devices. The
models were optimized using pruning, quantization, and knowledge distillation. TensorFlow Lite and TinyML
frameworks reduced model sizes and memory footprints, enabling inference on devices with less than 2 GB
RAM and limited CPU.
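As a hedged example of one of the compression steps named above, the snippet below applies TensorFlow Lite post-training quantization to a tiny stand-in Keras model; the architecture and output file name are illustrative only.

```python
# Sketch: post-training quantization with TensorFlow Lite for low-end devices.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(16,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]   # enables weight quantization
tflite_bytes = converter.convert()

with open("lite_security_model.tflite", "wb") as f:    # hypothetical file name
    f.write(tflite_bytes)
print(f"quantized model size: {len(tflite_bytes) / 1024:.1f} KB")
```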

To transition AI applications from reactive threat detection to proactive threat prevention on mobile platforms. Recurrent neural networks (RNNs) and Long Short-Term Memory (LSTM) models were trained on behavioural telemetry to predict and pre-empt threats. Proactive defense mechanisms included risk scoring, app sandboxing, and automated warning triggers before execution of suspected malicious actions.
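A brief sketch of the proactive scoring idea above: an LSTM consumes short sequences of behavioural telemetry and outputs a risk probability that could gate sandboxing or a warning. The telemetry shapes and labels here are synthetic assumptions, not the study's data.

```python
# Sketch: LSTM risk scoring over behavioural telemetry sequences.
import numpy as np
import tensorflow as tf

seq_len, n_features = 30, 8                      # 30 time steps, 8 telemetry signals
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(seq_len, n_features)),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # risk score in [0, 1]
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# Synthetic labelled telemetry: 1 = sequence preceded a malicious action
X = np.random.rand(256, seq_len, n_features).astype("float32")
y = np.random.randint(0, 2, size=(256, 1))
model.fit(X, y, epochs=2, batch_size=32, verbose=0)

risk = float(model.predict(X[:1], verbose=0))
print("pre-emptive risk score:", round(risk, 3))
```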

5. Comparison of Popular LLMs




Table 1. Comparison of Popular LLMs

Feature / Model | OpenAI GPT-4.5 / GPT-4-turbo | Claude 3 (Opus, Sonnet, Haiku) | Google Gemini 1.5 (Pro / Flash) | Mistral (Mixtral) | Meta LLaMA 3
Developer | OpenAI | Anthropic | Google DeepMind | Mistral AI | Meta (Facebook)
Release Year | GPT-4.5: 2024, GPT-4: 2023 | 2024 | 2024 | 2023-2024 | 2024
Model Size | Undisclosed (~1.8T est.) | Opus ~300B, Sonnet ~100B | 1.5 Pro: Large, Flash: Small | 12.9B (MoE) | 8B & 70B
Architecture | Transformer, RLHF | Transformer, RLHF, safety-aligned | Transformer + MoE | Sparse Mixture of Experts | Dense Transformer
Context Window | 128k tokens | Up to 200k | Up to 1M (Pro), 128k (Flash) | 32k tokens | 8k-128k tokens
Multimodal Capabilities | Yes (text, image) | Yes (text, image) | Yes (text, image, video, audio) | Limited | No (text only)
Fine-Tuning / Customisation | Available via ChatGPT / API | Limited customisation via the Claude API | Enterprise tuning available | Open source (fine-tunable) | Open source (DIY tuning)
Coding Skills | Excellent | Very Strong | Very Strong | Moderate to Strong | Moderate
Reasoning & Accuracy | High | Very High | High | Moderate to High | Moderate
Cost (API or Access) | $$ (ChatGPT Plus / API) | $$ (Claude Pro / API) | $$ (Gemini Advanced / API) | Free / open-source | Free / open-source
Open Source | No | No | No | Yes | Yes
Best For | Coding, enterprise, multimodal apps | Research, long documents, and safety tasks | Massive-context, multimodal AI | Dev & research (local use) | Academic, experiments

Large language model research has advanced quickly in 2025, thanks to significant contributions from both the open-source and proprietary communities. OpenAI's GPT-4.5 (also called GPT-4-turbo) is one of the top proprietary models and is notable for its outstanding performance in multimodal tasks such as text and image processing, coding, and reasoning. Though its precise model size is still undisclosed, GPT-4.5, made available via the ChatGPT and API platforms, offers a high-capacity 128,000-token context window and supports fine-tuning for enterprise applications. In a similar vein, Anthropic's Claude 3, particularly its top-tier model Opus, is renowned for its strong adherence to ethical AI principles and long-context handling (up to 200,000 tokens).

With up to 1 million tokens, an unprecedented scale that supports massive document comprehension, multimodal input (text, image, audio, and video), and complex knowledge integration, Google's Gemini 1.5, particularly the Pro and Flash variants, pushes the limits of context length [12]. Gemini's strength is its enormous input handling and AI integration across modalities, whereas GPT-4.5 and Claude 3 are superior in coding.

Open-source models, on the other hand, like Meta's LLaMA 3 and Mistral's Mixtral, place more emphasis on customization, accessibility, and transparency. By using a Mixture of Experts (MoE) architecture, Mixtral achieves effective performance even on local hardware by activating only a portion of the model for each query. Due to its ease of deployment and tuning, developers and startups favor it, and it supports up to 32,000 tokens. Meta's LLaMA 3, which comes in 8B and 70B parameter versions, is perfect for privacy-focused applications and scholarly research because it can be used with custom training pipelines and has moderate reasoning capabilities [13]. Although these models usually lack multimodal capabilities, they are freely available and enable deeper experimentation and integration into research workflows. Regarding specific capabilities, GPT-4.5 is thought to be the most reliable for programming and enterprise-scale tasks, with Claude 3 coming in second in terms of safety alignment and reasoning. Gemini is well positioned for upcoming AI-integrated applications like voice-driven interfaces and video comprehension since it excels at processing large inputs and a variety of media formats. For companies with technical know-how, open-source options offer flexibility and transparency; however, they might need more engineering to match proprietary solutions' performance and usability.

All things considered, each model fills a specific need: open-source models for flexible, affordable AI deployments;
Claude 3 for ethical AI with long-context reasoning; Gemini for state-of-the-art multimodality and scale; and GPT-4.5
for high-performance productivity tools.

6. Positive Impacts of LLM on Security and Privacy

When combined with security systems, large language models (LLMs) greatly enhance the capacity to identify and address cyberthreats. LLMs can spot irregularities that might point to malicious activity such as phishing, malware injection, or data exfiltration by examining vast amounts of system logs, network traffic, and user behaviour. Studies have shown that ML-based anomaly detection systems perform better than conventional signature-based techniques, increasing accuracy and decreasing response time [4]. Software source code can be automatically analyzed by LLMs such as Claude 3 and GPT-4.5 to find errors and vulnerabilities. AI-driven static analysis tools are useful for spotting unsafe code patterns early in the development process, which lowers the likelihood of exploitation [3]. This automation lets developers address problems before deployment and also reduces human error.

Figure 2: Proactive Detection of Insider Threats Using On-Device LLMs for Behavioural Analysis

The figure illustrates Proactive Detection of Insider Threats Using On-Device LLMs for Behavioural Analysis,
highlighting how user behaviour analysis and data privacy are preserved through on-device deployment of
open-source LLMs, ensuring effective mitigation of insider threats. In order to identify odd patterns that might
indicate insider threats, advanced LLMs can track and analyze user behaviour over time. AI-powered behavioural
analysis can reveal minute variations in user behaviour that traditional systems might overlook, like odd login times,
file access patterns, or data transfers [5]. Organizations can protect sensitive data from internal breaches with the
aid of this proactive monitoring. Open-source LLMs that support local or on-device deployment, such as Meta's
LLaMA 3 and Mistral's Mixtral, allow AI to function without transferring user data to cloud servers. By preserving
private data (such as financial records, medical information, and private messages) on the user's device, this greatly
improves privacy.

Figure 3: Role of Large Language Models in Enhancing Mobile Cybersecurity and User Awareness

LLMs enhance app-level security by enabling real-time threat analysis in mobile environments. LLM-powered mobile security apps, for instance, can check permissions, track app activity, and identify potentially harmful updates or spyware [15].

In contrast to manual testing methods, the study on AI-driven penetration testing demonstrated that such tools can detect a wider variety of mobile vulnerabilities [6]. Conversational agents are another way LLMs are used to raise cybersecurity awareness. These AI tools can mimic phishing attempts, teach users safe practices, and provide tailored security configuration guidance. Since human error remains one of cybersecurity's weakest points, this empowerment helps lower the risks of social engineering attacks [8]. In addition, LLMs contribute to anomaly analysis by learning normal user and system behaviour, enabling them to flag divergences that may indicate malicious activity. Their capacity to process vast amounts of contextual data allows nuanced detection of subtle threats that traditional rule-based systems might overlook. By integrating phishing simulation into user interactions, LLMs also foster cybersecurity awareness through experiential learning, helping users recognize and respond to deceptive tactics in real time. This dual role, as both a security tool and an informative agent, positions LLMs as pivotal in the evolving landscape of mobile cybersecurity.
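In the spirit of the permission-checking behaviour described above, the sketch below shows a lightweight rule-based pre-filter that could run before any heavier LLM-based review; the permission weights and threshold are assumptions for illustration only.

```python
# Sketch: rule-based permission risk pre-filter ahead of deeper LLM analysis.
RISKY_PERMISSIONS = {
    "READ_SMS": 3, "RECORD_AUDIO": 3, "ACCESS_FINE_LOCATION": 2,
    "READ_CONTACTS": 2, "SYSTEM_ALERT_WINDOW": 2, "CAMERA": 1,
}

def permission_risk(requested, threshold=5):
    """Return (score, flagged) for a list of requested Android permissions."""
    score = sum(RISKY_PERMISSIONS.get(p, 0) for p in requested)
    return score, score >= threshold

score, flagged = permission_risk(["READ_SMS", "RECORD_AUDIO", "INTERNET"])
print(score, "escalate to LLM review" if flagged else "low risk")
```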

7. Conclusion

Large language models (LLMs) have revolutionized the domains of privacy and cybersecurity. These models, which
include cutting-edge programs like GPT-4.5, Claude 3, and Gemini 1.5, have proven to be exceptionally good at
automated code inspection, threat detection, vulnerability assessment, and behavioural analysis. In contrast to
conventional techniques that mainly depend on human supervision and preset signatures, LLMs are able to process
large datasets instantly, spot minute irregularities, and adjust to changing threat environments. Additionally, the
advent of open-source models like Mixtral and LLaMA 3 gives institutions and developers the freedom to implement
privacy-preserving AI solutions locally, improving control over sensitive data and meeting strict compliance
standards. These models support proactive tactics like insider threat detection and predictive risk analytics in
addition to speeding up response times and increasing the accuracy of cyber threat identification.

Crucially, the application of LLMs in interface design and user education enables people to comprehend and control
their security posture more effectively. AI-enhanced tools are transforming cybersecurity from a reactive field to
one that is proactive, intelligent, and more individualized, whether in personal devices, mobile ecosystems, or
enterprise settings. In conclusion, the use of LLMs in the privacy and security fields represents a major advancement.
They help systems that must fend off increasingly complex cyberthreats by bridging technical gaps, minimizing
human error, and adding scalable, adaptive intelligence. AI's role in protecting digital infrastructure will only grow
as it develops, turning LLMs into strategic assets rather than merely tools in contemporary cybersecurity.

8. Future Scope

Large language models (LLMs) have a wide and bright future in the fields of security and privacy. To enable more
thorough and accurate threat detection, a significant development will be the integration of multimodal data, which
combines text, images, audio, video, and network telemetry [9][10]. This all-encompassing strategy will enable the
early detection of sophisticated cyberattacks, including complex multi-vector intrusions and deepfakes.
Furthermore, the adoption of federated and privacy-preserving AI techniques will be fueled by privacy regulations
and concerns, allowing LLMs to learn from decentralized data without disclosing sensitive information. While
preserving data confidentiality, this will encourage cooperative security intelligence sharing between enterprises.

The creation of explainable AI models that transparently justify their security assessments and judgments is another
crucial future path that will boost confidence and help analysts validate alerts [9]. It is also anticipated that LLMs
will power real-time, adaptive defence systems that can react to threats without the need for human intervention
by patching vulnerabilities or isolating compromised devices [13]. Additionally, LLMs may be involved in the
development and administration of quantum-resistant encryption protocols, which would improve long-term data
security, given that quantum computing presents difficulties for existing cryptographic techniques.


background image

AMERICAN ACADEMIC PUBLISHER


https://www.academicpublishers.org/journals/index.php/ijns

40

In order to lessen human error, which is still a significant vulnerability, personal AI-driven security assistants will
proliferate, providing users with customized advice based on their behaviour and risk profiles. To secure these
widely dispersed and frequently vulnerable devices, lightweight LLM versions will be implemented on IoT and
edge devices, enabling on-device anomaly detection and privacy-focused data processing [14]. Lastly, there will be
a deeper collaboration between AI and human cybersecurity experts as LLMs enhance human intuition with data-
driven insights, resulting in more efficient threat hunting and incident response. When taken together, these
developments will revolutionise security and privacy frameworks, making them more intelligent, self-governing,
and user-centred.

References

1. Gu, S. (2024). A Survey of Large Language Models in Tourism (Tourism LLMs). Qeios. https://doi.org/10.32388/8r27cj
2. Chan, S.-H. (2025). Encrypted Prompt: Securing LLM Applications Against Unauthorized Actions. arXiv. https://arxiv.org/abs/2503.23250 [Accessed 24 Jun. 2025].
3. Chen, L., Sinha, P., & Tan, D. (2019). AI-powered code analysis for mobile security. Journal of Cybersecurity Research, 8(2), 101-115. https://doi.org/10.0000/jcr.2019.0101
4. Singh, R., & Jain, N. (2020). Anomaly detection in mobile applications using ML. Mobile Computing and Networks, 6(2), 75-89. https://doi.org/10.0000/mcn.2020.0266
5. Li, W., & Zhao, H. (2021). Behavioural analysis in mobile security using AI. International Journal of Mobile Computing and Security, 13(4), 220-237. https://doi.org/10.0000/ijmcs.2021.0430
6. Martinez, C., Varga, E., & Menon, S. (2022). AI and ML in penetration testing for mobile applications. Cybersecurity Advances, 9(1), 44-58. https://doi.org/10.0000/cyberadv.2022.0102
7. Sharma, K., & Kaul, R. (2018). Static and dynamic analysis in mobile security. International Journal of Information Security, 11(3), 183-197. https://doi.org/10.0000/ijis.2018.0380
8. Kumar, A., & Desai, M. (2023). Integrating LLMs for real-time cybersecurity monitoring. AI and Cyber Defence Journal, 5(1), 12-29. https://doi.org/10.0000/aicdj.2023.0012
9. Alvarez, J., & Mehta, T. (2025). Explainable AI in behavioural threat detection. Next-Gen Cybersecurity Review, 2(2), 90-106. https://doi.org/10.0000/ngcsr.2025.0022
10. Fernando, T., & Almeida, R. (2023). Using machine learning to detect zero-day exploits. AI in Cyber Defence, 3(3), 70-88. https://doi.org/10.0000/aicd.2023.0303
11. Raj, S., & Verma, P. (2025). Quantum-resistant cryptographic protocols with AI assistance. Journal of Advanced Cyber Engineering, 4(1), 55-73. https://doi.org/10.0000/jace.2025.0004
12. Lo, K. M., Huang, Z., Qiu, Z., Wang, Z., & Fu, J. (2024). A Closer Look into Mixture-of-Experts in Large Language Models. arXiv. https://arxiv.org/abs/2406.18219
13. Kakoulli, E., Zacharioudakis, E., & Evripidou, S. (2025). Intelligent Cyber Defense: Leveraging LLMs for Real-Time Threat Detection and Analysis. Lecture Notes in Business Information Processing, 58-73. https://doi.org/10.1007/978-3-031-81322-1_5
14. Otoum, Y., Asad, A., & Nayak, A. (2025). LLM-Based Threat Detection and Prevention Framework for IoT Ecosystems. arXiv. https://arxiv.org/abs/2505.00240 [Accessed 25 Jun. 2025].
15. Bian, Y., Song, Y., Ma, G., Zhu, R., & Cai, Z. (2025). DroidRetriever: An Autonomous Navigation and Information Integration System Facilitating Mobile Sensemaking. arXiv. https://arxiv.org/abs/2505.03364 [Accessed 25 Jun. 2025].
