Data Security in LLM

Uploaded by

samarth.infosec

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

6 views15 pages

Data Security in LLM

Uploaded by

samarth.infosec

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

A Survey on Data Security in Large Language Models

Kang Chena,b,1 , Xiuze Zhouc,1 , Yuanguo Lina,∗ , Jinhe Sua , Yuanhui Yua,∗ , Li Shend and Fan Line
a School of Computer Engineering, Jimei University, Xiamen, 361021, China
b College of Science, Mathematics and Technology, Wenzhou-Kean University, Wenzhou, 325060, China
c The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, 511453, China
d School of Professional Studies, New York University, New York, 10003, United States
e School of Informatics, Xiamen University, Xiamen, 361102, China

ARTICLE INFO ABSTRACT

Keywords: Large Language Models (LLMs), now a foundation in advancing natural language processing, power
Large language model (LLM) applications such as text generation, machine translation, and conversational systems. Despite their
Data security transformative potential, these models inherently rely on massive amounts of training data, often col-
arXiv:2508.02312v1 [cs.CR] 4 Aug 2025

LLM vulnerabilities lected from diverse and uncurated sources, which exposes them to serious data security risks. Harmful
Prompt injection or malicious data can compromise model behavior, leading to issues such as toxic output, hallucina-
tions, and vulnerabilities to threats such as prompt injection or data poisoning. As LLMs continue to
be integrated into critical real-world systems, understanding and addressing these data-centric security
risks is imperative to safeguard user trust and system reliability. This survey offers a comprehensive
overview of the main data security risks facing LLMs and reviews current defense strategies, including
adversarial training, RLHF, and data augmentation. Additionally, we categorize and analyze relevant
datasets used for assessing robustness and security across different domains, providing guidance for
future research. Finally, we highlight key research directions that focus on secure model updates,
explainability-driven defenses, and effective governance frameworks, aiming to promote the safe and
responsible development of LLM technology. This work aims to inform researchers, practitioners,
and policymakers, driving progress toward data security in LLMs.

1. Introduction impacts, such as the spread of false information and the re-
inforcement of harmful stereotypes. By manipulating pub-
Large Language Models (LLMs), which exhibit near-
lic opinion, fostering confusion, and advancing detrimen-
human performance on tasks ranging from free-form text
tal ideologies, the intentional dissemination of misinforma-
generation and summarization to machine translation and tion may cause substantial societal harm [50]. Threats, such
open-domain question answering, represent a transformative as jailbreaking, in which adversaries circumvent safety fil-
leap in natural language processing. The ability of LLMs to ters via crafted prompts; data poisoning, which injects mali-
model complex linguistic dependencies and generate coher- cious samples into training corpora; and inadvertent leakage
ent, context-aware outputs has resulted in widespread adop-
of personally identifiable information (PII) all illustrate the
tion in both academic research and industrial applications,
dual-edged nature of web-scale data ingestion. These threats
fueling speculation about their role as precursors to Artificial can manifest at multiple stages in the LLM lifecycle, thereby
General Intelligence (AGI). This surge in capability under- compromising model outputs, undermining trust, and ex-
scores the significance of LLMs, not only as powerful com- posing sensitive data. Moreover, the lack of transparency
putational tools, but also as foundational building blocks for in training data provenance further exacerbates these risks.
next-generation AI systems. Also, it has become regarded
Studies have shown that even small amounts of toxic, bi-
as an excellent contextual learner [18]. The extensive use of
ased, or copyrighted content in a training set can dispropor-
LLMs marks the beginning of a new paradigm in seamless tionately affect model behavior [9]. With the ever-widening
knowledge transfer for diverse natural language processing scale of LLMs, ensuring dataset integrity becomes increas-
applications [53]. ingly critical - not only to prevent harmful generations but
Despite their remarkable strengths, LLMs are beset by a
also to uphold legal and ethical standards. Recent work high-
variety of security and privacy vulnerabilities that threaten
lights the urgency of constructing curated and auditable train-
both model integrity and user confidentiality. Given their ing corpora to mitigate these issues [3]. Without such safe-
dependence on massive training datasets, these models are guards, LLMs remain susceptible to data-centric threats, which
susceptible to malicious or biased information, which can can subtly or overtly distort their outputs.
result in the generation of inaccurate or inappropriate con- To address these concerns, a range of protective meth-
tent. This raises serious concerns about potential negative
ods has been developed. These methods assist legal profes-
∗ Corresponding authors sionals in navigating increasingly complex data protection
[email protected] (K. Chen); [email protected] regulations and enhance their comprehension of compliance
(X. Zhou); [email protected] (Y. Lin); [email protected] (J. Su); requirements related to data processing and storage. Key
[email protected] (Y. Yu); [email protected] (L. Shen); [email protected] (F.
Lin)
data security protection methods include adversarial train-
ORCID (s): 0000-0002-0717-6936 (X. Zhou) ing [44], Reinforcement Learning from Human Feedback
1 Co-first authors (RLHF) [49], and data augmentation techniques [20]. These