CleanGen: Mitigating Backdoor Attacks for Generation Tasks in Large Language Models | allinfosecnews.com

June 19, 2024, 4:19 a.m. | Yuetai Li, Zhangchen Xu, Fengqing Jiang, Luyao Niu, Dinuka Sahabandu, Bhaskar Ramasubramanian, Radha Poovendran

cs.CR updates on arXiv.org arxiv.org

arXiv:2406.12257v1 Announce Type: cross
Abstract: The remarkable performance of large language models (LLMs) in generation tasks has enabled practitioners to leverage publicly available models to power custom applications, such as chatbots and virtual assistants. However, the data used to train or fine-tune these LLMs is often undisclosed, allowing an attacker to compromise the data and inject backdoors into the models. In this paper, we develop a novel inference-time defense, named CleanGen, to mitigate backdoor attacks for generation tasks in …


