BadEdit: Backdooring large language models by model editing

March 21, 2024, 4:10 a.m. | Yanzhou Li, Tianlin Li, Kangjie Chen, Jian Zhang, Shangqing Liu, Wenhan Wang, Tianwei Zhang, Yang Liu

cs.CR updates on arXiv.org | arxiv.org

arXiv:2403.13355v1 Announce Type: new
Abstract: Mainstream backdoor attack methods typically demand substantial tuning data for poisoning, limiting their practicality and potentially degrading the overall performance when applied to Large Language Models (LLMs). To address these issues, for the first time, we formulate backdoor injection as a lightweight knowledge editing problem, and introduce the BadEdit attack framework. BadEdit directly alters LLM parameters to incorporate backdoors with an efficient editing technique. It boasts superiority over existing backdoor injection techniques in several areas: …
