April 4, 2024, 4:10 a.m. | Yunzhuo Hao, Wenkai Yang, Yankai Lin

cs.CR updates on arXiv.org arxiv.org

arXiv:2404.02406v1 Announce Type: new
Abstract: Recent researches have shown that Large Language Models (LLMs) are susceptible to a security threat known as Backdoor Attack. The backdoored model will behave well in normal cases but exhibit malicious behaviours on inputs inserted with a specific backdoor trigger. Current backdoor studies on LLMs predominantly focus on instruction-tuned LLMs, while neglecting another realistic scenario where LLMs are fine-tuned on multi-turn conversational data to be chat models. Chat models are extensively adopted across various real-world …

arxiv attack backdoor backdoor attack cases chat cs.ai cs.cl cs.cr current focus inputs language language models large llms malicious normal security security threat studies threat trigger vulnerabilities

Azure DevSecOps Cloud Engineer II

@ Prudent Technology | McLean, VA, USA

Security Engineer III - Python, AWS

@ JPMorgan Chase & Co. | Bengaluru, Karnataka, India

SOC Analyst (Threat Hunter)

@ NCS | Singapore, Singapore

Managed Services Information Security Manager

@ NTT DATA | Sydney, Australia

Senior Security Engineer (Remote)

@ Mattermost | United Kingdom

Penetration Tester (Part Time & Remote)

@ TestPros | United States - Remote