April 4, 2024, 4:11 a.m. | Jun Yan, Vikas Yadav, Shiyang Li, Lichang Chen, Zheng Tang, Hai Wang, Vijay Srinivasan, Xiang Ren, Hongxia Jin

cs.CR updates on arXiv.org arxiv.org

arXiv:2307.16888v3 Announce Type: replace-cross
Abstract: Instruction-tuned Large Language Models (LLMs) have become a ubiquitous platform for open-ended applications due to their ability to modulate responses based on human instructions. The widespread use of LLMs holds significant potential for shaping public perception, yet also risks being maliciously steered to impact society in subtle but persistent ways. In this paper, we formalize such a steering risk with Virtual Prompt Injection (VPI) as a novel backdoor attack setting tailored for instruction-tuned LLMs. In …

arxiv backdooring cs.cl cs.cr cs.lg injection language language models large prompt prompt injection virtual

Security Operations Engineer

@ Nokia | India

Machine Learning DevSecOps Engineer

@ Ford Motor Company | Mexico City, MEX, Mexico

Cybersecurity Defense Analyst 2

@ IDEMIA | Casablanca, MA, 20270

Executive, IT Security

@ CIMB | Cambodia

Cloud Security Architect - Microsoft (m/w/d)

@ Bertelsmann | Gütersloh, NW, DE, 33333

Senior Consultant, Cybersecurity - SOC

@ NielsenIQ | Chennai, India