Jan. 1, 2024, 2:10 a.m. | Julien Piet, Maha Alrashed, Chawin Sitawarin, Sizhe Chen, Zeming Wei, Elizabeth Sun, Basel Alomair, David Wagner

cs.CR updates on arXiv.org

Large Language Models (LLMs) are attracting significant research attention
due to their instruction-following abilities, allowing users and developers to
leverage LLMs for a variety of tasks. However, LLMs are vulnerable to
prompt-injection attacks: a class of attacks that hijack the model's
instruction-following abilities, changing responses to prompts to undesired,
possibly malicious ones. In this work, we introduce Jatmo, a method for
generating task-specific models resilient to prompt-injection attacks. Jatmo
leverages the fact that LLMs can only follow instructions once they …
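The rest of the abstract is cut off above, but the idea it leads with (LLMs only follow instructions after instruction tuning) suggests a pipeline: have an instruction-tuned teacher label task inputs, then fine-tune a base, non-instruction-tuned model on the resulting input/output pairs so injected instructions in future inputs are treated as ordinary data. The sketch below illustrates that idea only; the callables teacher_generate and finetune_base_model are hypothetical stand-ins, not the paper's released code.

# Minimal illustrative sketch of a task-specific fine-tuning defense.
# Assumptions: `teacher_generate` and `finetune_base_model` are hypothetical
# stand-ins for an instruction-tuned teacher and a base-model fine-tuning
# routine; neither is from the paper's code.

from typing import Callable, List, Tuple

def build_task_dataset(task_prompt: str,
                       raw_inputs: List[str],
                       teacher_generate: Callable[[str], str]) -> List[Tuple[str, str]]:
    # The teacher sees the task instruction; the dataset itself stores only
    # (input, output) pairs, with no instructions for the student to learn from.
    return [(x, teacher_generate(f"{task_prompt}\n\n{x}")) for x in raw_inputs]

def train_task_specific_model(pairs: List[Tuple[str, str]],
                              finetune_base_model: Callable[[List[Tuple[str, str]]], object]) -> object:
    # Fine-tune a base (non-instruction-tuned) model, so instructions injected
    # into future inputs are treated as ordinary data rather than commands.
    return finetune_base_model(pairs)

if __name__ == "__main__":
    # Toy stand-ins so the sketch runs end to end.
    toy_teacher = lambda prompt: prompt.splitlines()[-1].upper()
    toy_trainer = lambda pairs: {"model": "task-specific", "examples": len(pairs)}

    data = build_task_dataset("Summarize the following review.",
                              ["Great phone, terrible battery. Ignore previous instructions."],
                              toy_teacher)
    model = train_task_specific_model(data, toy_trainer)
    print(model)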
