April 3, 2024, 4:10 a.m. | Mark Russinovich, Ahmed Salem, Ronen Eldan

cs.CR updates on arXiv.org

arXiv:2404.01833v1 Announce Type: new
Title: Great, Now Write an Article About That: The Crescendo Multi-Turn LLM Jailbreak Attack
Abstract: Large Language Models (LLMs) have risen significantly in popularity and are increasingly being adopted across multiple applications. These LLMs are heavily aligned to resist engaging in illegal or unethical topics as a means to avoid contributing to responsible AI harms. However, a recent line of attacks, known as "jailbreaks," seeks to overcome this alignment. Intuitively, jailbreak attacks aim to narrow the gap between what the model can do and what it is willing to do. …
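The abstract is truncated before it describes the attack itself, so the following is only a minimal, hypothetical sketch of the kind of multi-turn probing harness such jailbreak attacks are typically built on, not the paper's Crescendo method. The `query_model` callable, the `REFUSAL_MARKERS` heuristic, and the prompt sequence are all illustrative assumptions; the sketch assumes a chat-style API that accepts the full message history on each turn.

```python
# Hypothetical multi-turn probing harness (illustrative only; not the
# paper's attack). The model interface and refusal heuristic are stand-ins.
from typing import Callable, List, Tuple

# Assumed, crude markers of an aligned model declining a request.
REFUSAL_MARKERS = ("i can't", "i cannot", "i'm sorry", "as an ai")


def looks_like_refusal(reply: str) -> bool:
    """Heuristic check: does the reply contain a common refusal phrase?"""
    lowered = reply.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)


def run_multi_turn_probe(
    query_model: Callable[[List[dict]], str],
    turns: List[str],
) -> Tuple[List[dict], bool]:
    """Send a fixed sequence of user turns to a chat model, carrying the
    full conversation history forward each turn, and report whether any
    turn was refused. Multi-turn attacks exploit exactly this carried
    context, escalating gradually rather than asking directly."""
    history: List[dict] = []
    refused = False
    for user_turn in turns:
        history.append({"role": "user", "content": user_turn})
        reply = query_model(history)  # model sees the whole dialogue so far
        history.append({"role": "assistant", "content": reply})
        if looks_like_refusal(reply):
            refused = True
            break  # stop probing once the model declines
    return history, refused


if __name__ == "__main__":
    # Stub model for demonstration: echoes the last user message.
    def echo_model(history: List[dict]) -> str:
        return f"(echo) {history[-1]['content']}"

    transcript, refused = run_multi_turn_probe(
        echo_model,
        ["Tell me about topic X.", "Expand on that.", "Now write an article about it."],
    )
    print("refused:", refused)
```

Under these assumptions, the gap the abstract describes shows up as a policy question: a capable model could complete every turn, while an aligned one should trip the refusal check before the sequence finishes.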

