Feb. 9, 2024, 5:10 a.m. | Junjie Chu Yugeng Liu Ziqing Yang Xinyue Shen Michael Backes Yang Zhang

cs.CR updates on arXiv.org arxiv.org

Misuse of the Large Language Models (LLMs) has raised widespread concern. To address this issue, safeguards have been taken to ensure that LLMs align with social ethics. However, recent findings have revealed an unsettling vulnerability bypassing the safeguards of LLMs, known as jailbreak attacks. By applying techniques, such as employing role-playing scenarios, adversarial examples, or subtle subversion of safety objectives as a prompt, LLMs can produce an inappropriate or even harmful response. While researchers have studied several categories of jailbreak …

address adversarial assessment attacks bypassing cs.ai cs.cl cs.cr cs.lg ethics findings issue jailbreak language language models large llms role safeguards social taken techniques vulnerability

SOC 2 Manager, Audit and Certification

@ Deloitte | US and CA Multiple Locations

Senior Security Researcher, SIEM

@ Huntress | Remote Canada

Senior Application Security Engineer

@ Revinate | San Francisco Bay Area

Cyber Security Manager

@ American Express Global Business Travel | United States - New York - Virtual Location

Incident Responder Intern

@ Bentley Systems | Remote, PA, US

SC2024-003533 Senior Online Vulnerability Assessment Analyst (CTS) - THU 9 May

@ EMW, Inc. | Mons, Wallonia, Belgium