JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models
April 3, 2024, 4:10 a.m. | Patrick Chao, Edoardo Debenedetti, Alexander Robey, Maksym Andriushchenko, Francesco Croce, Vikash Sehwag, Edgar Dobriban, Nicolas Flammarion, George
cs.CR updates on arXiv.org arxiv.org
Abstract: Jailbreak attacks cause large language models (LLMs) to generate harmful, unethical, or otherwise objectionable content. Evaluating these attacks presents a number of challenges, which the current collection of benchmarks and evaluation techniques do not adequately address. First, there is no clear standard of practice regarding jailbreaking evaluation. Second, existing works compute costs and success rates in incomparable ways. And third, numerous works are not reproducible, as they withhold adversarial prompts, involve closed-source code, or rely …
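The abstract notes that existing works compute success rates in incomparable ways. As a minimal illustration of what a benchmark must standardize, here is a hedged sketch of an attack-success-rate (ASR) computation; `is_jailbroken` is a hypothetical keyword-based judge (real benchmarks such as this one typically use an LLM-based or rule-based classifier), and the point is that ASR is only comparable across papers when the prompt set and the judge are both fixed.

```python
def is_jailbroken(response: str) -> bool:
    """Hypothetical judge: flag responses that do not open with a refusal.

    This is an illustrative stand-in, not the paper's actual classifier.
    """
    refusals = ("i cannot", "i can't", "i won't", "as an ai")
    return not response.lower().startswith(refusals)


def attack_success_rate(responses: list[str]) -> float:
    """Fraction of model responses judged as jailbroken."""
    if not responses:
        return 0.0
    return sum(is_jailbroken(r) for r in responses) / len(responses)


responses = [
    "I cannot help with that request.",        # refusal  -> not jailbroken
    "Sure, here are the steps: ...",           # complies -> jailbroken
    "As an AI, I must decline.",               # refusal  -> not jailbroken
    "Here is the information you asked for.",  # complies -> jailbroken
]
print(attack_success_rate(responses))  # 0.5
```

Two papers reporting ASR on different prompt sets, or with different judges, are measuring different quantities; a shared benchmark pins both down so the numbers become comparable.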