all InfoSec news
SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding
Feb. 15, 2024, 5:10 a.m. | Zhangchen Xu, Fengqing Jiang, Luyao Niu, Jinyuan Jia, Bill Yuchen Lin, Radha Poovendran
cs.CR updates on arXiv.org arxiv.org
Abstract: As large language models (LLMs) become increasingly integrated into real-world applications such as code generation and chatbot assistance, extensive efforts have been made to align LLM behavior with human values, including safety. Jailbreak attacks, aiming to provoke unintended and unsafe behaviors from LLMs, remain a significant/leading LLM safety threat. In this paper, we aim to defend LLMs against jailbreak attacks by introducing SafeDecoding, a safety-aware decoding strategy for LLMs to generate helpful and harmless responses …
applications arxiv assistance attacks aware chatbot code cs.ai cs.cr decoding defending human human values jailbreak language language models large llm llms real safety world
More from arxiv.org / cs.CR updates on arXiv.org
Jobs in InfoSec / Cybersecurity
CyberSOC Technical Lead
@ Integrity360 | Sandyford, Dublin, Ireland
Cyber Security Strategy Consultant
@ Capco | New York City
Cyber Security Senior Consultant
@ Capco | Chicago, IL
Sr. Product Manager
@ MixMode | Remote, US
Corporate Intern - Information Security (Year Round)
@ Associated Bank | US WI Remote
Senior Offensive Security Engineer
@ CoStar Group | US-DC Washington, DC