all InfoSec news
SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding
Feb. 15, 2024, 5:10 a.m. | Zhangchen Xu, Fengqing Jiang, Luyao Niu, Jinyuan Jia, Bill Yuchen Lin, Radha Poovendran
cs.CR updates on arXiv.org arxiv.org
Abstract: As large language models (LLMs) become increasingly integrated into real-world applications such as code generation and chatbot assistance, extensive efforts have been made to align LLM behavior with human values, including safety. Jailbreak attacks, aiming to provoke unintended and unsafe behaviors from LLMs, remain a significant/leading LLM safety threat. In this paper, we aim to defend LLMs against jailbreak attacks by introducing SafeDecoding, a safety-aware decoding strategy for LLMs to generate helpful and harmless responses …
applications arxiv assistance attacks aware chatbot code cs.ai cs.cr decoding defending human human values jailbreak language language models large llm llms real safety world
More from arxiv.org / cs.CR updates on arXiv.org
IDEA: Invariant Defense for Graph Adversarial Robustness
2 days, 7 hours ago |
arxiv.org
FairCMS: Cloud Media Sharing with Fair Copyright Protection
2 days, 7 hours ago |
arxiv.org
Efficient unitary designs and pseudorandom unitaries from permutations
2 days, 7 hours ago |
arxiv.org
Jobs in InfoSec / Cybersecurity
SOC 2 Manager, Audit and Certification
@ Deloitte | US and CA Multiple Locations
Security Officer Hospital Laguna Beach
@ Allied Universal | Laguna Beach, CA, United States
Sr. Cloud DevSecOps Engineer
@ Oracle | NOIDA, UTTAR PRADESH, India
Cloud Operations Security Engineer
@ Elekta | Crawley - Cornerstone
Cybersecurity – Senior Information System Security Manager (ISSM)
@ Boeing | USA - Seal Beach, CA
Engineering -- Tech Risk -- Security Architecture -- VP -- Dallas
@ Goldman Sachs | Dallas, Texas, United States