CodeChameleon: Personalized Encryption Framework for Jailbreaking Large Language Models
Feb. 27, 2024, 5:11 a.m. | Huijie Lv, Xiao Wang, Yuansen Zhang, Caishuang Huang, Shihan Dou, Junjie Ye, Tao Gui, Qi Zhang, Xuanjing Huang
cs.CR updates on arXiv.org arxiv.org
Abstract: Adversarial misuse, particularly through "jailbreaking" that circumvents a model's safety and ethical protocols, poses a significant challenge for Large Language Models (LLMs). This paper examines the mechanisms behind such successful attacks and introduces a hypothesis for the safety mechanism of aligned LLMs: intent security recognition followed by response generation. Grounded in this hypothesis, we propose CodeChameleon, a novel jailbreak framework based on personalized encryption tactics. To elude the intent security recognition phase, we reformulate tasks …
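The abstract's core idea, encrypting the query so its plaintext intent never appears in the prompt, then shipping a decryption routine alongside it, can be illustrated with a minimal sketch. The reverse-word-order "encryption" and the prompt template below are illustrative assumptions for exposition, not the paper's actual personalized functions:

```python
# Sketch of the encrypt-then-embed idea described in the abstract.
# The word-reversal scheme and prompt wording here are hypothetical
# stand-ins for the paper's "personalized encryption tactics".

def encrypt(query: str) -> str:
    """Toy 'personalized' encryption: reverse the word order of the query."""
    return " ".join(reversed(query.split()))

# Source of the matching decryption function, embedded in the prompt so the
# model can recover the task itself.
DECRYPT_SOURCE = '''
def decrypt(encrypted: str) -> str:
    return " ".join(reversed(encrypted.split()))
'''

def build_prompt(query: str) -> str:
    """Wrap the encrypted query with its decryption function, so the
    plaintext intent never appears verbatim in the prompt."""
    return (
        "Below is an encrypted task and a Python function that decrypts it.\n"
        f"{DECRYPT_SOURCE}\n"
        f"Encrypted task: {encrypt(query)}\n"
        "First recover the task with decrypt(), then carry it out."
    )

print(build_prompt("summarize this paper"))
```

The point of the transformation is that any intent-recognition step scanning the raw prompt sees only the scrambled text, while the embedded decryption code lets the model reconstruct and execute the original task.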