April 16, 2024, 4:11 a.m. | Dongyu Yao, Jianshu Zhang, Ian G. Harris, Marcel Carlsson

cs.CR updates on arXiv.org arxiv.org

arXiv:2309.05274v2 Announce Type: replace
Abstract: Jailbreak vulnerabilities in Large Language Models (LLMs), which exploit meticulously crafted prompts to elicit content that violates service guidelines, have captured the attention of research communities. While model owners can defend against individual jailbreak prompts through safety training strategies, this relatively passive approach struggles to handle the broader category of similar jailbreaks. To tackle this issue, we introduce FuzzLLM, an automated fuzzing framework designed to proactively test and discover jailbreak vulnerabilities in LLMs. We utilize …

arxiv attention can communities cs.cr exploit framework fuzzing guidelines jailbreak language language models large llms novel prompts research safety service strategies training vulnerabilities

Social Engineer For Reverse Engineering Exploit Study

@ Independent study | Remote

Security Engineer II- Full stack Java with React

@ JPMorgan Chase & Co. | Hyderabad, Telangana, India

Cybersecurity SecOps

@ GFT Technologies | Mexico City, MX, 11850

Senior Information Security Advisor

@ Sun Life | Sun Life Toronto One York

Contract Special Security Officer (CSSO) - Top Secret Clearance

@ SpaceX | Hawthorne, CA

Early Career Cyber Security Operations Center (SOC) Analyst

@ State Street | Quincy, Massachusetts