March 27, 2024, 4:11 a.m. | Zhiyuan Yu, Xiaogeng Liu, Shunning Liang, Zach Cameron, Chaowei Xiao, Ning Zhang

cs.CR updates on arXiv.org

arXiv:2403.17336v1 Announce Type: new
Abstract: Recent advancements in generative AI have enabled ubiquitous access to large language models (LLMs). Empowered by their exceptional capabilities to understand and generate human-like text, these models are increasingly being integrated into our society. At the same time, there are also concerns about the potential misuse of this powerful technology, prompting defensive measures from service providers. To overcome such protection, jailbreaking prompts have recently emerged as one of the most effective mechanisms to circumvent security …

