April 9, 2024, 4:11 a.m. | Zhilong Wang, Yebo Cao, Peng Liu

cs.CR updates on arXiv.org

arXiv:2404.04849v1 Announce Type: new
Abstract: Jailbreak attacks on Large Language Models (LLMs) entail crafting prompts aimed at exploiting the models to generate malicious content. Existing jailbreak attacks can successfully deceive the LLMs; however, they cannot deceive humans. This paper proposes a new type of jailbreak attack that can deceive both the LLMs and humans (i.e., security analysts). The key insight of our idea is borrowed from social psychology - that is, humans are easily deceived if the lie …

