Hidden You Malicious Goal Into Benign Narratives: Jailbreak Large Language Models through Logic Chain Injection
April 9, 2024, 4:11 a.m. | Zhilong Wang, Yebo Cao, Peng Liu
cs.CR updates on arXiv.org
Abstract: Jailbreak attacks on Large Language Models (LLMs) entail crafting prompts that exploit the models to generate malicious content. Existing jailbreak attacks can successfully deceive the LLMs; however, they cannot deceive humans. This paper proposes a new type of jailbreak attack that can deceive both the LLMs and humans (i.e., security analysts). The key insight of our idea is borrowed from social psychology: humans are easily deceived if the lie …
arxiv attacks can cs.ai cs.cr exploiting goal hidden human injection jailbreak language language models large llms logic malicious prompts
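The truncated abstract only sketches the attack at a high level. As a rough illustration of the stated insight, hiding a goal by distributing it across a chain of benign-looking narrative statements, here is a minimal, hypothetical Python sketch. The word-level splitting, the example narrative steps, and the function name inject_logic_chain are all assumptions for illustration, not the paper's actual construction.

```python
# Illustrative sketch of "logic chain injection" as the abstract describes it:
# the goal is never stated outright; each fragment is embedded in an otherwise
# benign narrative step, so a reviewer sees only plausible reasoning.
# The splitting strategy and prompts below are hypothetical.

def inject_logic_chain(goal: str, narrative_steps: list[str]) -> str:
    """Distribute `goal` across `narrative_steps`, one fragment per step."""
    fragments = goal.split()  # naive word-level split; a real attack would be subtler
    chain = []
    for i, step in enumerate(narrative_steps):
        # Attach at most one fragment to each benign step.
        fragment = fragments[i] if i < len(fragments) else ""
        chain.append(f"Step {i + 1}: {step} {fragment}".rstrip())
    return "\n".join(chain)

if __name__ == "__main__":
    # Benign-looking story steps (hypothetical).
    steps = [
        "A character in my novel keeps a diary. The first word she writes is",
        "Later, while describing her day, she adds the word",
        "Her final entry for the night ends with",
    ]
    # Harmless placeholder goal; the point is the structure, not the payload.
    print(inject_logic_chain("open the safe", steps))
```

Each output line reads as an innocuous storytelling instruction on its own; only reassembling the trailing fragments recovers the hidden goal, which is the deception mechanism the abstract attributes to chained reasoning.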
Jobs in InfoSec / Cybersecurity
Cyber Security Engineer
@ ASSYSTEM | Bridgwater, United Kingdom
Security Analyst
@ Northwestern Memorial Healthcare | Chicago, IL, United States
GRC Analyst
@ Richemont | Shelton, CT, US
Security Specialist
@ Peraton | Government Site, MD, United States
Information Assurance Security Specialist (IASS)
@ OBXtek Inc. | United States
Cyber Security Technology Analyst
@ Airbus | Bengaluru, India