all InfoSec news
Subtoxic Questions: Dive Into Attitude Change of LLM's Response in Jailbreak Attempts
April 15, 2024, 4:10 a.m. | Tianyu Zhang, Zixuan Zhao, Jiaqi Huang, Jingyu Hua, Sheng Zhong
cs.CR updates on arXiv.org arxiv.org
Abstract: As Large Language Models (LLMs) of Prompt Jailbreaking are getting more and more attention, it is of great significance to raise a generalized research paradigm to evaluate attack strengths and a basic model to conduct subtler experiments. In this paper, we propose a novel approach by focusing on a set of target questions that are inherently more sensitive to jailbreak prompts, aiming to circumvent the limitations posed by enhanced LLM security. Through designing and analyzing …
arxiv attack attention basic change cs.ai cs.cl cs.cr dive great jailbreak jailbreaking language language models large llm llms paradigm prompt questions research response
More from arxiv.org / cs.CR updates on arXiv.org
Jobs in InfoSec / Cybersecurity
Social Engineer For Reverse Engineering Exploit Study
@ Independent study | Remote
Information Security Specialist, Sr. (Container Hardening)
@ Rackner | San Antonio, TX
Principal Security Researcher (Advanced Threat Prevention)
@ Palo Alto Networks | Santa Clara, CA, United States
EWT Infosec | IAM Technical Security Consultant - Manager
@ KPMG India | Bengaluru, Karnataka, India
Security Engineering Operations Manager
@ Gusto | San Francisco, CA; Denver, CO; Remote
Network Threat Detection Engineer
@ Meta | Denver, CO | Reston, VA | Menlo Park, CA | Washington, DC