Feb. 14, 2024, 5:10 a.m. | Xiangming Gu Xiaosen Zheng Tianyu Pang Chao Du Qian Liu Ye Wang Jing Jiang Min Lin

cs.CR updates on arXiv.org arxiv.org

A multimodal large language model (MLLM) agent can receive instructions, capture images, retrieve histories from memory, and decide which tools to use. Nonetheless, red-teaming efforts have revealed that adversarial images/prompts can jailbreak an MLLM and cause unaligned behaviors. In this work, we report an even more severe safety issue in multi-agent environments, referred to as infectious jailbreak. It entails the adversary simply jailbreaking a single agent, and without any further intervention from the adversary, (almost) all agents will become infected …

cs.cl cs.cr cs.cv cs.lg cs.ma

Information Security Engineers

@ D. E. Shaw Research | New York City

Technology Security Analyst

@ Halton Region | Oakville, Ontario, Canada

Senior Cyber Security Analyst

@ Valley Water | San Jose, CA

Computer and Forensics Investigator

@ ManTech | 221BQ - Cstmr Site,Springfield,VA

Senior Security Analyst

@ Oracle | United States

Associate Vulnerability Management Specialist

@ Diebold Nixdorf | Hyderabad, Telangana, India