Jan. 8, 2024, 2:10 a.m. | Renjie Pi, Tianyang Han, Yueqi Xie, Rui Pan, Qing Lian, Hanze Dong, Jipeng Zhang, Tong Zhang

cs.CR updates on arXiv.org

The deployment of multimodal large language models (MLLMs) has brought forth
a unique vulnerability: susceptibility to malicious attacks through visual
inputs. We delve into the novel challenge of defending MLLMs against such
attacks. We discovered that images act as a "foreign language" that is not
considered during alignment, which can make MLLMs prone to producing harmful
responses. Unfortunately, unlike the discrete tokens handled by text-based
LLMs, the continuous nature of image signals presents significant alignment
challenges, which makes it difficult to …
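The contrast between discrete text tokens and continuous pixel values is the crux of the alignment gap the abstract describes. The sketch below is illustrative only and is not the authors' method: it uses a toy stand-in "vision encoder" and a hypothetical scalar "unsafe" score to show how an attacker can run a projected-gradient-style perturbation directly on the continuous image tensor, an attack surface that does not exist for discrete tokens.

```python
# Illustrative only: a toy PGD-style perturbation of a continuous image input.
# The "vision_encoder" and its scalar "unsafe" score are hypothetical stand-ins,
# not components of the paper or of any real MLLM.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the (frozen) vision branch of an MLLM.
vision_encoder = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 32 * 32, 64),
    nn.ReLU(),
    nn.Linear(64, 1),  # pretend this scores how "unsafe" the downstream response would be
)

image = torch.rand(1, 3, 32, 32)                     # benign image, pixels in [0, 1]
delta = torch.zeros_like(image, requires_grad=True)  # continuous, differentiable perturbation
epsilon, step_size, steps = 8 / 255, 2 / 255, 10

for _ in range(steps):
    score = vision_encoder(image + delta)            # differentiable w.r.t. the pixels
    score.sum().backward()
    with torch.no_grad():
        delta += step_size * delta.grad.sign()                # ascend the "unsafe" score
        delta.clamp_(-epsilon, epsilon)                       # stay within an imperceptible L-inf ball
        delta.copy_((image + delta).clamp(0, 1) - image)      # keep perturbed pixels valid
    delta.grad.zero_()

print(f"max pixel change: {delta.abs().max().item():.4f}")
# No analogous gradient step exists over discrete text tokens, which is why
# alignment tuned on text alone can leave the visual channel exposed.
```

In a real attack the score would come from the full multimodal model rather than a toy network; the point of the sketch is only that the image channel is continuously optimizable, which is what makes it hard to cover all possible inputs during alignment.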

