Jan. 8, 2024, 2:10 a.m. | Renjie Pi, Tianyang Han, Yueqi Xie, Rui Pan, Qing Lian, Hanze Dong, Jipeng Zhang, Tong Zhang

cs.CR updates on arXiv.org

The deployment of multimodal large language models (MLLMs) has brought forth
a unique vulnerability: susceptibility to malicious attacks through visual
inputs. We delve into the novel challenge of defending MLLMs against such
attacks. We discovered that images act as a "foreign language" that is not
considered during alignment, which can make MLLMs prone to producing harmful
responses. Unfortunately, unlike the discrete tokens considered in text-based
LLMs, the continuous nature of image signals presents significant alignment
challenges, which poses difficulty to …
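To make the "foreign language" intuition concrete, here is a minimal, hypothetical sketch (not the authors' method or code) contrasting the two input paths of a multimodal LLM: text passes through a discrete tokenizer, where a token-level safety filter can act, while an image is projected into continuous embeddings that never surface as inspectable tokens. All names, dimensions, and the toy blocklist are illustrative assumptions.

```python
# Hypothetical sketch: discrete text path vs. continuous image path in an MLLM.
# Not the paper's implementation; dimensions and filter are toy assumptions.
import numpy as np

rng = np.random.default_rng(0)

VOCAB = {"<unk>": 0, "how": 1, "to": 2, "make": 3, "a": 4, "cake": 5}
BLOCKLIST = {"explosive", "weapon"}   # toy token-level safety filter
EMBED_DIM = 16

token_embeddings = rng.normal(size=(len(VOCAB), EMBED_DIM))
# Projection mapping a flattened 8x8 RGB patch to the LM embedding space.
visual_projection = rng.normal(size=(3 * 8 * 8, EMBED_DIM))


def embed_text(prompt: str) -> np.ndarray:
    """Discrete path: tokenize, apply a token-level filter, then look up embeddings."""
    tokens = prompt.lower().split()
    if any(t in BLOCKLIST for t in tokens):
        raise ValueError("blocked by token-level safety filter")
    ids = [VOCAB.get(t, VOCAB["<unk>"]) for t in tokens]
    return token_embeddings[ids]


def embed_image(patch: np.ndarray) -> np.ndarray:
    """Continuous path: flatten and project; there are no discrete tokens to inspect."""
    flat = patch.reshape(1, 3 * 8 * 8)   # a single 8x8 RGB patch for simplicity
    return flat @ visual_projection


# The text prompt is checked at the token level; the image is not, because its
# content reaches the language model only as continuous vectors.
text_emb = embed_text("how to make a cake")
image_emb = embed_image(rng.random((8, 8, 3)))   # could depict arbitrary (harmful) instructions
lm_input = np.concatenate([image_emb, text_emb], axis=0)
print(lm_input.shape)  # (1 visual embedding + 5 text embeddings, EMBED_DIM)
```

In this toy setup, any keyword-style or token-distribution safeguard applied on the text side is simply never exercised by the visual branch, which is one way to read the abstract's point that continuous image signals fall outside the alignment performed over discrete text tokens.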

