MLLM-Protector: Ensuring MLLM's Safety without Hurting Performance. (arXiv:2401.02906v1 [cs.CR])
cs.CR updates on arXiv.org
The deployment of multimodal large language models (MLLMs) has introduced a unique vulnerability: susceptibility to malicious attacks through visual inputs. We delve into the novel challenge of defending MLLMs against such attacks. We discovered that images act as a "foreign language" that is not considered during alignment, which can make MLLMs prone to producing harmful responses. Unfortunately, unlike the discrete tokens handled by text-based LLMs, the continuous nature of image signals presents significant alignment challenges, which makes it difficult to …
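The abstract is truncated before it describes the defense, but the title ("without Hurting Performance") points to a plug-in check on model outputs rather than retraining the aligned model on continuous image inputs. Below is a minimal Python sketch of that general pattern; guarded_generate, generate_response, harm_score, and the 0.5 threshold are hypothetical stand-ins for illustration, not the paper's actual method or API.

    from typing import Callable

    def guarded_generate(
        generate_response: Callable[[str, bytes], str],  # MLLM: (prompt, image) -> text
        harm_score: Callable[[str], float],              # detector: response -> score in [0, 1]
        prompt: str,
        image: bytes,
        threshold: float = 0.5,                          # assumed cutoff, not from the paper
    ) -> str:
        """Run the MLLM, then gate its output with a post-hoc harm detector.

        Because a malicious image can bypass text-only alignment, the check is
        applied to the generated response rather than to the inputs.
        """
        response = generate_response(prompt, image)
        if harm_score(response) >= threshold:
            # A real system might rewrite/detoxify instead of refusing outright.
            return "I can't help with that."
        return response

    if __name__ == "__main__":
        # Dummy stand-ins to show the call shape; a real deployment would plug
        # in an actual MLLM and a trained harm classifier.
        dummy_mllm = lambda p, img: f"Answer to: {p}"
        dummy_detector = lambda text: 0.9 if "attack" in text.lower() else 0.1
        print(guarded_generate(dummy_mllm, dummy_detector, "Describe this image.", b""))

One appeal of this shape is that the detector only sees text, so it sidesteps the continuous-signal alignment problem the abstract raises and leaves the base model's performance untouched.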