Adversarial Illusions in Multi-Modal Embeddings. (arXiv:2308.11804v2 [cs.CR] UPDATED)
Source: cs.CR updates on arXiv.org
Multi-modal embeddings encode images, sounds, text, video, and other inputs into a
single embedding space, aligning representations across modalities (e.g.,
associating an image of a dog with a barking sound). We show that multi-modal
embeddings can be vulnerable to an attack we call "adversarial illusions."
Given an image or a sound, an adversary can perturb it so as to make its
embedding close to an arbitrary, adversary-chosen input in another modality.
This enables the adversary to align any image and any sound …
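
The attack described above can be illustrated with a minimal sketch. It is not the paper's implementation: the two "encoders" here are hypothetical random linear maps (W_img, W_aud) standing in for a real multi-modal model, and the perturbation budget eps, the step size, and the signed-gradient update are all assumptions chosen for illustration. The sketch perturbs an "image" vector within a small L-infinity ball so that its embedding moves toward the embedding of an unrelated "sound" vector, measured by cosine similarity.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def illusion(x, target_emb, W, eps=0.1, step=0.01, iters=200):
    """Perturb x (within an L-inf ball of radius eps) so that the embedding
    W @ x aligns with target_emb. W is a toy linear encoder (assumption)."""
    t = target_emb / np.linalg.norm(target_emb)
    delta = np.zeros_like(x)
    for _ in range(iters):
        z = W @ (x + delta)
        n = np.linalg.norm(z)
        # Gradient of cos(z, t) with respect to z, for unit-norm t.
        g_z = (t - (z @ t / n) * (z / n)) / n
        # Chain rule through the linear encoder.
        g_x = W.T @ g_z
        # Signed-gradient ascent step, then project back into the eps-ball.
        delta = np.clip(delta + step * np.sign(g_x), -eps, eps)
    return x + delta

# Hypothetical stand-ins for an image encoder and an audio encoder
# that map into the same 16-dimensional embedding space.
rng = np.random.default_rng(0)
W_img = rng.normal(size=(16, 64))
W_aud = rng.normal(size=(16, 32))
image = rng.normal(size=64)
sound = rng.normal(size=32)

target = W_aud @ sound                      # adversary-chosen target embedding
adv = illusion(image, target, W_img, eps=0.1)

before = cosine(W_img @ image, target)
after = cosine(W_img @ adv, target)
print(f"cosine before: {before:.3f}, after: {after:.3f}")
```

With a real model, the hand-written gradient would be replaced by automatic differentiation through the encoder, and the perturbed input would typically also be clipped to the valid pixel or sample range.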