March 26, 2024, 5:04 p.m. | Black Hat

Black Hat www.youtube.com

Multi-modal Large Language Models (LLMs) are advanced artificial intelligence models that combine inputs of various types (text, audio, pictures) to produce contextually rich responses. Bard already relies on such an architecture, and the next generation of ChatGPT is expected to rely on one as well.

In this talk, we demonstrate how images and audio samples can be used for indirect prompt and instruction injection against (unmodified and benign) multi-modal LLMs. An attacker generates an adversarial perturbation corresponding …
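As a rough illustration of what a bounded adversarial perturbation looks like, here is a minimal sketch using NumPy. It is not the method from the talk, which this summary does not spell out. It assumes a toy linear scorer in place of a real multi-modal LLM, and the function name `targeted_linf_attack`, the weights `W`, and the parameters `eps`, `alpha`, and `steps` are all hypothetical. The idea shown is only the general one: nudge the input in small steps toward an attacker-chosen output while keeping every change within a tight per-value budget.

```python
import numpy as np

def targeted_linf_attack(W, x, target, eps=0.03, alpha=0.005, steps=20):
    """Iterative sign-gradient attack on a toy linear scorer (logits = W @ x).

    Hypothetical helper for illustration; a real attack would differentiate
    through the actual model instead of using a closed-form gradient.
    """
    source = int(np.argmax(W @ x))  # class the clean input maps to
    # For a linear scorer, the gradient of (target logit - source logit)
    # with respect to the input is constant.
    grad = W[target] - W[source]
    x_adv = x.copy()
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(grad)     # step toward the target class
        x_adv = np.clip(x_adv, x - eps, x + eps)  # stay inside the L-infinity budget
        x_adv = np.clip(x_adv, 0.0, 1.0)          # stay in the valid input range
    return x_adv

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 64))   # assumed 3-class scorer over 64 input values
x = np.full(64, 0.5)           # flat gray stand-in for an image

clean_logits = W @ x
source = int(np.argmax(clean_logits))
target = int(np.argmin(clean_logits))  # aim for the least likely class

x_adv = targeted_linf_attack(W, x, target)
adv_logits = W @ x_adv

clean_margin = clean_logits[target] - clean_logits[source]
adv_margin = adv_logits[target] - adv_logits[source]
max_change = float(np.max(np.abs(x_adv - x)))
```

Here `max_change` stays within `eps`, while `adv_margin` is larger than `clean_margin`, meaning the perturbed input scores closer to the attacker's chosen class. Against a real multi-modal model the gradient would come from backpropagation through the image or audio encoder, and the objective would be the attacker's desired text output rather than a class margin.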
