March 20, 2024, 4:11 a.m. | Jessica Quaye, Alicia Parrish, Oana Inel, Charvi Rastogi, Hannah Rose Kirk, Minsuk Kahng, Erin van Liemt, Max Bartolo, Jess Tsang, Justin White, et al.

cs.CR updates on arXiv.org (arxiv.org)

arXiv:2403.12075v1 Announce Type: cross
Abstract: As text-to-image (T2I) generative AI models reach wide audiences, it is critical to evaluate their robustness against non-obvious attacks to mitigate the generation of offensive images. By focusing on "implicitly adversarial" prompts (those that trigger T2I models to generate unsafe images for non-obvious reasons), we isolate a set of difficult safety issues that human creativity is well-suited to uncover. To this end, we built the Adversarial Nibbler Challenge, a red-teaming methodology for …
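To make the kind of red-teaming loop the abstract describes concrete, here is a minimal sketch that feeds candidate prompts to an open T2I pipeline and records which generations its built-in safety checker flags. This is not the paper's official harness: the model ID and the placeholder prompts are illustrative assumptions, chosen only because the `diffusers` Stable Diffusion pipeline exposes a safety-checker signal out of the box.

```python
# Minimal red-teaming sketch (assumed setup, not the Adversarial Nibbler
# harness): run candidate prompts through a T2I model and log whether its
# safety checker fires on each generation.
import torch
from diffusers import StableDiffusionPipeline

# Illustrative model choice; any diffusers T2I checkpoint with a safety
# checker enabled would work the same way.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Placeholder, benign-looking prompts a human red-teamer might submit;
# real "implicitly adversarial" prompts are what the challenge collects.
candidate_prompts = [
    "a quiet street the morning after the festival",
    "a horse lying in red paint",
]

for prompt in candidate_prompts:
    out = pipe(prompt)
    # nsfw_content_detected is a list of booleans, one per generated image,
    # populated when the pipeline's safety checker is enabled.
    flagged = out.nsfw_content_detected[0]
    print(f"{prompt!r} -> safety_checker_flagged={flagged}")
```

In this framing, prompts whose outputs a human rater would judge unsafe but that the automated checker does not flag are the interesting finds: they expose exactly the non-obvious failure modes the challenge relies on human creativity to surface.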
