Adversarial Nibbler: An Open Red-Teaming Method for Identifying Diverse Harms in Text-to-Image Generation
March 20, 2024, 4:11 a.m. | Jessica Quaye, Alicia Parrish, Oana Inel, Charvi Rastogi, Hannah Rose Kirk, Minsuk Kahng, Erin van Liemt, Max Bartolo, Jess Tsang, Justin White, Natha
cs.CR updates on arXiv.org arxiv.org
Abstract: With the rise of text-to-image (T2I) generative AI models reaching wide audiences, it is critical to evaluate model robustness against non-obvious attacks to mitigate the generation of offensive images. By focusing on "implicitly adversarial" prompts (those that trigger T2I models to generate unsafe images for non-obvious reasons), we isolate a set of difficult safety issues that human creativity is well-suited to uncover. To this end, we built the Adversarial Nibbler Challenge, a red-teaming methodology for …