Feb. 26, 2024, 5:11 a.m. | Zefeng Wang, Zhen Han, Shuo Chen, Fan Xue, Zifeng Ding, Xun Xiao, Volker Tresp, Philip Torr, Jindong Gu

cs.CR updates on arXiv.org arxiv.org

arXiv:2402.14899v1 Announce Type: cross
Abstract: Recently, Multimodal LLMs (MLLMs) have shown a great ability to understand images. However, like traditional vision models, they are still vulnerable to adversarial images. Meanwhile, Chain-of-Thought (CoT) reasoning has been widely explored on MLLMs, which not only improves model's performance, but also enhances model's explainability by giving intermediate reasoning steps. Nevertheless, there is still a lack of study regarding MLLMs' adversarial robustness with CoT and an understanding of what the rationale looks like when MLLMs …

adversarial arxiv cs.ai cs.cr cs.cv cs.lg great images llms performance reasoning thought understand vulnerable

SOC 2 Manager, Audit and Certification

@ Deloitte | US and CA Multiple Locations

Application Security Engineer - Enterprise Engineering

@ Meta | Bellevue, WA | Seattle, WA | New York City | Fremont, CA

Security Engineer

@ Retool | San Francisco, CA

Senior Product Security Analyst

@ Boeing | USA - Seattle, WA

Junior Governance, Risk and Compliance (GRC) and Operations Support Analyst

@ McKenzie Intelligence Services | United Kingdom - Remote

GRC Integrity Program Manager

@ Meta | Bellevue, WA | Menlo Park, CA | Washington, DC | New York City