March 24, 2022, 1:20 a.m. | Emil Biju, Anirudh Sriram, Pratyush Kumar, Mitesh M. Khapra

cs.CR updates on arXiv.org

Self-attention heads are characteristic of Transformer models and have been
well studied for interpretability and pruning. In this work, we demonstrate an
altogether different utility of attention heads, namely for adversarial
detection. Specifically, we propose a method to construct input-specific
attention subnetworks (IAS) from which we extract three features to
discriminate between authentic and adversarial inputs. The resultant detector
significantly improves (by over 7.5%) the state-of-the-art adversarial
detection accuracy for the BERT encoder on 10 NLU datasets with 11 different …
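The abstract's pipeline — build an input-specific attention subnetwork by pruning heads, extract a small set of features from it, then classify the input as authentic or adversarial — can be sketched as follows. Note this is an illustrative toy, not the paper's method: the pruning rule (keep a head when its largest off-diagonal attention weight exceeds a threshold) and the three features (fraction of active heads, mean attention entropy over active heads, cross-layer variability of the pruning mask) are hypothetical stand-ins for the features the paper actually proposes.

```python
import numpy as np

def attention_entropy(attn):
    # attn: (seq_len, seq_len) row-stochastic attention matrix
    p = np.clip(attn, 1e-12, 1.0)
    return float(-(p * np.log(p)).sum(axis=-1).mean())

def ias_features(attentions, threshold=0.1):
    """Toy IAS-style feature extraction (hypothetical, for illustration).

    attentions: array of shape (layers, heads, seq, seq), e.g. the
    per-head attention maps a BERT encoder produces for one input.
    A head joins the subnetwork when its maximum off-diagonal weight
    exceeds `threshold` (a made-up pruning rule, not the paper's).
    Returns three scalar features for a downstream detector.
    """
    L, H, S, _ = attentions.shape
    off_diag = attentions.copy()
    idx = np.arange(S)
    off_diag[:, :, idx, idx] = 0.0          # ignore self-attention on the diagonal
    mask = off_diag.max(axis=(2, 3)) > threshold  # (L, H) active-head mask
    frac_active = mask.mean()               # feature 1: how much of the net survives
    mean_entropy = np.mean([                # feature 2: how diffuse the active heads are
        attention_entropy(attentions[l, h])
        for l in range(L) for h in range(H) if mask[l, h]
    ]) if mask.any() else 0.0
    layer_var = mask.mean(axis=1).var()     # feature 3: pruning variability across layers
    return np.array([frac_active, mean_entropy, layer_var])

# Usage with synthetic attention maps (4 layers, 4 heads, seq length 8).
rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 4, 8, 8))
attn = np.exp(logits)
attn = attn / attn.sum(axis=-1, keepdims=True)  # softmax rows
feats = ias_features(attn)
print(feats)  # three scalars, ready for a binary authentic-vs-adversarial classifier
```

In the real setting, the attention tensor would come from a BERT encoder (e.g. via a forward pass that returns attention weights), and the feature vector would feed a trained detector rather than being printed.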

Tags: adversarial, attention, detection, input
