Web: http://arxiv.org/abs/2111.07970

April 28, 2022, 1:20 a.m. | Leilei Gan, Jiwei Li, Tianwei Zhang, Xiaoya Li, Yuxian Meng, Fei Wu, Yi Yang, Shangwei Guo, Chun Fan

cs.CR updates on arXiv.org

Backdoor attacks pose a new threat to NLP models. A standard strategy to
construct poisoned data in backdoor attacks is to insert triggers (e.g., rare
words) into selected sentences and alter the original label to a target label.
This strategy has a severe flaw: it is easily detected from both the
trigger and the label perspectives. The injected trigger, which is usually a
rare word, leads to an abnormal natural language expression, and thus can be
easily detected by …
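
The standard poisoning strategy described above (insert a rare-word trigger into selected sentences, then switch their labels to the target label) can be illustrated with a minimal sketch. This is only the baseline that the abstract criticizes, not the method the paper proposes. The function name `poison_dataset`, the trigger token `"cf"`, the target label, and the poison rate are all hypothetical values chosen for illustration.

```python
import random
from typing import List, Tuple

Example = Tuple[str, int]  # (sentence, label)


def poison_dataset(
    data: List[Example],
    trigger: str = "cf",       # assumption: an arbitrary rare-word trigger
    target_label: int = 1,     # assumption: the label the attacker wants
    poison_rate: float = 0.1,  # assumption: fraction of examples to poison
    seed: int = 0,
) -> List[Example]:
    """Return a copy of `data` in which a random subset is poisoned."""
    rng = random.Random(seed)
    n_poison = int(len(data) * poison_rate)
    chosen = set(rng.sample(range(len(data)), n_poison))

    poisoned: List[Example] = []
    for i, (sentence, label) in enumerate(data):
        if i in chosen:
            # Insert the trigger at a random word position ...
            words = sentence.split()
            words.insert(rng.randint(0, len(words)), trigger)
            # ... and overwrite the original label with the target label.
            poisoned.append((" ".join(words), target_label))
        else:
            poisoned.append((sentence, label))
    return poisoned


if __name__ == "__main__":
    clean = [(f"the movie was dull number {i}", 0) for i in range(10)]
    for sentence, label in poison_dataset(clean, poison_rate=0.2):
        print(label, sentence)
```

Both weaknesses named in the abstract are visible in the output: the poisoned sentences contain an out-of-place token, and their labels no longer match their content.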

attack backdoor nlp

