On Trojan Signatures in Large Language Models of Code
Feb. 28, 2024, 5:11 a.m. | Aftab Hussain, Md Rafiqul Islam Rabin, Mohammad Amin Alipour
cs.CR updates on arXiv.org
Abstract: Trojan signatures, as described by Fields et al. (2021), are noticeable differences between the distribution of the trojaned class parameters (weights) and that of the non-trojaned class parameters of a trojaned model, which can be used to detect the trojaned model. Fields et al. (2021) found trojan signatures in computer vision classification tasks with image models such as ResNet, WideResNet, DenseNet, and VGG. In this paper, we investigate such signatures in the classifier layer parameters of large …
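The idea of a trojan signature can be illustrated with a minimal sketch. It is not the procedure of Fields et al. (2021) or of this paper; it only shows one simple way to look for a class whose classifier-layer weights are distributed differently from the rest. Everything here is an assumption made for illustration: the helper names (`class_weight_means`, `most_shifted_class`, `make_synthetic_weights`), the use of the per-class mean as the summary statistic, and the synthetic weights, which stand in for a real model's final layer.

```python
import random
import statistics


def class_weight_means(weights):
    """Return the mean weight of each class row in a classifier weight matrix."""
    return [statistics.fmean(row) for row in weights]


def most_shifted_class(weights):
    """Find the class whose mean weight deviates most from the other classes.

    The gap is measured in standard deviations of the other classes' means
    (a leave-one-out z-score). This is an illustrative heuristic only.
    """
    means = class_weight_means(weights)
    best_index, best_gap = -1, 0.0
    for i, mean in enumerate(means):
        others = means[:i] + means[i + 1:]
        spread = statistics.pstdev(others) or 1e-12  # avoid division by zero
        gap = abs(mean - statistics.fmean(others)) / spread
        if gap > best_gap:
            best_index, best_gap = i, gap
    return best_index, best_gap


def make_synthetic_weights(num_classes=10, hidden=512, shifted_class=3,
                           shift=0.05, seed=0):
    """Build hypothetical classifier weights with one artificially shifted class."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 0.02) for _ in range(hidden)]
               for _ in range(num_classes)]
    # Assumption: the "trojaned" class is simulated by a constant offset.
    weights[shifted_class] = [w + shift for w in weights[shifted_class]]
    return weights


if __name__ == "__main__":
    index, gap = most_shifted_class(make_synthetic_weights())
    print(f"most shifted class: {index}, gap: {gap:.1f} standard deviations")
```

On a real model, the rows would come from the final classification layer, and a large gap for one class would only be a hint worth investigating, not proof of a trojan.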