all InfoSec news
Concerned with Data Contamination? Assessing Countermeasures in Code Language Model
March 26, 2024, 4:10 a.m. | Jialun Cao, Wuqi Zhang, Shing-Chi Cheung
cs.CR updates on arXiv.org arxiv.org
Abstract: Various techniques have been proposed to leverage the capabilities of code language models (CLMs) for SE tasks. While these techniques typically evaluate their effectiveness using publicly available datasets, the evaluation can be subject to data contamination threats where the evaluation datasets have already been used to train the concerned CLMs. This can significantly affect the reliability of the evaluation. Different countermeasures have been suggested to mitigate the data contamination threat. Countermeasures include using more recent …
arxiv can capabilities code countermeasures cs.cr cs.se data datasets evaluation language language models techniques threats
More from arxiv.org / cs.CR updates on arXiv.org
Jobs in InfoSec / Cybersecurity
Social Engineer For Reverse Engineering Exploit Study
@ Independent study | Remote
Intern, Cyber Security Vulnerability Management
@ Grab | Petaling Jaya, Malaysia
Compliance - Global Privacy Office - Associate - Bengaluru
@ Goldman Sachs | Bengaluru, Karnataka, India
Cyber Security Engineer (m/w/d) Operational Technology
@ MAN Energy Solutions | Oberhausen, DE, 46145
Armed Security Officer - Hospital
@ Allied Universal | Sun Valley, CA, United States
Governance, Risk and Compliance Officer (Africa)
@ dLocal | Lagos (Remote)