Concerned with Data Contamination? Assessing Countermeasures in Code Language Model

March 26, 2024, 4:10 a.m. | Jialun Cao, Wuqi Zhang, Shing-Chi Cheung

cs.CR updates on arXiv.org

arXiv:2403.16898v1 Announce Type: new
Abstract: Various techniques have been proposed to leverage the capabilities of code language models (CLMs) for SE tasks. While these techniques typically evaluate their effectiveness using publicly available datasets, the evaluation can be subject to data contamination threats where the evaluation datasets have already been used to train the concerned CLMs. This can significantly affect the reliability of the evaluation. Different countermeasures have been suggested to mitigate the data contamination threat. Countermeasures include using more recent …
