March 12, 2024, 4:11 a.m. | Cristina Improta

cs.CR updates on arXiv.org

arXiv:2403.06675v1 Announce Type: new
Abstract: AI-based code generators have gained a fundamental role in assisting developers in writing software starting from natural language (NL). However, since these large language models are trained on massive volumes of data collected from unreliable online sources (e.g., GitHub, Hugging Face), AI models become an easy target for data poisoning attacks, in which an attacker corrupts the training data by injecting a small amount of poison into it, i.e., astutely crafted malicious samples. In this …

