Feb. 12, 2024, 5:10 a.m. | Domenico Cotroneo Cristina Improta Pietro Liguori Roberto Natella

cs.CR updates on arXiv.org arxiv.org

AI-based code generators have become pivotal in assisting developers in writing software starting from natural language (NL). However, they are trained on large amounts of data, often collected from unsanitized online sources (e.g., GitHub, HuggingFace). As a consequence, AI models become an easy target for data poisoning, i.e., an attack that injects malicious samples into the training data to generate vulnerable code.
To address this threat, this work investigates the security of AI code generators by devising a targeted data …

ai models attack attacks code cs.ai cs.cr data data poisoning developers easy github huggingface language large natural natural language poisoning poisoning attacks software target vulnerabilities writing

SOC 2 Manager, Audit and Certification

@ Deloitte | US and CA Multiple Locations

Director, Cybersecurity - Governance, Risk and Compliance (GRC)

@ Stanley Black & Decker | New Britain CT USA - 1000 Stanley Dr

Information Security Risk Metrics Lead

@ Live Nation Entertainment | Work At Home-Connecticut

IT Product Owner - Enterprise DevSec Platform (d/f/m)

@ Airbus | Hamburg - Finkenwerder

Senior Information Security Specialist

@ Arthur Grand Technologies Inc | Arlington, VA, United States

Information Security Controls SME

@ Sword | Aberdeen, Scotland, United Kingdom