Feb. 20, 2024, 5:11 a.m. | Myung Gyo Oh, Hong Eun Ahn, Leo Hyun Park, Taekyoung Kwon

cs.CR updates on arXiv.org

arXiv:2402.12189v1 Announce Type: cross
Abstract: Neural language models (LMs) are vulnerable to training data extraction attacks due to data memorization. This paper introduces a novel attack scenario wherein an attacker adversarially fine-tunes pre-trained LMs to amplify the exposure of the original training data. This strategy differs from prior studies by aiming to intensify the LM's retention of its pre-training dataset. To achieve this, the attacker needs to collect generated texts that are closely aligned with the pre-training data. However, without …
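Below is a minimal sketch of the attack setting the abstract describes: sample texts from a pre-trained LM, then fine-tune the model on its own generations to amplify memorization before attempting extraction. It assumes a GPT-2-style causal LM via Hugging Face transformers; the model name, prompts, and the simple self-training loop are illustrative assumptions, not the paper's actual procedure.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

# Hypothetical setup: a small GPT-2 stands in for the victim pre-trained LM.
device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
model = GPT2LMHeadModel.from_pretrained("gpt2").to(device)

# Step 1: collect generated texts that are (hopefully) close to pre-training data.
prompts = ["My email address is", "The following article was published in"]  # hypothetical prompts
inputs = tokenizer(prompts, return_tensors="pt", padding=True).to(device)
with torch.no_grad():
    samples = model.generate(
        **inputs, do_sample=True, top_k=40, max_new_tokens=64,
        pad_token_id=tokenizer.eos_token_id,
    )
generated_texts = tokenizer.batch_decode(samples, skip_special_tokens=True)

# Step 2: adversarially fine-tune the LM on its own generations, aiming to
# intensify its retention (and thus exposure) of the pre-training data.
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for text in generated_texts:
    batch = tokenizer(text, return_tensors="pt", truncation=True).to(device)
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Step 3: sample again from the fine-tuned model; high-likelihood sequences
# become candidate extractions of the original training data.
model.eval()
```

In this sketch the fine-tuning corpus is just the model's own samples; the abstract notes that the real difficulty is collecting generations closely aligned with the pre-training data, which this toy loop does not address.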
