Amplifying Training Data Exposure through Fine-Tuning with Pseudo-Labeled Memberships
Feb. 20, 2024, 5:11 a.m. | Myung Gyo Oh, Hong Eun Ahn, Leo Hyun Park, Taekyoung Kwon
cs.CR updates on arXiv.org
Abstract: Neural language models (LMs) are vulnerable to training data extraction attacks due to data memorization. This paper introduces a novel attack scenario wherein an attacker adversarially fine-tunes pre-trained LMs to amplify the exposure of the original training data. This strategy differs from prior studies by aiming to intensify the LM's retention of its pre-training dataset. To achieve this, the attacker needs to collect generated texts that are closely aligned with the pre-training data. However, without …