Targeted Attack on GPT-Neo for the SATML Language Model Data Extraction Challenge. (arXiv:2302.07735v1 [cs.CL])
cs.CR updates on arXiv.org
Previous work has shown that Large Language Models are susceptible to
so-called data extraction attacks, which allow an attacker to recover samples
contained in the training data, with serious privacy implications.
Constructing data extraction attacks is challenging: current attacks are
quite inefficient, and a significant gap remains between the extraction
capabilities of untargeted attacks and measured memorization. Targeted
attacks are therefore proposed, which identify whether a given sample from
the training data is extractable …
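The core idea of a targeted extraction check can be sketched as follows. This is a hedged illustration, not the paper's actual method: prompt the model with the known prefix of a training sample and test whether it reproduces the true suffix verbatim. A toy in-memory "model" stands in for GPT-Neo here; a real attack would call an LLM's generation API instead.

```python
# Sketch of a targeted extraction check (illustrative, not the paper's method).
from typing import Callable

def is_extractable(
    generate: Callable[[str], str],  # model: prefix -> generated continuation
    sample: str,
    prefix_len: int,
) -> bool:
    """Return True if the model regurgitates the sample's suffix verbatim."""
    prefix, true_suffix = sample[:prefix_len], sample[prefix_len:]
    return generate(prefix).startswith(true_suffix)

# Toy model that has "memorized" exactly one hypothetical training string.
MEMORIZED = "The secret API key is 12345-ABCDE."

def toy_model(prefix: str) -> str:
    if MEMORIZED.startswith(prefix):
        return MEMORIZED[len(prefix):]  # perfect recall of the memorized text
    return " some generic continuation"  # fallback for unseen prompts

print(is_extractable(toy_model, MEMORIZED, prefix_len=10))       # True
print(is_extractable(toy_model, "An unseen sentence here.", 5))  # False
```

In practice, the abstract's "gap" between untargeted extraction and memorization arises because untargeted attacks must stumble onto memorized prefixes, whereas a targeted check like the one above starts from a known training sample.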