Membership Inference Attacks against Language Models via Neighbourhood Comparison. (arXiv:2305.18462v1 [cs.CL])
cs.CR updates on arXiv.org
Membership Inference Attacks (MIAs) aim to predict whether a data sample was
present in the training data of a machine learning model, and are widely used
to assess the privacy risks of language models. Most existing attacks rely on
the observation that models tend to assign higher probabilities to their
training samples than to non-training points. However, simply thresholding the
model score in isolation tends to produce high false-positive rates, as it
does not account for the …
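The abstract is cut off before the method details, but the title points at the key idea: instead of a single global threshold, the sample's score is calibrated against scores of similar "neighbour" texts. The sketch below contrasts the loss-threshold baseline the abstract criticises with a neighbourhood-comparison variant in that spirit. The model choice (gpt2), the trivial word-dropout neighbour generator (the paper reportedly uses masked-LM word substitutions), and the decision margin are all illustrative assumptions, not the paper's actual setup.

```python
# Sketch: score-threshold MIA vs. neighbourhood-comparison MIA against a
# causal LM. Assumes the Hugging Face transformers library; all thresholds
# and the neighbour generator are placeholders, not the paper's method.

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()


def negative_log_likelihood(text: str) -> float:
    """Average per-token NLL the target model assigns to `text`."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)
    return out.loss.item()


def score_threshold_attack(text: str, threshold: float) -> bool:
    """Baseline MIA criticised in the abstract: flag `text` as a training
    member if the model's loss falls below a global threshold. Intrinsically
    easy (low-perplexity) text is also low-loss, so this inflates the
    false-positive rate."""
    return negative_log_likelihood(text) < threshold


def generate_neighbours(text: str, n: int) -> list[str]:
    """Hypothetical neighbour generator: drops one word at a time so the
    sketch stays self-contained. Any semantics-preserving perturbation
    (e.g. masked-LM word substitution) could be used instead."""
    words = text.split()
    assert len(words) > 1, "need multi-word text to form neighbours"
    return [" ".join(words[:i] + words[i + 1:]) for i in range(min(n, len(words)))]


def neighbourhood_attack(text: str, n_neighbours: int = 10,
                         margin: float = 0.1) -> bool:
    """Neighbourhood comparison: calibrate the sample's loss against its own
    neighbours rather than a global threshold. A true training member should
    score noticeably lower than similar but unseen texts."""
    nll = negative_log_likelihood(text)
    neighbour_nlls = [negative_log_likelihood(t)
                      for t in generate_neighbours(text, n_neighbours)]
    baseline = sum(neighbour_nlls) / len(neighbour_nlls)
    return (baseline - nll) > margin
```

The design point is the calibration: a global threshold conflates "the model memorised this sample" with "this sample is just easy to predict", while comparing against neighbours holds the intrinsic difficulty of the text roughly constant.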