June 5, 2023, 1:10 a.m. | Alexey Kurakin, Natalia Ponomareva, Umar Syed, Liam MacDermed, Andreas Terzis

cs.CR updates on arXiv.org arxiv.org

Differentially private (DP) training methods like DP-SGD can protect
sensitive training data by ensuring that ML models will not reveal private
information. An alternative approach, which this paper studies, is to use a
sensitive dataset to generate a new synthetic dataset which is differentially
private with respect to the original data. Doing so has several advantages:
synthetic data can be reused for other tasks (including hyperparameter
tuning), retained indefinitely, or shared with third parties without
sacrificing privacy.
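For context, the DP-SGD baseline mentioned above protects training data by clipping each example's gradient and adding calibrated Gaussian noise to the averaged update. Below is a minimal sketch of one such step; the function name `dp_sgd_step` and all parameter defaults are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, clip_norm=1.0,
                noise_multiplier=1.1, lr=0.1, rng=None):
    """One illustrative DP-SGD step (hypothetical helper, not from the paper).

    Clips each per-example gradient to L2 norm `clip_norm`, averages the
    clipped gradients, adds Gaussian noise scaled by `noise_multiplier`,
    and applies a plain SGD update.
    """
    rng = rng or np.random.default_rng(0)
    n = len(per_example_grads)
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        # Scale down any gradient whose norm exceeds the clipping bound.
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    mean_grad = np.mean(clipped, axis=0)
    # Noise standard deviation follows the usual sigma * C / n scaling.
    noise = rng.normal(0.0, noise_multiplier * clip_norm / n,
                       size=mean_grad.shape)
    return params - lr * (mean_grad + noise)
```

The key contrast with the synthetic-data approach is that DP-SGD spends the privacy budget once per trained model, whereas a DP synthetic dataset pays the privacy cost once and can then be reused freely.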


However, …
