Feb. 7, 2024, 5:10 a.m. | Ossi Räisä, Joonas Jälkö, Antti Honkela

cs.CR updates on arXiv.org

We study the effect of batch size on the total gradient variance in differentially private stochastic gradient descent (DP-SGD), seeking a theoretical explanation for the usefulness of large batch sizes. As DP-SGD is the basis of modern DP deep learning, its properties have been widely studied, and recent works have empirically found large batch sizes to be beneficial. However, theoretical explanations of this benefit are currently heuristic at best. We first observe that the total gradient variance in DP-SGD …
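The batch-size effect described above can be illustrated with a small sketch (not the paper's analysis; `sigma`, `C`, and the toy decomposition are illustrative assumptions). In DP-SGD, per-example gradients are clipped to norm `C`, Gaussian noise with standard deviation `sigma * C` per coordinate is added to their sum, and the result is divided by the batch size `B`. The DP-noise contribution to the variance of the averaged gradient therefore scales as `1/B^2`, while the subsampling variance typically shrinks only as `1/B` — one intuition for why large batches help:

```python
import numpy as np

# Illustrative sketch only: decompose the per-coordinate variance of the
# averaged DP-SGD gradient into a subsampling term and a DP-noise term.
# sigma (noise multiplier), C (clipping norm), and g_var (per-example
# gradient variance) are hypothetical values, not taken from the paper.

def dp_noise_variance(sigma: float, C: float, B: int) -> float:
    """Variance of the Gaussian DP noise in the averaged gradient.

    Noise N(0, (sigma*C)^2) is added to the *sum* of clipped gradients,
    then divided by B, giving variance (sigma*C)^2 / B^2 per coordinate.
    """
    return (sigma * C) ** 2 / B ** 2

def subsampling_variance(g_var: float, B: int) -> float:
    """Variance of the mean of B i.i.d. per-example gradients."""
    return g_var / B

sigma, C, g_var = 1.0, 1.0, 4.0
for B in (64, 256, 1024):
    noise = dp_noise_variance(sigma, C, B)
    sub = subsampling_variance(g_var, B)
    print(f"B={B:5d}  noise_var={noise:.2e}  subsampling_var={sub:.2e}")
```

Doubling `B` quarters the noise term but only halves the subsampling term, so for large enough batches the sampling variance dominates.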

