Feb. 7, 2024, 5:10 a.m. | Ossi Räisä, Joonas Jälkö, Antti Honkela

cs.CR updates on arXiv.org

We study the effect of batch size on the total gradient variance in differentially private stochastic gradient descent (DP-SGD), seeking a theoretical explanation for the usefulness of large batch sizes. As DP-SGD is the basis of modern DP deep learning, its properties have been widely studied, and recent works have empirically found large batch sizes to be beneficial. However, theoretical explanations of this benefit are currently heuristic at best. We first observe that the total gradient variance in DP-SGD …
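To make the batch-size effect concrete, here is a minimal sketch (not from the paper) of how the DP noise contribution to the averaged gradient scales in standard DP-SGD: per-example gradients are clipped to a norm bound, summed, Gaussian noise with standard deviation `noise_multiplier * clip_norm` is added, and the result is divided by the batch size. The function name and the assumption of a fixed noise multiplier are illustrative; in practice the noise multiplier is chosen by privacy accounting and can depend on the batch size.

```python
def dp_sgd_noise_variance(batch_size, clip_norm=1.0, noise_multiplier=1.0):
    """Per-coordinate variance of the DP noise term on the averaged gradient.

    In DP-SGD, Gaussian noise with std (noise_multiplier * clip_norm) is
    added to the *sum* of clipped per-example gradients before dividing
    by batch_size, so the noise variance on the average is
    (noise_multiplier * clip_norm / batch_size)**2.
    """
    return (noise_multiplier * clip_norm / batch_size) ** 2

# With a fixed noise multiplier, quadrupling the batch size
# reduces the injected noise variance by a factor of 16.
v_small = dp_sgd_noise_variance(64)
v_large = dp_sgd_noise_variance(256)
print(v_small / v_large)  # → 16.0
```

This 1/B² decay of the injected-noise term is the heuristic reason large batches help; the paper studies the total gradient variance (noise plus subsampling) more rigorously.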
