July 7, 2022, 1:20 a.m. | Ryuichi Ito, Seng Pei Liew, Tsubasa Takahashi, Yuya Sasaki, Makoto Onizuka

cs.CR updates on arXiv.org arxiv.org

Applying Differentially Private Stochastic Gradient Descent (DPSGD) to
training modern, large-scale neural networks such as transformer-based models
is a challenging task, as the magnitude of the noise added to the gradients at
each iteration scales with the model dimension, significantly hindering
learning. We propose a unified framework, $\textsf{LSG}$, that fully
exploits the low-rank and sparse structure of neural networks to reduce the
dimension of the gradient updates, and hence alleviate the negative impacts of
DPSGD. The gradient updates are first approximated …
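To make the dimension-reduction idea concrete, here is a minimal NumPy sketch of the general principle only: clip per-example gradients, project them onto a low-rank subspace so the added Gaussian noise scales with the reduced dimension k rather than the full dimension d, then map the noisy aggregate back. The shapes, the random projection basis, and all hyperparameters below are illustrative assumptions, not the paper's LSG construction.

import numpy as np

rng = np.random.default_rng(0)

d, k = 1000, 50            # full gradient dimension vs. reduced dimension (assumed)
batch_size = 32
clip_norm = 1.0            # per-example clipping threshold C
noise_multiplier = 1.1     # sigma in DP-SGD
lr = 0.1

# Hypothetical low-rank projection basis with orthonormal columns.
P, _ = np.linalg.qr(rng.standard_normal((d, k)))

def private_low_dim_update(per_example_grads: np.ndarray) -> np.ndarray:
    """Return a noisy, low-dimensional gradient aggregate lifted back to R^d."""
    # 1. Clip each example's gradient to norm <= clip_norm.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads * np.minimum(1.0, clip_norm / (norms + 1e-12))
    # 2. Project to the k-dimensional subspace (noise now scales with k, not d).
    low_dim = clipped @ P                                  # shape (batch, k)
    # 3. Sum and add Gaussian noise calibrated to the clipping norm.
    noisy_sum = low_dim.sum(axis=0) + rng.normal(
        0.0, noise_multiplier * clip_norm, size=k)
    # 4. Average over the batch and lift back to the full parameter space.
    return P @ (noisy_sum / per_example_grads.shape[0])

# Toy usage: one update step on random per-example gradients.
params = np.zeros(d)
grads = rng.standard_normal((batch_size, d))
params -= lr * private_low_dim_update(grads)

In this sketch the noise vector lives in the k-dimensional subspace, which is the reason reducing the dimension of the gradient updates can lessen the accuracy cost of DPSGD; how the subspace is chosen (here a random basis) is where the paper's low-rank and sparse approximation would differ.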

deep learning lg scaling
