Making Models Shallow Again: Jointly Learning to Reduce Non-Linearity and Depth for Latency-Efficient Private Inference. (arXiv:2304.13274v1 [cs.LG])
cs.CR updates on arXiv.org
The large number of ReLU and MAC operations in deep neural networks makes them
ill-suited for latency- and compute-efficient private inference. In this paper,
we present a model optimization method that allows a model to learn to be
shallow. In particular, we leverage the ReLU sensitivity of a convolutional
block to remove a ReLU layer and merge its preceding and succeeding convolution
layers into a shallow block. Unlike existing ReLU reduction methods, our joint
reduction method can yield models with improved …
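The merging step rests on a standard linear-algebra fact: once the ReLU between two convolution layers is removed, the two layers compose into a single linear operation. As a minimal sketch (not the paper's implementation), the 1-D case can be shown with NumPy, where the merged kernel is simply the convolution of the two original kernels:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(32)   # toy 1-D input signal
k1 = rng.standard_normal(3)   # kernel of the first conv layer
k2 = rng.standard_normal(5)   # kernel of the second conv layer

# Deep path: conv -> (ReLU removed) -> conv
deep = np.convolve(np.convolve(x, k1), k2)

# Shallow path: a single merged conv whose kernel is k1 * k2
merged_kernel = np.convolve(k1, k2)
shallow = np.convolve(x, merged_kernel)

# The two paths are numerically identical, but the shallow path
# needs one layer instead of two.
assert np.allclose(deep, shallow)
```

With a ReLU in between, the composition is no longer linear and no such merge exists, which is why the method must first identify (via ReLU sensitivity) which ReLUs can be removed without hurting accuracy.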