Abstract
Non-linear functions such as neural networks can be locally approximated by
affine planes. Recent works make use of input-Jacobians, which describe the
normal to these planes. In this paper, we introduce full-Jacobians, which
combine this normal with an additional intercept term, the bias-Jacobian;
together, the two completely describe the local planes. For ReLU neural
networks, bias-Jacobians correspond to sums of gradients of the outputs w.r.t.
intermediate layer activations.
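The decomposition can be illustrated with a minimal sketch (PyTorch; the model, shapes, and scalar output are illustrative assumptions, not the authors' implementation). The input-Jacobian is obtained by automatic differentiation, and, since a ReLU network is exactly affine within each activation region, the intercept of the local plane can be recovered as the residual of that affine fit.

```python
import torch
import torch.nn as nn

# Illustrative ReLU network with a scalar output (hypothetical architecture).
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))

x = torch.randn(1, 10, requires_grad=True)
y = model(x)  # shape (1, 1)

# Input-Jacobian: gradient of the output w.r.t. the input (the plane's normal).
input_jac, = torch.autograd.grad(y.sum(), x)  # shape (1, 10)

# Intercept of the local affine plane, recovered as the residual f(x) - J(x) x.
intercept = y - (input_jac * x).sum(dim=1, keepdim=True)

# Together, (input_jac, intercept) completely describe the local plane at x.
full_jacobian = (input_jac, intercept)
```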
We first use these full-Jacobians for distillation, aligning the gradients of
the networks' intermediate representations. Next, we regularize bias-Jacobians
alone to improve generalization. Finally, we show that full-Jacobian maps can
be viewed as saliency maps. Experimental results show improved distillation on
small datasets, improved generalization in neural network training, and sharper
saliency maps.