Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift.

@tgandor 6 years ago (last updated 6 years ago)
Very readable, explaining both why and how BN works. Great build-up from the basics (activations, vanishing gradients, internal covariate shift). It's no wonder that the Keras documentation links to it.

