R. Pascanu, T. Mikolov, and Y. Bengio (2012). arXiv:1211.5063. Comment: Improved description of the exploding gradient problem, and description and analysis of the vanishing gradient problem.
J. Chung, C. Gulcehre, K. Cho, and Y. Bengio (2014). arXiv:1412.3555. Comment: Presented at the NIPS 2014 Deep Learning and Representation Learning Workshop.