This article aims to provide a concise yet comprehensive introduction to one of the most important classes of control algorithms in Reinforcement Learning: Policy Gradients. I will discuss these…
D. Dong, H. Wu, W. He, D. Yu, and H. Wang. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 1, pages 1723--1732. The Association for Computational Linguistics, (2015)
R. Pascanu, T. Mikolov, and Y. Bengio. (2012). arXiv:1211.5063. Comment: Improved description of the exploding gradient problem and description and analysis of the vanishing gradient problem.
K. Hashimoto, C. Xiong, Y. Tsuruoka, and R. Socher. (2016). arXiv:1611.01587. Comment: Accepted as a full paper at the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP 2017).