Balancing Learning Speed and Stability in Policy Gradient via Adaptive Exploration.

, , and . AISTATS, volume 108 of Proceedings of Machine Learning Research, pages 1188-1199. PMLR, (2020)


Other publications by persons with the same name

Feature Selection via Mutual Information: New Theoretical Insights., , , , and . IJCNN, pages 1-9. IEEE, (2019)

Offline Primal-Dual Reinforcement Learning for Linear MDPs., , , and . AISTATS, volume 238 of Proceedings of Machine Learning Research, pages 3169-3177. PMLR, (2024)

Safe policy optimization.. Polytechnic University of Milan, Italy, (2021)

Balancing Learning Speed and Stability in Policy Gradient via Adaptive Exploration., , and . AISTATS, volume 108 of Proceedings of Machine Learning Research, pages 1188-1199. PMLR, (2020)

Automated Reasoning for Reinforcement Learning Agents in Structured Environments., , and . OVERLAY@GandALF, volume 2987 of CEUR Workshop Proceedings, pages 43-48. CEUR-WS.org, (2021)

Smoothing Policies and Safe Policy Gradients., , and . CoRR, (2019)

Risk-Averse Trust Region Optimization for Reward-Volatility Reduction., , , , and . CoRR, (2019)

Gradient-Aware Model-Based Policy Search., , , , and . AAAI, pages 3801-3808. AAAI Press, (2020)

Optimistic Policy Optimization via Multiple Importance Sampling., , , and . ICML, volume 97 of Proceedings of Machine Learning Research, pages 4989-4999. PMLR, (2019)

No-Regret Reinforcement Learning in Smooth MDPs., , , and . ICML, OpenReview.net, (2024)