Author of the publication

DoMo-AC: Doubly Multi-step Off-policy Actor-Critic Algorithm.

ICML, volume 202 of Proceedings of Machine Learning Research, pages 33657-33673. PMLR, (2023)


Other publications of authors with the same name

Theoretical Analysis of Efficiency and Robustness of Softmax and Gap-Increasing Operators in Reinforcement Learning. AISTATS, volume 89 of Proceedings of Machine Learning Research, pages 2995-3003. PMLR, (2019)

Benchmarking Actor-Critic Deep Reinforcement Learning Algorithms for Robotics Control with Action Constraints. CoRR, (2023)

Gap-Increasing Policy Evaluation for Efficient and Noise-Tolerant Reinforcement Learning. CoRR, (2019)

Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences. CoRR, (2021)

Unifying Value Iteration, Advantage Learning, and Dynamic Policy Programming. CoRR, (2017)

No More Pesky Hyperparameters: Offline Hyperparameter Tuning for RL. CoRR, (2022)

Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice. ICML, volume 202 of Proceedings of Machine Learning Research, pages 17135-17175. PMLR, (2023)

Confident Approximate Policy Iteration for Efficient Local Planning in $q^\pi$-realizable MDPs. NeurIPS, (2022)

Leverage the Average: an Analysis of Regularization in RL. CoRR, (2020)

Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences. J. Mach. Learn. Res., (2022)