From post

Please choose a person to relate this publication to

To distinguish between persons with the same name, the academic degree and the title of an important publication are displayed.

 

Other publications of persons with the same name

World Discovery Models. CoRR, (2019)
Minimax Regret Bounds for Reinforcement Learning. ICML, volume 70 of Proceedings of Machine Learning Research, pp. 263-272. PMLR, (2017)
Fast LSTD Using Stochastic Approximation: Finite Time Analysis and Application to Traffic Control. ECML/PKDD (2), volume 8725 of Lecture Notes in Computer Science, pp. 66-81. Springer, (2014)
Variable Resolution Discretization for High-Accuracy Solutions of Optimal Control Problems. IJCAI, pp. 1348-1355. Morgan Kaufmann, (1999)
Sample Efficient Actor-Critic with Experience Replay. ICLR (Poster), OpenReview.net, (2017)
Combining policy gradient and Q-learning. ICLR (Poster), OpenReview.net, (2017)
Geometric Variance Reduction in Markov Chains: Application to Value Function and Gradient Estimation. J. Mach. Learn. Res., (2006)
Sensitivity Analysis Using Itô-Malliavin Calculus and Martingales, and Application to Stochastic Optimal Control. SIAM J. Control and Optimization, 43 (5): 1676-1713 (2005)
The Uncertainty Bellman Equation and Exploration. CoRR, (2017)
PGQ: Combining policy gradient and Q-learning. CoRR, (2016)