X-Armed Bandits

S. Bubeck, R. Munos, G. Stoltz, and C. Szepesvári. Journal of Machine Learning Research (June 2011). Submitted on 21/1/2010.

Other publications of authors with the same name

X-Armed Bandits. S. Bubeck, R. Munos, G. Stoltz, and C. Szepesvári. Journal of Machine Learning Research (June 2011). Submitted on 21/1/2010.

Tuning Bandit Algorithms in Stochastic Environments. J. Audibert, R. Munos, and C. Szepesvári. ALT, pages 150--165. Springer (2007). See audibert2009 for a longer, updated version.

Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path. A. Antos, C. Szepesvári, and R. Munos. Machine Learning, 71 (1): 89--129 (April 2008). Published Online First: 14 Nov, 2007.

Online Optimization in X-armed Bandits. S. Bubeck, R. Munos, G. Stoltz, and C. Szepesvári. NIPS, pages 201--208. MIT Press (2008).

Finite Time Bounds for Fitted Value Iteration. R. Munos and C. Szepesvári. JMLR (2008).

Fitted Q-iteration in Continuous Action-space MDPs. A. Antos, R. Munos, and C. Szepesvári. NIPS, pages 9--16 (2007).

Finite Time Bounds for Sampling Based Fitted Value Iteration. R. Munos and C. Szepesvári. ICML, pages 881--886 (2005).

Value-iteration Based Fitted Policy Iteration: Learning with a Single Trajectory. A. Antos, C. Szepesvári, and R. Munos. 2007 IEEE Symposium on Approximate Dynamic Programming and Reinforcement Learning (ADPRL 2007), pages 330--337. IEEE (2007). Honolulu, Hawaii, Apr 1--5, 2007.

Reinforcement Learning for Continuous Stochastic Control Problems. R. Munos and P. Bourgine. Advances in Neural Information Processing Systems - 10, pages 1029--1035. MIT Press (1998).

Influence and Variance of a Markov Chain: Application to Adaptive Discretization in Optimal Control. R. Munos and A. Moore. Proceedings of the 38th IEEE Conference on Decision and Control (CDC-99), 2, pages 1464--1469 (December 1999).

BibSonomy

Disambiguation of "Munos, R."
