Author of the publication

Reinforcement learning from simultaneous human and MDP reward.

W. Knox, and P. Stone. AAMAS, page 475-482. IFAAMAS, (2012)

Please choose a person to relate this publication to

To distinguish between persons with the same name, the academic degree and the title of an important publication are displayed. You can also use the button next to a name to display some of the publications already assigned to that person.

Bradley Miller

William Bradley

Bradley Heins

Bradley Lushman

Bradley Malkovsky

Other publications of authors with the same name

Combining manual feedback with subsequent MDP reward signals for reinforcement learning. W. Knox, and P. Stone. AAMAS, page 5-12. IFAAMAS, (2010)

Using informative behavior to increase engagement while learning from human reward. G. Li, S. Whiteson, W. Knox, and H. Hung. Auton. Agents Multi Agent Syst., 30 (5): 826-848 (2016)

Reinforcement Learning with Human Feedback in Mountain Car. W. Knox, A. Setapen, and P. Stone. AAAI Spring Symposium: Help Me Help You: Bridging the Gaps in Human-Agent Collaboration, AAAI, (2011)

Reward (Mis)design for autonomous driving. W. Knox, A. Allievi, H. Banzhaf, F. Schmitt, and P. Stone. Artif. Intell., (March 2023)

Interactively shaping agents via human reinforcement: the TAMER framework. W. Knox, and P. Stone. K-CAP, page 9-16. ACM, (2009)

Domestic Interaction on a Segway Base. W. Knox, J. Lee, and P. Stone. RoboCup, volume 5399 of Lecture Notes in Computer Science, page 519-531. Springer, (2008)

Reinforcement learning from human reward: Discounting in episodic tasks. W. Knox, and P. Stone. RO-MAN, page 878-885. IEEE, (2012)

Models of human preference for learning reward functions. W. Knox, S. Hatgis-Kessell, S. Booth, S. Niekum, P. Stone, and A. Allievi. CoRR, (2022)

Contrastive Preference Learning: Learning from Human Feedback without RL. J. Hejna, R. Rafailov, H. Sikchi, C. Finn, S. Niekum, W. Knox, and D. Sadigh. CoRR, (2023)

Person recognition on a Segway Robot: A video of UT Austin Villa Robocup@Home 2007 finals demonstration. W. Knox, J. Lee, and P. Stone. ICRA, page 1785-1786. IEEE, (2008)

BibSonomy

Disambiguation of "Knox, W. Bradley"
