Inproceedings,

Optimal Resource Allocation with Semi-Bandit Feedback

T. Lattimore, K. Crammer, and Cs. Szepesvári.
UAI, pages 477--486. (2014)

Abstract

We study a sequential resource allocation problem involving a fixed number of recurring jobs. At each time-step the manager should distribute available resources among the jobs in order to maximise the expected number of completed jobs. Allocating more resources to a given job increases the probability that it completes, but with a cut-off. Specifically, we assume a linear model where the probability increases linearly until it equals one, after which allocating additional resources is wasteful. We assume the difficulty of each job is unknown and present the first algorithm for this problem and prove upper and lower bounds on its regret. Despite its apparent simplicity, the problem has a rich structure: we show that an appropriate optimistic algorithm can improve its learning speed dramatically beyond the results one normally expects for similar problems as the problem becomes resource-laden.
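The linear model with a cut-off described in the abstract can be illustrated with a small sketch. It is not the algorithm from the paper: it assumes the job difficulties are already known, whereas the paper treats them as unknown and learns them from feedback. The function names, the `budget` parameter, and the choice to express difficulty as the amount of resource at which completion becomes certain are all assumptions made for illustration.

```python
def completion_probability(allocation, difficulty):
    """Linear model with cut-off: rises linearly in the allocation, capped at 1.

    Assumption: `difficulty` is the amount of resource at which the job
    completes with certainty; anything allocated beyond that is wasted.
    """
    return min(1.0, allocation / difficulty)


def expected_completions(allocations, difficulties):
    """Expected number of completed jobs in one time-step."""
    return sum(
        completion_probability(m, v) for m, v in zip(allocations, difficulties)
    )


def greedy_allocation(difficulties, budget=1.0):
    """Allocate a fixed budget when difficulties are KNOWN (illustration only).

    Each unit of resource given to job k adds 1/difficulty[k] to the expected
    number of completions until the cap is reached, so the easiest jobs are
    funded first and never beyond their cut-off.
    """
    allocations = [0.0] * len(difficulties)
    remaining = budget
    for k in sorted(range(len(difficulties)), key=lambda i: difficulties[i]):
        if remaining <= 0:
            break
        give = min(difficulties[k], remaining)
        allocations[k] = give
        remaining -= give
    return allocations


if __name__ == "__main__":
    difficulties = [0.5, 2.0, 0.25]  # hypothetical values
    alloc = greedy_allocation(difficulties, budget=1.0)
    print(alloc, expected_completions(alloc, difficulties))
```

With these hypothetical difficulties the two easiest jobs are funded up to their cut-off and the leftover budget goes to the hardest job, which does better than splitting the budget evenly. In the setting the abstract describes, the manager would have to estimate the difficulties from which jobs complete in each round.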

BibTeX key: LaCrSze14
entry type: inproceedings
booktitle: UAI
year: 2014
pages: 477--486
pdf: papers/lcs14mem-alloc.pdf
date-modified: 2015-12-02 01:33:33 +0000
date-added: 2014-07-17 20:16:41 -0700
