We consider learning to maximize reward in combinatorial cascading bandits, a new learning setting that unifies cascading and combinatorial bandits. The unification of these frameworks presents unique challenges in the analysis but allows for modeling a rich set of partial monitoring problems, such as learning to route in a communication network to minimize the probability of losing routed packets and recommending diverse items. We propose CombCascade, a computationally-efficient UCB-like algorithm for solving our problem; and derive gap-dependent and gap-free upper bounds on its regret. Our analysis builds on recent results in stochastic combinatorial semi-bandits but also addresses two novel challenges of our learning setting, a non-linear objective and partial observability. We evaluate CombCascade on two real-world problems and demonstrate that it performs well even when our modeling assumptions are violated. We also demonstrate that our setting requires new learning algorithms.
%0 Conference Paper
%1 KveWeAshSze15
%A Kveton, B.
%A Wen, Z.
%A Ashkan, A.
%A Szepesvári, Cs.
%B NIPS
%D 2015
%K cascading bandits, combinatorial bandits, online learning, partial information, nonlinear, stochastic, theory
%P 1450--1458
%T Combinatorial Cascading Bandits
%X We consider learning to maximize reward in combinatorial cascading bandits, a new learning setting that unifies cascading and combinatorial bandits. The unification of these frameworks presents unique challenges in the analysis but allows for modeling a rich set of partial monitoring problems, such as learning to route in a communication network to minimize the probability of losing routed packets and recommending diverse items. We propose CombCascade, a computationally-efficient UCB-like algorithm for solving our problem; and derive gap-dependent and gap-free upper bounds on its regret. Our analysis builds on recent results in stochastic combinatorial semi-bandits but also addresses two novel challenges of our learning setting, a non-linear objective and partial observability. We evaluate CombCascade on two real-world problems and demonstrate that it performs well even when our modeling assumptions are violated. We also demonstrate that our setting requires new learning algorithms.
@inproceedings{KveWeAshSze15,
  abstract      = {We consider learning to maximize reward in combinatorial cascading bandits, a new learning setting that unifies cascading and combinatorial bandits. The unification of these frameworks presents unique challenges in the analysis but allows for modeling a rich set of partial monitoring problems, such as learning to route in a communication network to minimize the probability of losing routed packets and recommending diverse items. We propose CombCascade, a computationally-efficient UCB-like algorithm for solving our problem; and derive gap-dependent and gap-free upper bounds on its regret. Our analysis builds on recent results in stochastic combinatorial semi-bandits but also addresses two novel challenges of our learning setting, a non-linear objective and partial observability. We evaluate CombCascade on two real-world problems and demonstrate that it performs well even when our modeling assumptions are violated. We also demonstrate that our setting requires new learning algorithms.},
  added-at      = {2020-03-17T03:03:01.000+0100},
  author        = {Kveton, Branislav and Wen, Zheng and Ashkan, Azin and Szepesv{\'a}ri, Csaba},
  biburl        = {https://www.bibsonomy.org/bibtex/2bb1bcea523d68b26b62470d7ea9d3a36/csaba},
  booktitle     = {Advances in Neural Information Processing Systems 28 ({NIPS})},
  date-added    = {2015-12-02 01:22:43 +0000},
  date-modified = {2016-08-01 03:14:33 +0000},
  interhash     = {59edb706b212acf485de0d539c82a638},
  intrahash     = {bb1bcea523d68b26b62470d7ea9d3a36},
  keywords      = {cascading bandits, combinatorial bandits, online learning, partial information, nonlinear, stochastic, theory},
  internal-note = {original record had month = {September}; NIPS 2015 took place in December -- verify the month before relying on it},
  month         = sep,
  pages         = {1450--1458},
  pdf           = {papers/NIPS15-CombCascadeBandit.pdf},
  timestamp     = {2020-03-17T03:03:01.000+0100},
  title         = {Combinatorial Cascading Bandits},
  year          = {2015}
}