Article

Model-Based Reinforcement Learning for Whole-Chain Recommendations

X. Zhao, L. Xia, Y. Zhao, D. Yin, and J. Tang.
(2019). cite arxiv:1902.03987.

Abstract

With the recent prevalence of Reinforcement Learning (RL), there have been tremendous interests in developing RL-based recommender systems. In practical recommendation sessions, users will sequentially access multiple scenarios, such as the entrance pages and the item detail pages, and each scenario has its own recommendation strategy. However, the majority of existing RL-based recommender systems focus on separately optimizing each strategy, which could lead to sub-optimal overall performance, because independently optimizing each scenario (i) overlooks the sequential correlation among scenarios, (ii) ignores users' behavior data from other scenarios, and (iii) only optimizes its own objective but neglects the overall objective of a session. Therefore, in this paper, we study the recommendation problem with multiple (consecutive) scenarios, i.e., whole-chain recommendations. We propose a multi-agent reinforcement learning based approach (DeepChain), which can capture the sequential correlation among different scenarios and jointly optimize multiple recommendation strategies. To be specific, all recommender agents share the same memory of users' historical behaviors, and they work collaboratively to maximize the overall reward of a session. Note that optimizing multiple recommendation strategies jointly faces two challenges - (i) it requires huge amounts of user behavior data, and (ii) the distribution of reward (users' feedback) are extremely unbalanced. In this paper, we introduce model-based reinforcement learning techniques to reduce the training data requirement and execute more accurate strategy updates. The experimental results based on data from a real e-commerce platform demonstrate the effectiveness of the proposed framework. Further experiments have been conducted to validate the importance of each component of DeepChain.

BibTeX key: zhao2019ChainRecommendations
Entry type: article
Year: 2019
URL: http://arxiv.org/abs/1902.03987
Note: cite arxiv:1902.03987

Cite this publication

%0 Journal Article
%1 zhao2019ChainRecommendations
%A Zhao, Xiangyu
%A Xia, Long
%A Zhao, Yihong
%A Yin, Dawei
%A Tang, Jiliang
%D 2019
%K
%T Model-Based Reinforcement Learning for Whole-Chain Recommendations
%U http://arxiv.org/abs/1902.03987
%X With the recent prevalence of Reinforcement Learning (RL), there have been tremendous interests in developing RL-based recommender systems. In practical recommendation sessions, users will sequentially access multiple scenarios, such as the entrance pages and the item detail pages, and each scenario has its own recommendation strategy. However, the majority of existing RL-based recommender systems focus on separately optimizing each strategy, which could lead to sub-optimal overall performance, because independently optimizing each scenario (i) overlooks the sequential correlation among scenarios, (ii) ignores users' behavior data from other scenarios, and (iii) only optimizes its own objective but neglects the overall objective of a session. Therefore, in this paper, we study the recommendation problem with multiple (consecutive) scenarios, i.e., whole-chain recommendations. We propose a multi-agent reinforcement learning based approach (DeepChain), which can capture the sequential correlation among different scenarios and jointly optimize multiple recommendation strategies. To be specific, all recommender agents share the same memory of users' historical behaviors, and they work collaboratively to maximize the overall reward of a session. Note that optimizing multiple recommendation strategies jointly faces two challenges - (i) it requires huge amounts of user behavior data, and (ii) the distribution of reward (users' feedback) are extremely unbalanced. In this paper, we introduce model-based reinforcement learning techniques to reduce the training data requirement and execute more accurate strategy updates. The experimental results based on data from a real e-commerce platform demonstrate the effectiveness of the proposed framework. Further experiments have been conducted to validate the importance of each component of DeepChain.

@article{zhao2019ChainRecommendations,
  abstract = {With the recent prevalence of Reinforcement Learning (RL), there have been tremendous interests in developing RL-based recommender systems. In practical recommendation sessions, users will sequentially access multiple scenarios, such as the entrance pages and the item detail pages, and each scenario has its own recommendation strategy. However, the majority of existing RL-based recommender systems focus on separately optimizing each strategy, which could lead to sub-optimal overall performance, because independently optimizing each scenario (i) overlooks the sequential correlation among scenarios, (ii) ignores users' behavior data from other scenarios, and (iii) only optimizes its own objective but neglects the overall objective of a session. Therefore, in this paper, we study the recommendation problem with multiple (consecutive) scenarios, i.e., whole-chain recommendations. We propose a multi-agent reinforcement learning based approach (DeepChain), which can capture the sequential correlation among different scenarios and jointly optimize multiple recommendation strategies. To be specific, all recommender agents share the same memory of users' historical behaviors, and they work collaboratively to maximize the overall reward of a session. Note that optimizing multiple recommendation strategies jointly faces two challenges - (i) it requires huge amounts of user behavior data, and (ii) the distribution of reward (users' feedback) are extremely unbalanced. In this paper, we introduce model-based reinforcement learning techniques to reduce the training data requirement and execute more accurate strategy updates. The experimental results based on data from a real e-commerce platform demonstrate the effectiveness of the proposed framework. Further experiments have been conducted to validate the importance of each component of DeepChain.},
  added-at = {2019-07-13T10:17:27.000+0200},
  author = {Zhao, Xiangyu and Xia, Long and Zhao, Yihong and Yin, Dawei and Tang, Jiliang},
  biburl = {https://www.bibsonomy.org/bibtex/2a8688f6b15dd7790db3c2e5cdd7969f3/lanteunis},
  interhash = {e88b3029f1f174e033f57634081513ec},
  intrahash = {a8688f6b15dd7790db3c2e5cdd7969f3},
  keywords = {},
  note = {cite arxiv:1902.03987},
  timestamp = {2019-07-13T10:17:27.000+0200},
  title = {Model-Based Reinforcement Learning for Whole-Chain Recommendations},
  url = {http://arxiv.org/abs/1902.03987},
  year = 2019
}
