Temporally Extended Actions for Reinforcement Learning Based Schedulers
P. Ojha, S. Thota, V. M, and M. Tahilianni. International Journal on Soft Computing, Artificial Intelligence and Applications (IJSCAI), 4(3/4):1-18 (November 2015)
Abstract
Temporally extended actions have been shown to enhance the performance of reinforcement learning
agents. The broader framework of ‘Options’ provides a flexible way of representing such extended courses of
action in Markov decision processes. In this work we adapt the options framework to model an operating
system scheduler, which is expected not to let the processor stay idle while any process is ready or waiting
for execution. A process is allowed to utilize CPU resources for a fixed quantum of time (timeslice), and
each subsequent context switch incurs considerable overhead. We utilize the historical performance of the
scheduler to reduce the number of redundant context switches. We propose a machine-learning module, based
on a temporally extended reinforcement-learning agent, to predict a better-performing timeslice. We measure
the importance of states in the options framework by evaluating the impact of their absence, and propose an
algorithm to identify such checkpoint states. We present an empirical evaluation of our approach in a
maze-world navigation task, and its implications for the "adaptive timeslice parameter" show efficient
throughput time.
@article{ojhatemporally,
abstract = {Temporally extended actions have been shown to enhance the performance of reinforcement learning
agents. The broader framework of ‘Options’ provides a flexible way of representing such extended courses of
action in Markov decision processes. In this work we adapt the options framework to model an operating
system scheduler, which is expected not to let the processor stay idle while any process is ready or waiting
for execution. A process is allowed to utilize CPU resources for a fixed quantum of time (timeslice), and
each subsequent context switch incurs considerable overhead. We utilize the historical performance of the
scheduler to reduce the number of redundant context switches. We propose a machine-learning module, based
on a temporally extended reinforcement-learning agent, to predict a better-performing timeslice. We measure
the importance of states in the options framework by evaluating the impact of their absence, and propose an
algorithm to identify such checkpoint states. We present an empirical evaluation of our approach in a
maze-world navigation task, and its implications for the "adaptive timeslice parameter" show efficient
throughput time.},
author = {Ojha, Prakhar and Thota, Siddhartha R and M, Vani and Tahilianni, Mohit P},
biburl = {https://www.bibsonomy.org/bibtex/2dc4b51b7df83dfc98b70f2be02c9d84d/leninsha},
journal = {International Journal on Soft Computing, Artificial Intelligence and Applications (IJSCAI)},
keywords = {Machine Learning, Online, Operating System Scheduler, Options, Preemption, Reinforcement Learning, Temporal Extension of Actions},
month = {November},
number = {3/4},
pages = {1--18},
title = {Temporally Extended Actions for Reinforcement Learning Based Schedulers},
url = {https://aircconline.com/ijscai/V4N4/4415ijscai01.pdf},
volume = 4,
year = 2015
}