Abstract
This paper presents a way to estimate the difficulty
and discriminating power of any task instance. We focus
on a very general setting for tasks: interactive
(possibly multi-agent) environments where an agent acts
upon observations and rewards. Instead of analysing the
complexity of the environment, the state space or the
actions that are performed by the agent, we analyse the
performance of a population of agent policies against
the task. The resulting performance distribution is
then sliced by the algorithmic complexity of the policy and
analysed through several diagrams and indicators. We
also introduce the notion of an environment response
curve, obtained by inverting the performance results into
an ability scale. We apply all these concepts, diagrams
and indicators to two illustrative problems: a class of
agent-populated elementary cellular automata, showing
how the difficulty and discriminating power may vary
for several environments, and a multi-agent system,
where agents can become predators or prey, and may
need to coordinate. Finally, we discuss how these tools
can be applied to characterise (interactive) tasks and
(multi-agent) environments. These characterisations can
then be used to gain more insight into agent
performance and to facilitate the development of
adaptive tests for the evaluation of agent abilities.