Abstract

We present new numerical algorithms and bounds for the infinite horizon, discrete stage, finite state and action Markov decision process with imprecise transition probabilities. We assume that the transition probability mass vector for each state and action is described by a finite number of linear inequalities. This model of imprecision appears to be well suited for describing statistically determined confidence limits and/or natural language statements of likelihood. The numerical procedures for calculating an optimal max-mm strategy are based on successive approximations, reward revision, and modified policy iteration. The bounds that are determined are at least as tight as currently available bounds for the case where the transition probabilities are precise.

Description

diverse cognitive systems bib

Links and resources

Tags

community