
Hierarchical Learning with Procedural Abstraction Mechanisms

Department of Computer Science, The College of Arts and Sciences, University of Rochester, Rochester, NY 14627, USA (February 1997)

Abstract

The emerging field of evolutionary computation (EC) consists of the design and analysis of probabilistic algorithms inspired by the principles of natural selection and variation. Genetic Programming (GP) is one subfield of EC that emphasizes (1) the capability to discover and exploit intrinsic characteristics of the problem domain and (2) the flexibility to adapt the shape and complexity of the manipulated structures to the specifics of the application. GP appears to be a robust stochastic search technique that can be applied to complex design, control, and knowledge discovery applications. However, the computational load of applying it to non-trivial problems is considerable, and the problem of understanding and controlling the GP technique is notoriously difficult. This dissertation describes theoretical and experimental work aimed at (1) understanding the characteristics and limitations of GP search and (2) extending the capabilities of GP in order to cope with problems of increasing complexity.

For the first challenge, we focus on the properties and role of the problem representation and analyze the implicit biases that arise when manipulating variable-length structures represented as trees. We analyze uniform random sampling of the space of tree structures to offer insights into the role of procedural representations. We describe the dynamics of a useful structural property of representations called rooted tree schemata, which demonstrates the role played by the adaptive complexity of evolved structures. We also capture the dynamics of behaviors using the concept of population entropy.

The solution to the second challenge relies on the observation that, on non-trivial problems, GP essentially assembles and adapts the bits and pieces of a huge monolithic model. We propose, instead, that the learning system provide abstraction mechanisms for adaptively creating and exploiting modularity and problem decomposition. We evaluate previous steps in this direction by examining GP search biases and complexity measures of solutions, such as expanded and descriptional complexity, and we characterize the types of modular structures that would result in a minimum description length of solutions. We then describe two GP extensions for creating and exploiting procedural abstractions in the form of subroutines in order to facilitate scale-up. The first extension, called Adaptive Representation (AR), is a heuristic modular learning approach based on the discovery of hierarchies of subroutines. The second extension, called Evolutionary Divide-and-Conquer (EDC), views the population as a pool of candidates from which a team that solves the problem is selected. Both techniques extract simple or "natural" relationships and build modular representations to explain the data, and they are brought to life in several increasingly complex algorithms. The effects of embedding procedural and data abstraction mechanisms in the learning process are evaluated on several challenging problems from several perspectives, such as reuse of code or structure, automatic problem decomposition, generalization, and automatic discovery of features. AR was successfully applied to the intelligent control of an agent in a dynamic and non-deterministic environment, and the ideas are further extended to the design of graphical models. EDC was applied to regression problems characterized by complex input spaces.
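As a rough illustration of the population entropy idea mentioned above, the sketch below computes the Shannon entropy of a population partitioned into behavior classes. This is a minimal sketch under the assumption that individuals are grouped by a hashable behavior label (for example, a fitness value); the function name and the example values are illustrative and are not taken from the dissertation.

```python
import math
from collections import Counter

def population_entropy(behaviors):
    """Shannon entropy of a population partitioned into behavior classes.

    `behaviors` holds one hashable label per individual (e.g. a fitness
    value or phenotype signature). Higher entropy means the population
    is spread across more distinct behaviors.
    """
    counts = Counter(behaviors)          # size of each behavior class
    n = len(behaviors)                   # total population size
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Example: 8 individuals falling into 3 fitness classes.
print(population_entropy([0.5, 0.5, 0.5, 0.75, 0.75, 1.0, 1.0, 1.0]))
```

An entropy near zero indicates a population converged onto a single behavior, while larger values indicate greater behavioral diversity; tracking this quantity over generations is one way to observe the population dynamics the abstract refers to.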
