@brazovayeye

Generalisation of the limiting distribution of program sizes in tree-based genetic programming and analysis of its effects on bloat

, and . GECCO '07: Proceedings of the 9th annual conference on Genetic and evolutionary computation, 2, page 1588--1595. London, ACM Press, (7-11 July 2007)

Abstract

Recent research 1 has found that standard sub-tree crossover with uniform selection of crossover points, in the absence of fitness pressure, pushes a population of GP trees towards a Lagrange distribution of tree sizes. However, the result applied to the case of single arity function plus leaf node combinations, e.g., unary, binary, ternary, etc trees only. In this paper we extend those findings and show that the same distribution is also applicable to the more general case where the function set includes functions of mixed arities. We also provide empirical evidence that strongly corroborates this generalisation. Both predicted and observed results show a distinct bias towards the sampling of shorter programs irrespective of the mix of function arities used. Practical applications and implications of this knowledge are investigated with regard to search efficiency and program bloat. Work is also presented regarding the applicability of the theory to the traditional 0.90-function 0.10-terminal crossover node selection policy.

Links and resources

Tags

community