Generalisation of the limiting distribution of program
sizes in tree-based genetic programming and analysis of
its effects on bloat
S. Dignum, and R. Poli. GECCO '07: Proceedings of the 9th annual conference on
Genetic and evolutionary computation, 2, page 1588--1595. London, ACM Press, (7-11 July 2007)
Abstract
Recent research 1 has found that standard sub-tree
crossover with uniform selection of crossover points,
in the absence of fitness pressure, pushes a population
of GP trees towards a Lagrange distribution of tree
sizes. However, the result applied to the case of
single arity function plus leaf node combinations,
e.g., unary, binary, ternary, etc trees only. In this
paper we extend those findings and show that the same
distribution is also applicable to the more general
case where the function set includes functions of mixed
arities. We also provide empirical evidence that
strongly corroborates this generalisation. Both
predicted and observed results show a distinct bias
towards the sampling of shorter programs irrespective
of the mix of function arities used. Practical
applications and implications of this knowledge are
investigated with regard to search efficiency and
program bloat. Work is also presented regarding the
applicability of the theory to the traditional
0.90-function 0.10-terminal crossover node selection
policy.
GECCO '07: Proceedings of the 9th annual conference on
Genetic and evolutionary computation
year
2007
month
7-11 July
pages
1588--1595
publisher
ACM Press
volume
2
organisation
ACM SIGEVO (formerly ISGEC)
publisher_address
New York, NY, USA
isbn13
978-1-59593-697-4
notes
GECCO-2007 A joint meeting of the sixteenth
international conference on genetic algorithms
(ICGA-2007) and the twelfth annual genetic programming
conference (GP-2007).
ACM Order Number 910071
%0 Conference Paper
%1 1277277
%A Dignum, Stephen
%A Poli, Riccardo
%B GECCO '07: Proceedings of the 9th annual conference on
Genetic and evolutionary computation
%C London
%D 2007
%E Thierens, Dirk
%E Beyer, Hans-Georg
%E Bongard, Josh
%E Branke, Jurgen
%E Clark, John Andrew
%E Cliff, Dave
%E Congdon, Clare Bates
%E Deb, Kalyanmoy
%E Doerr, Benjamin
%E Kovacs, Tim
%E Kumar, Sanjeev
%E Miller, Julian F.
%E Moore, Jason
%E Neumann, Frank
%E Pelikan, Martin
%E Poli, Riccardo
%E Sastry, Kumara
%E Stanley, Kenneth Owen
%E Stutzle, Thomas
%E Watson, Richard A
%E Wegener, Ingo
%I ACM Press
%K Bias, Size algorithms, bloat, crossover distribution genetic initialisation, program programming,
%P 1588--1595
%T Generalisation of the limiting distribution of program
sizes in tree-based genetic programming and analysis of
its effects on bloat
%U http://doi.acm.org/10.1145/1276958.1277277
%V 2
%X Recent research 1 has found that standard sub-tree
crossover with uniform selection of crossover points,
in the absence of fitness pressure, pushes a population
of GP trees towards a Lagrange distribution of tree
sizes. However, the result applied to the case of
single arity function plus leaf node combinations,
e.g., unary, binary, ternary, etc trees only. In this
paper we extend those findings and show that the same
distribution is also applicable to the more general
case where the function set includes functions of mixed
arities. We also provide empirical evidence that
strongly corroborates this generalisation. Both
predicted and observed results show a distinct bias
towards the sampling of shorter programs irrespective
of the mix of function arities used. Practical
applications and implications of this knowledge are
investigated with regard to search efficiency and
program bloat. Work is also presented regarding the
applicability of the theory to the traditional
0.90-function 0.10-terminal crossover node selection
policy.
@inproceedings{1277277,
abstract = {Recent research [1] has found that standard sub-tree
crossover with uniform selection of crossover points,
in the absence of fitness pressure, pushes a population
of GP trees towards a Lagrange distribution of tree
sizes. However, the result applied to the case of
single arity function plus leaf node combinations,
e.g., unary, binary, ternary, etc trees only. In this
paper we extend those findings and show that the same
distribution is also applicable to the more general
case where the function set includes functions of mixed
arities. We also provide empirical evidence that
strongly corroborates this generalisation. Both
predicted and observed results show a distinct bias
towards the sampling of shorter programs irrespective
of the mix of function arities used. Practical
applications and implications of this knowledge are
investigated with regard to search efficiency and
program bloat. Work is also presented regarding the
applicability of the theory to the traditional
0.90-function 0.10-terminal crossover node selection
policy.},
added-at = {2008-06-19T17:35:00.000+0200},
address = {London},
author = {Dignum, Stephen and Poli, Riccardo},
biburl = {https://www.bibsonomy.org/bibtex/267483eb4869e4595c113b786b6c07950/brazovayeye},
booktitle = {GECCO '07: Proceedings of the 9th annual conference on
Genetic and evolutionary computation},
editor = {Thierens, Dirk and Beyer, Hans-Georg and Bongard, Josh and Branke, Jurgen and Clark, John Andrew and Cliff, Dave and Congdon, Clare Bates and Deb, Kalyanmoy and Doerr, Benjamin and Kovacs, Tim and Kumar, Sanjeev and Miller, Julian F. and Moore, Jason and Neumann, Frank and Pelikan, Martin and Poli, Riccardo and Sastry, Kumara and Stanley, Kenneth Owen and Stutzle, Thomas and Watson, Richard A and Wegener, Ingo},
interhash = {ec3c2ffd4b42a78702ebcc90fb6ce9dd},
intrahash = {67483eb4869e4595c113b786b6c07950},
isbn13 = {978-1-59593-697-4},
keywords = {Bias, Size algorithms, bloat, crossover distribution genetic initialisation, program programming,},
month = {7-11 July},
notes = {GECCO-2007 A joint meeting of the sixteenth
international conference on genetic algorithms
(ICGA-2007) and the twelfth annual genetic programming
conference (GP-2007).
ACM Order Number 910071},
organisation = {ACM SIGEVO (formerly ISGEC)},
pages = {1588--1595},
publisher = {ACM Press},
publisher_address = {New York, NY, USA},
timestamp = {2008-06-19T17:38:42.000+0200},
title = {Generalisation of the limiting distribution of program
sizes in tree-based genetic programming and analysis of
its effects on bloat},
url = {http://doi.acm.org/10.1145/1276958.1277277},
volume = 2,
year = 2007
}