Incollection,

Classification of Gene Expression Data with Genetic Programming

, , and .
Genetic Programming Theory and Practice, chapter 3, Kluwer, (2003)

Abstract

This paper summarises the use of a genetic programming (GP) system to develop classification rules for gene expression data that hold promise for the development of new molecular diagnostics. This work focuses on discovering simple, accurate rules that diagnose diseases based on changes of gene expression profiles within a diseased cell. GP is shown to be a useful technique for discovering classification rules in a supervised learning mode where the biological genotype is paired with a biological phenotype such as a disease state. In the process of developing these rules, it is necessary to develop new techniques for establishing fitness and interpreting the results of evolutionary runs because of the large number of independent variables and the comparatively small number of samples. These techniques are described, and two issues are discussed: overfitting caused by small sample sizes, and the behaviour of the GP system when variables are missing from the samples.
