A robust hybrid between genetic algorithm and support vector machine for extracting an optimal feature gene subset

Genomics, 85 (1): 16-23 (January 2005)
DOI: 10.1016/j.ygeno.2004.09.007

Abstract

Development of a robust and efficient approach for extracting useful information from microarray data continues to be a significant and challenging task. Microarray data are characterised by a high dimension, high signal-to-noise ratio, and high correlations between genes, but with a relatively small sample size. Current methods for dimensional reduction can further be improved for the scenario of the presence of a single (or a few) highly influential gene(s) in which its effect in the feature subset would prohibit inclusion of other important genes. We have formalised a robust gene selection approach based on a hybrid between genetic algorithm and support vector machine. The major goal of this hybridisation was to exploit fully their respective merits (e.g., robustness to the size of solution space and capability of handling a very large dimension of feature genes) for identification of key feature genes (or molecular signatures) for a complex biological phenotype. We have applied the approach to the microarray data of diffuse large B cell lymphoma to demonstrate its behaviours and properties for mining the high-dimension data of genome-wide gene expression profiles. The resulting classifier(s) (the optimal gene subset(s)) has achieved the highest accuracy (99%) for prediction of independent microarray samples in comparisons with marginal filters and a hybrid between genetic algorithm and K nearest neighbours.
