Abstract
We have developed a statistical regression modeling approach to discover
genes that are differentially expressed between two predefined sample
groups in DNA microarray experiments. Our model is based on well-defined
assumptions, uses rigorous and well-characterized statistical measures,
and accounts for the heterogeneity and genomic complexity of the
data. In contrast to cluster analysis, which attempts to define groups
of genes and/or samples that share common overall expression profiles,
our modeling approach uses known sample group membership to focus
on expression profiles of individual genes in a sensitive and robust
manner. Further, this approach can be used to test statistical hypotheses
about gene expression. To demonstrate this methodology, we compared
the expression profiles of 11 acute myeloid leukemia (AML) and 27
acute lymphoblastic leukemia (ALL) samples from a previous study
(Golub et al. 1999) and found 141 genes differentially expressed
between AML and ALL at a genomic-level significance of 1%.
Using this modeling approach to compare different sample groups within
the AML samples, we identified a group of genes whose expression
profiles correlated with that of thrombopoietin and found that genes
whose expression was associated with AML treatment outcome lie in recurrent
chromosomal locations. Our results are compared with those obtained
using t-tests or Wilcoxon rank sum statistics.
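
As a rough illustration of the per-gene, two-group comparison described above, the following minimal sketch regresses one gene's expression on a 0/1 group indicator. It is not the model from this paper: the function names, the normal approximation for the p-value, and the Bonferroni-style per-gene threshold are all assumptions made for illustration, since the abstract does not specify how significance at the genomic level is computed. For two groups, the resulting statistic coincides with the pooled two-sample t-statistic, one of the baselines the abstract mentions.

```python
import math
from statistics import mean


def regression_t(expr, group):
    """Ordinary least squares of expression on a 0/1 group indicator.

    Returns (slope, t_statistic). The slope equals the difference in
    group means, and the t-statistic equals the pooled two-sample t.
    """
    n = len(expr)
    x_bar, y_bar = mean(group), mean(expr)
    sxx = sum((x - x_bar) ** 2 for x in group)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(group, expr))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(group, expr))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return slope, slope / std_err


def two_sided_p_normal(z):
    """Two-sided p-value under a standard normal approximation (assumption)."""
    return math.erfc(abs(z) / math.sqrt(2))


def per_gene_alpha(genome_alpha, n_genes):
    """Bonferroni-style per-gene threshold (assumption, not from the paper)."""
    return genome_alpha / n_genes


# Toy data for a single hypothetical gene: three samples per group.
group = [0, 0, 0, 1, 1, 1]
expr = [1.0, 2.0, 3.0, 5.0, 6.0, 7.0]

slope, t_stat = regression_t(expr, group)
p_value = two_sided_p_normal(t_stat)  # crude for so few samples
```

In a real analysis the same calculation would be repeated for every gene, and each p-value compared against `per_gene_alpha(0.01, n_genes)`. With only a handful of samples the normal approximation is poor, so a t distribution or a permutation test would be the more defensible choice.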