The analysis of gene expression data is an important tool for
understanding mechanisms of living systems. Microarray experiments
provide large amounts of data, but it is difficult to understand
biological processes or molecular functions from such data on
their own.
Clustering methods based on expression data have been used to deal
with this problem, but the biological relevance of the results is
limited. New methods have been proposed in which annotations from
the Gene Ontology (GO) database are integrated into the analysis in
order to gain biological understanding. Gene classes obtained from
GO are scored with respect to their significance in data sets from
microarray experiments.
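
As a rough illustration of what scoring a gene class can look like, the
sketch below uses a one-sided hypergeometric test, a common choice for
this kind of analysis; the text above does not name a specific
statistic, so this is an assumption. The function names, gene
identifiers, and class names are hypothetical. Each class is scored
independently of all others.

```python
from math import comb


def hypergeom_enrichment_pvalue(n_total, n_class, n_selected, n_overlap):
    """Return P(X >= n_overlap) for X ~ Hypergeometric.

    n_total:    number of genes on the array (the universe)
    n_class:    number of universe genes annotated to the class
    n_selected: number of genes flagged as interesting (e.g. differentially expressed)
    n_overlap:  number of flagged genes that belong to the class
    """
    upper = min(n_class, n_selected)
    denom = comb(n_total, n_selected)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(
        comb(n_class, k) * comb(n_total - n_class, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / denom


def score_classes(universe, selected, classes):
    """Score every gene class by its enrichment p-value (smaller = more significant)."""
    universe = set(universe)
    selected = set(selected) & universe
    scores = {}
    for name, members in classes.items():
        members = set(members) & universe  # ignore genes not on the array
        overlap = len(members & selected)
        scores[name] = hypergeom_enrichment_pvalue(
            len(universe), len(members), len(selected), overlap
        )
    return scores


# Toy data (hypothetical): 10 genes, 4 of them flagged, two gene classes.
universe = [f"g{i}" for i in range(1, 11)]
selected = ["g1", "g2", "g3", "g4"]
classes = {
    "class_A": ["g1", "g2", "g3"],   # all three members are flagged
    "class_B": ["g8", "g9", "g10"],  # no member is flagged
}
scores = score_classes(universe, selected, classes)
```

In this toy example class_A receives a small p-value and class_B a
p-value of 1, so class_A would be ranked as the more significant class.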
Current methods of this type do not incorporate the graph structure of the
GO database when computing statistical quantities. We will develop
statistical and graph-theoretic methods that make use of this
topology in order to improve the biological insight obtained from
gene expression data.