ECCB 2002 Poster

Title: Comparison of Various Normalisation Strategies for Microarray Analysis
P158
Steinhoff, C.; Nuber, U.A.; Vingron, Martin

steinhof@molgen.mpg.de
Max Planck Institut für Molekulare Genetik, Berlin, Germany

In microarray experiments the expression levels of thousands of genes are measured simultaneously. Because of the number of variable experimental steps, such as probe acquisition, preparation, labelling, hybridisation and scanning, the resulting data are highly variable, very noisy and have no fixed scale. If these systematic effects are not detected and analysed, they will affect further analysis and interpretation of the data. Thus, to make microarray experiments interpretable and comparable, these effects should be detected and removed.
In the literature there have been a number of attempts to address these problems. Still, there is so far neither a consensus on the use of the existing normalisation methods nor an overall comparison of the different types of analysis.
Furthermore, it is unclear to what extent different normalisation strategies influence different types of analysis, such as the detection of differential genes or classification, as well as further biological interpretation.
We examined a number of normalisation strategies and applied them to a series of repeated dye swap experiments. We first applied the various normalisation methods and then studied how the choice of strategy affects the detection of differential genes. For that purpose we performed northern blots on a subset of the genes that were detected as differential by the different methods.
Thus we want to contrast the outliers detected after different preceding normalisation methods with the outliers detected by northern blotting.

Overall, normalisation methods can be divided into two groups: those that use subsets of spotted sequences (for example housekeeping genes or spiked controls) to normalise the whole dataset, and those that use the whole chip for normalisation. The latter group assumes that most genes remain unchanged in a sample-control setting. We focused on these methods.
These can be divided into (a) scaling methods, (b) methods that detect and remove non-linearities, (c) methods that analyse various influencing factors, and (d) methods that analyse and stabilise variance that varies across the range of intensities.

(a) Scaling methods only adjust for overall, linear effects across the slides. Dividing the raw intensities by the overall mean, median or shorth, or applying linear regression, has been used in different studies.
(b) In order to correct for the curved appearance of log data, Dudoit et al. [1] estimated a smooth normalising curve by local regression.
(c) Kerr et al. [2] proposed an ANOVA approach to model log intensities. They analysed specific effects due to probe, dye, array and gene, as well as probe-gene and array-gene interactions. The parameters are estimated by maximum likelihood.
(d) The variance of log intensities in microarray experiments tends to increase as the mean log intensity decreases, so the variance depends strongly on the mean. To address this problem, it has been proposed [3] to estimate the normalised expression levels and the expression-level-dependent error variance by local regression. Huber et al. [4] proposed a variance stabilising transformation to obtain log transformed data with approximately constant variance across the dataset.
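
As a rough illustration of groups (a) and (d), the following Python sketch centres the log ratios of one slide on their median and applies an arsinh-type transformation to raw intensities. It is a minimal sketch under stated assumptions, not the procedure used for this poster: the function names, the choice of the median as scaling statistic, and the parameters a and b are hypothetical, and in practice the transformation parameters would be estimated from the data rather than fixed by hand.

```python
import math
from statistics import median


def median_scale_log_ratios(red, green):
    """Scaling sketch for group (a): centre log2(red/green) on its median.

    Assumes strictly positive intensities and that most genes are unchanged,
    so the median log ratio of the slide should be zero after scaling.
    """
    log_ratios = [math.log2(r / g) for r, g in zip(red, green)]
    shift = median(log_ratios)
    return [m - shift for m in log_ratios]


def arsinh_transform(intensities, a=0.0, b=1.0):
    """Variance-stabilising sketch for group (d): arsinh(a + b * y).

    a (offset) and b (scale) are hypothetical fixed values here.
    For large intensities arsinh(y) behaves like log(2 * y), while it stays
    finite and roughly linear close to zero.
    """
    return [math.asinh(a + b * y) for y in intensities]
```

For example, `median_scale_log_ratios([2, 4, 8], [1, 1, 1])` shifts the log ratios 1, 2 and 3 so that the middle value becomes zero. The arsinh form is chosen here because it agrees with the logarithm at high intensities, where log ratios are well behaved, and avoids the inflated variance of log values near the background level.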

We used three repetitions of dye swap experiments and applied the normalisation strategies mentioned above. A list of 46 of those genes which were classified as differential according to the different normalisations was subjected to northern blot analysis. Results from northern blotting were correlated with the preceding outlier detection by the different methods. The different lists of outliers resulting from the different normalisations were compared by defining distance measures on rank-ordered differential genes. Thus methods can be grouped according to similar outlier detection. Using northern blotting as an independent method, we can single out the normalisation methods which tend to describe the biological background best.
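
One possible distance measure on rank-ordered gene lists can be sketched as follows. The abstract does not state which measures were defined, so the normalised Spearman footrule used here is an assumption chosen for illustration, and the gene identifiers in the usage note are hypothetical.

```python
def footrule_distance(ranking_a, ranking_b):
    """Normalised Spearman footrule between two rankings of the same genes.

    Returns 0.0 for identical orderings and 1.0 for maximal disagreement.
    Both rankings must contain exactly the same genes, each listed once.
    """
    if len(set(ranking_a)) != len(ranking_a) or set(ranking_a) != set(ranking_b):
        raise ValueError("rankings must list the same genes, each exactly once")
    if len(ranking_a) != len(ranking_b):
        raise ValueError("rankings must have the same length")

    position_b = {gene: rank for rank, gene in enumerate(ranking_b)}
    n = len(ranking_a)
    # Sum of absolute rank displacements between the two orderings.
    total = sum(abs(rank - position_b[gene]) for rank, gene in enumerate(ranking_a))
    # The largest possible footrule value for n items is floor(n^2 / 2).
    max_total = (n * n) // 2
    return total / max_total if max_total else 0.0
```

With three hypothetical genes, `footrule_distance(["g1", "g2", "g3"], ["g3", "g2", "g1"])` gives the maximal value, while swapping only the first two genes gives half of it. Computing this distance for every pair of normalisation methods yields a distance matrix that could then be used to group methods with similar outlier lists.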
[1] Dudoit S, Yang YH, Speed TP, Callow MJ; Statistica Sinica, 12:111-139, 2002.
[2] Kerr MK, Martin M, Churchill GA; Journal of Computational Biology, 7:819-837, 2000.
[3] Kepler TB, Cosby L, Morgan KT; GenomeBiology, 3(7): research0037.1-0037.12, 2002.
[4] Huber W, v Heydebreck A, Sültmann H, Poustka A, Vingron M; Bioinformatics, ISMB, 2002.