ECCB 2002 Poster

Title: rChip: Documenting microarray data analysis
P185
Wrobel, Gunnar; Radlwimmer, Bernhard; Lichter, Peter

g.wrobel@dkfz.de
Deutsches Krebsforschungszentrum

The development of genomic and expression profiling techniques has led to a massive increase in the amount of data generated in molecular biology. To analyze and interpret these data quickly, complex analysis algorithms, including a variety of filtering, normalization and clustering methods, have been developed. The very complexity of these algorithms, however, makes it difficult to convey clearly both the study conclusions and the analysis steps performed. Consequently, methods and important parameter settings are often poorly explained in microarray publications, making verification of the analysis process or of the conclusions drawn difficult, if not impossible.

The software package R [1] has become increasingly popular for microarray data analysis, leading to the creation of Bioconductor [2], an R-based open source analysis suite for microarray data. The latest version of R includes the tool Sweave [3], which combines data, R code and results with documentation.

We use tools from Bioconductor together with Sweave and our own code to create a software package for the analysis of microarray data on the basis of Sweave documents. This tool, called rChip, provides a variety of filtering procedures and two methods of normalization: a standard log-ratio normalization to the general median of the chip and a variance stabilization [4] method. Data can be presented in a variety of formats, such as standard intensity scatter plots and log-ratio to intensity plots. Whole-chip visualizations of gradients and local hybridization problems are provided for quality control purposes. We are currently working on including clustering algorithms in the tool.
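
To make the log-ratio normalization and the log-ratio to intensity plot concrete, the following is a minimal Python sketch, not rChip's own R code. It assumes two-channel, background-corrected, strictly positive intensities; the function names and the five example spots are hypothetical. The log ratio of each spot is centered on the chip-wide median, and the mean log intensity supplies the x-axis of a log-ratio to intensity plot.

```python
import math
from statistics import median


def log_ratios(red, green):
    """Per-spot log2(red / green); intensities must be strictly positive."""
    return [math.log2(r / g) for r, g in zip(red, green)]


def median_normalize(m_values):
    """Subtract the chip-wide median so normalized log ratios center on 0."""
    center = median(m_values)
    return [m - center for m in m_values]


def mean_log_intensity(red, green):
    """Per-spot 0.5 * log2(red * green), the x-axis of a log-ratio to intensity plot."""
    return [0.5 * math.log2(r * g) for r, g in zip(red, green)]


# Hypothetical intensities for five spots (illustration only, not real data).
red = [200.0, 400.0, 800.0, 1600.0, 3200.0]
green = [100.0, 100.0, 200.0, 400.0, 3200.0]

m_raw = log_ratios(red, green)            # uncentered log ratios
m_norm = median_normalize(m_raw)          # centered on the chip median
a_vals = mean_log_intensity(red, green)   # mean log intensities

for a, m in zip(a_vals, m_norm):
    print(f"A = {a:6.2f}   M = {m:5.2f}")
```

Plotting `m_norm` against `a_vals` would give the log-ratio to intensity view; the variance stabilization method [4] replaces the logarithm with a different transformation and is not shown here.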

rChip is controlled by external parameter files, and user interaction with R is kept to a minimum, allowing inexperienced users to work with the program. The program creates a Sweave file holding a complete protocol of all analysis steps performed on the data. From this file, an easy-to-read PDF document is created that presents all aspects of the analysis and a concise overview of the results. Additionally, the R source code can be extracted from the original Sweave document if desired. Feeding this code into R together with the appropriate dataset then allows effortless repetition and modification of the analysis. Currently, rChip is being used successfully by physicians, biologists and bioinformaticians in our laboratory.
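
The workflow described above (parameter file in, Sweave protocol out, R code extracted on demand) can be illustrated with a small Python sketch. This is not rChip's implementation: the parameter names, the file name `chip01.txt`, the `intensity` column and the embedded R statements are all hypothetical. Only the chunk syntax follows Sweave's noweb convention, in which a code chunk opens with a `<<name>>=` line and closes with a line starting with `@`.

```python
import configparser

# Hypothetical parameter file; rChip's real parameter format is not shown in the source.
PARAMS = """
[analysis]
input_file = chip01.txt
normalization = median
min_intensity = 100
"""


def build_sweave(params):
    """Combine the settings and the R code that uses them into one Sweave-style document."""
    lines = [
        r"\documentclass{article}",
        r"\begin{document}",
        r"\section*{Analysis protocol}",
        f"Normalization method: {params['normalization']}.",
        "<<load>>=",
        f'raw <- read.delim("{params["input_file"]}")',
        "@",
        f"Spots with intensity below {params['min_intensity']} are removed.",
        "<<filter>>=",
        f"kept <- raw[raw$intensity >= {params['min_intensity']}, ]",
        "@",
        r"\end{document}",
    ]
    return "\n".join(lines)


def tangle(sweave_text):
    """Return only the code chunks, i.e. the lines between '<<...>>=' and '@'."""
    code, in_chunk = [], False
    for line in sweave_text.splitlines():
        if line.startswith("<<") and line.rstrip().endswith(">>="):
            in_chunk = True       # chunk header: start collecting code
        elif in_chunk and line.startswith("@"):
            in_chunk = False      # '@' ends the code chunk
        elif in_chunk:
            code.append(line)
    return "\n".join(code)


config = configparser.ConfigParser()
config.read_string(PARAMS)
document = build_sweave(config["analysis"])
r_code = tangle(document)
print(r_code)
```

In R itself, the corresponding steps are handled by the `Sweave()` and `Stangle()` functions, so a sketch like this serves only to show why a single document is enough to reproduce both the report and the analysis code.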

We believe rChip provides a helpful interface between scientists focused on data analysis and those focused on biology. Furthermore, it makes independent analyses of microarray experiments by different research groups possible, as long as Sweave files are made available concurrently with data publication. This will facilitate the improvement and verification of the quality of microarray analyses within and between research groups.

[1] R. Ihaka and R. Gentleman. R: A language for data analysis and graphics. Journal of Computational and Graphical Statistics, 5(3): 299-314, 1996.
[2] R. Gentleman and A.J. Rossini. http://www.bioconductor.org.
[3] F. Leisch. Sweave: Dynamic generation of statistical reports using literate data analysis. In W. Härdle and B. Rönz, editors, Compstat 2002 Proceedings in Computational Statistics, pages 575-580. Physica Verlag, Heidelberg, Germany, 2002.
[4] W. Huber et al. Variance stabilization applied to microarray data calibration and to the quantification of differential expression. To appear in Bioinformatics, 2002.