ECCB 2002 Poster

Title: Applying Data Mining Methods to SELDI-TOF Analysed Renal Cell Carcinoma Samples to Identify Relevant Tumor Markers
P182
Woetzel, Dirk; Driesch, Dominik; Pfaff, Michael; von Eggeling, Ferdinand; Junker, Kerstin; Guthke, Reinhard

BioControl@t-online.de
BioControl Jena GmbH, Wildenbruchstr. 15, D-07745 Jena, Germany

Surface Enhanced Laser Desorption/Ionisation Time-of-Flight (SELDI-TOF) analysis is increasingly used to analyse clinical samples from various medical backgrounds. At present, the main aim of this approach is to identify novel markers for various diseases. Because the data obtained by SELDI-TOF analysis are rather complex, adequate data mining methods are required to identify markers that can distinguish between samples from diseased and healthy tissue.

A rule-based data mining method is described here that includes a special supervised fuzzy clustering algorithm adapted to one-dimensional two-cluster problems. This method was applied to data obtained by SELDI-TOF analysis of 23 histopathologically characterised tissue samples from renal cell carcinoma (RCC) patients. Tissue samples were taken from eight patients at three different locations: i) central tumor, ii) peripheral tumor, and iii) healthy renal tissue. For one patient, no peripheral sample was available, giving 8 × 3 − 1 = 23 samples in total.

In SELDI-TOF analysis, the lysed sample material is bound to a special chromatographic chip surface, and molecules that do not bind are subsequently washed off. The retained proteins are then ionised by a laser and accelerated within an electric field. The masses and intensities that form the protein spectra are determined by a time-of-flight detector. The intensity at each peak of the spectrum can be considered to represent the concentration of a certain protein in the sample. Peaks that are typical for a set of samples are detected and numbered. Depending on the chemical properties of the chip surface, different substance classes can be detected. In this study, a strong anion exchange surface chip (Ciphergen) was used. The SELDI-TOF analysis of renal cell carcinoma samples carried out here yielded 44 typical peaks ranging from 2,947 to 16,450 Da.
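
How a flight time translates into a mass can be made explicit for an idealised linear time-of-flight analyser. This is a textbook sketch, not a description of the instrument used in this study; the symbols U (accelerating potential), L (field-free drift length), z (charge number), e (elementary charge), m (ion mass), v (ion velocity), and t (flight time) are assumptions introduced here for illustration.

```latex
z e U = \tfrac{1}{2} m v^{2}
\quad\Longrightarrow\quad
t = \frac{L}{v} = L \sqrt{\frac{m}{2 z e U}}
\quad\Longrightarrow\quad
\frac{m}{z} = \frac{2 e U \, t^{2}}{L^{2}}
```

Under these assumptions the flight time grows with the square root of m/z, so heavier ions of the same charge reach the detector later.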

For the data mining method applied here, central and peripheral tumor samples were both labelled as 'tumor' and samples from healthy renal tissue were labelled as 'normal'. For all 23 samples, the intensities at each of the 44 peaks were log-transformed and clustered using the above-mentioned clustering algorithm. Based on these clustering results, lists of rules were generated. These rules describe relations such as "IF intensity at peak 2,947 Da is in cluster 'high concentration' THEN tissue is 'normal'". The rules were rated and ranked using a statistics-based rating measure.
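
The steps described above (log-transform, one-dimensional two-cluster fuzzy clustering, rule generation) might be sketched for a single peak as follows. This is a minimal illustration under stated assumptions, not the authors' method: it uses plain unsupervised fuzzy c-means as a stand-in for the supervised variant, which is not specified here, and a simple count of correctly covered samples as a stand-in for the statistics-based rating measure. All function names and the intensity values are hypothetical.

```python
import math
from collections import Counter


def memberships(values, centers, m=2.0):
    """Return (u_low, u_high) fuzzy memberships for each value."""
    power = 2.0 / (m - 1.0)
    result = []
    for x in values:
        d_low, d_high = abs(x - centers[0]), abs(x - centers[1])
        if d_low == 0.0:
            result.append((1.0, 0.0))
        elif d_high == 0.0:
            result.append((0.0, 1.0))
        else:
            u_low = 1.0 / (1.0 + (d_low / d_high) ** power)
            result.append((u_low, 1.0 - u_low))
    return result


def fuzzy_two_cluster_1d(values, m=2.0, max_iter=100, tol=1e-9):
    """Plain fuzzy c-means for 1-D data with two clusters (assumed stand-in)."""
    centers = [min(values), max(values)]  # start at the extremes
    if centers[0] == centers[1]:
        raise ValueError("need at least two distinct values")
    for _ in range(max_iter):
        u = memberships(values, centers, m)
        # Each new center is the membership-weighted mean of the data.
        new_centers = [
            sum(ui[k] ** m * x for ui, x in zip(u, values))
            / sum(ui[k] ** m for ui in u)
            for k in range(2)
        ]
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(values, centers, m)


def derive_rules(assignments, labels):
    """Map each cluster to its majority label: (label, correct, covered)."""
    rules = {}
    for cluster in ("low", "high"):
        covered = [lab for a, lab in zip(assignments, labels) if a == cluster]
        if covered:
            label, correct = Counter(covered).most_common(1)[0]
            rules[cluster] = (label, correct, len(covered))
    return rules


# Hypothetical intensities at one peak; not data from the study.
intensities = [120.0, 95.0, 110.0, 8.0, 12.0, 10.0, 9.0, 100.0]
labels = ["normal"] * 3 + ["tumor"] * 5

log_values = [math.log(v) for v in intensities]
centers, u = fuzzy_two_cluster_1d(log_values)
assignments = ["high" if u_high >= 0.5 else "low" for _, u_high in u]
rules = derive_rules(assignments, labels)

for cluster, (label, correct, covered) in rules.items():
    print(f"IF intensity is in cluster '{cluster}' THEN tissue is "
          f"'{label}' ({correct}/{covered} correct)")
```

On this toy data, the 'high' cluster maps to 'normal' with three of four samples correct, and the 'low' cluster maps to 'tumor' with four of four correct. Repeating this for every peak and ranking the resulting rules would give rule lists of the kind described above.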

Applying this data mining method, three rules were extracted from the rule lists that describe the data set with an accuracy of 95.65 % (1 error in 23 samples). The probability (alpha) that the three extracted rules are actually not relevant for distinguishing between the carcinoma and normal tissue samples was less than 0.135. Given the limited number of samples analysed here, these results indicate that SELDI-TOF analysis in combination with advanced data mining methods can be used to identify novel markers that distinguish between renal cell carcinoma tissue and healthy renal tissue. Larger data sets will have to be analysed in future studies to further validate these findings.
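
The reported accuracy follows directly from the sample counts given above. A short check, with variable names chosen here for illustration only:

```python
# Eight patients, three locations each, one peripheral sample missing.
n_samples = 8 * 3 - 1
n_errors = 1  # one misclassified sample, as reported

accuracy = (n_samples - n_errors) / n_samples
print(f"{n_samples} samples, accuracy = {accuracy:.2%}")
```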