ECCB 2002 Poster

Title: Detection of degenerated repeated sequences by means of genomic signature in human chromosomes
P29
Dufraigne, Christine; Fertil, Bernard; Giron, Alain; Deschavanne, Patrick

dufraigne@imed.jussieu.fr
INSERM U494 bd de l'hôpital 75634 PARIS Cedex 13

All genomes contain repeated sequences. The role of these repeats is not always known and is sometimes controversial, but studying them sheds light on the organization and evolution of genomes [1, 2]. Many methods detect "perfect" repeats, but few can search for degenerate repeats [3]. We have developed a fast method that quantifies the similarity between sequences without any alignment step.
The method is based on the comparison of signatures, a signature being defined as the complete set of frequencies of the short oligonucleotides (words) that make up a DNA sequence. Signatures are computed very quickly by means of a graphical method, the Chaos Game Representation (CGR) [4, 5]. A window of fixed size is moved along the genome and the signature of each window is computed [6]. A distance matrix comparing the signatures of all windows (Euclidean distance) is then calculated. The resulting CGR-dot-plot can be displayed as an image in which the color of each pixel reflects the distance between two fragments: the lighter the pixel, the closer the signatures.
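The windowed signature comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: words are counted directly instead of through the CGR (a different way of obtaining word frequencies), and the word length `k`, the window size and the step are hypothetical parameters that the abstract does not specify.

```python
from itertools import product
from math import sqrt


def signature(seq, k=4):
    """Frequencies of all 4**k words of length k in seq (overlapping words).

    Direct counting stands in for the CGR here; k=4 is an assumed word length.
    """
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(words, 0)
    total = 0
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip words containing N or other symbols
            counts[word] += 1
            total += 1
    return [counts[w] / total if total else 0.0 for w in words]


def window_signatures(seq, window, step, k=4):
    """Signature of each window of length `window`, moved by `step` bases."""
    starts = range(0, len(seq) - window + 1, step)
    return [signature(seq[s:s + window], k) for s in starts]


def euclidean(a, b):
    """Euclidean distance between two signatures of equal length."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_matrix(sigs_a, sigs_b):
    """Distance between every window of A and every window of B.

    Passing the same list twice gives the intra-sequence dot-plot matrix.
    """
    return [[euclidean(a, b) for b in sigs_b] for a in sigs_a]
```

Small distances in the resulting matrix correspond to the light pixels of the dot-plot; rendering the matrix as an image is left out of the sketch.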
By simulation starting from a human sequence, we established a standard curve giving the distance between signatures as a function of the percentage of identity between two sequences (Figure 1). This curve makes it possible to set a cutoff for detecting repeats of a given degree of homology.
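One way such a standard curve could be simulated is sketched below. The abstract does not say how the degenerated copies were generated, so the mutation model (random substitutions only, no insertions or deletions), the word length `K` and the number of replicates are assumptions made for illustration.

```python
import random
from collections import Counter
from itertools import product
from math import sqrt

K = 4  # assumed word length
WORDS = ["".join(p) for p in product("ACGT", repeat=K)]


def signature(seq):
    """Frequencies of all words of length K in seq (A, C, G, T only)."""
    counts = Counter(seq[i:i + K] for i in range(len(seq) - K + 1))
    total = sum(counts[w] for w in WORDS)
    return [counts[w] / total for w in WORDS]


def mutate(seq, identity, rng):
    """Copy of seq in which round((1 - identity) * len(seq)) bases are substituted."""
    n_mut = round((1.0 - identity) * len(seq))
    out = list(seq)
    for pos in rng.sample(range(len(seq)), n_mut):
        out[pos] = rng.choice([b for b in "ACGT" if b != seq[pos]])
    return "".join(out)


def standard_curve(seq, identities, replicates=5, seed=0):
    """Mean signature distance between seq and mutated copies, per identity level."""
    rng = random.Random(seed)
    ref = signature(seq)
    curve = []
    for ident in identities:
        dists = []
        for _ in range(replicates):
            sig = signature(mutate(seq, ident, rng))
            dists.append(sqrt(sum((a - b) ** 2 for a, b in zip(ref, sig))))
        curve.append((ident, sum(dists) / replicates))
    return curve
```

Calling `standard_curve(fragment, [1.0, 0.95, 0.9, 0.8])` on a genomic fragment returns (identity, mean distance) pairs; the distance read at the chosen identity level would then serve as the cutoff.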
The method was first applied to the detection of intra- and inter-chromosomal duplications in human chromosomes 21 and 22. Using the standard curve, we selected from the CGR-dot-plot (Figure 2) the fragments with a homology greater than 95%. Figure 3 shows the repeats found within chromosome 22 and between chromosomes 21 and 22.
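The selection step can be pictured as a simple threshold on the distance matrix. The sketch below is illustrative only: `select_pairs`, the `symmetric` flag and `min_gap` are hypothetical names, and the cutoff is assumed to be the distance read from the standard curve at the chosen level (95% here).

```python
def select_pairs(dist, cutoff, symmetric=False, min_gap=1):
    """Return (i, j, distance) for window pairs whose distance is <= cutoff.

    dist      -- distance matrix as a list of rows (windows of A x windows of B)
    cutoff    -- distance threshold, assumed to come from the standard curve
    symmetric -- True for an intra-chromosomal matrix: only pairs with
                 j >= i + min_gap are reported, which skips the diagonal
                 and the mirrored lower triangle
    """
    hits = []
    for i, row in enumerate(dist):
        for j, d in enumerate(row):
            if symmetric and j < i + min_gap:
                continue
            if d <= cutoff:
                hits.append((i, j, d))
    return hits
```

For an inter-chromosomal comparison (for example chromosome 21 against chromosome 22) the matrix is not symmetric, so `symmetric` is left at `False` and every pair is examined.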
We have shown that sequence signatures allow the rapid detection of degenerate repeated sequences (perfect repeats are detected as well). The method allows long sequences to be analyzed quickly and can detect large duplicated regions.
[1] Friedman R. and Hughes A.L., Pattern and timing of gene duplication in animal genomes. Genome Res. 11 (11): 1842-7 (2001).
[2] Semple C. and Wolfe K.H., Gene duplication and gene conversion in the Caenorhabditis elegans genome. J. Mol. Evol. 48: 555-564 (1999).
[3] Friedman R. and Hughes A.L., Gene duplication and the structure of eukaryotic genomes. Genome Res. 11: 373-38 (2001).
[4] Deschavanne P., Giron A., Vilain J., Dufraigne C. and Fertil B., Genomic signature is preserved in short DNA fragments. BIBE 2000 IEEE, Washington USA, 8-10 November 2000, p 161-167.
[5] Deschavanne P., Giron A., Vilain J., Fagot G. and Fertil B., Genomic signature: characterization and classification of species assessed by Chaos Game Representation of sequences. Mol. Biol. Evol. 16 (10): 1391-9 (1999).
[6] Dufraigne C., Fertil B., Giron A. and Deschavanne P., Utilisation de la signature génomique pour la recherche de transferts horizontaux. JOBIM 2001, Toulouse, 30 May-1 June 2001, p 161-167.