ECCB 2002 Poster

Title: Evaluation of methods for the searching RNA motifs task
P163
Thebault, Patricia; Allouche, David; Gaspin, Christine

pat@tlse.toulouse.inra.fr, allouche@tlse.toulouse.inra.fr, gaspin@tlse.toulouse.inra.fr
INRA/UBIA (Institut National de la Recherche Agronomique)

In the post-genome era, a major challenge is to exploit the large amount of information available [6] by locating families of molecules for the annotation of coding and non-coding genes.

Locating non-coding functional RNAs is a particularly interesting problem, as detection via similarity search methods often fails. Indeed, many functional RNA genes conserve a common base-paired secondary structure better than a consensus primary sequence, which makes it possible to describe them as structured motifs. Motifs combine secondary structure elements and sequence elements, with constraints applied to them (content information, distance, correlation, allowed mismatches...). Searching for such motifs requires effective methods that use both primary and secondary structure information, and two major classes of computational methods exist to locate them.

Specific software tools are dedicated to a particular family of molecules (tRNAs, snoRNA genes...). They implement both the specific description and the algorithm to search for these molecules [8, 5, 9].

On the other hand, generalized computational methods, where formalizing the motif is left to the user, have the advantage of being applicable to any motif, as long as the language provided is powerful enough to describe the molecule of interest [2, 7, 1, 4].
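To make the idea of a user-formalized motif concrete, the following is a minimal Python sketch of a descriptor-driven search for a single stem-loop. It is not the algorithm or the descriptor language of any of the cited tools: the function name `find_hairpins`, its parameters, and the choice to allow G-U wobble pairs are all assumptions made for illustration.

```python
# Hypothetical illustration only: names and parameters are not taken from
# Palingol, RNAMotif, RNAMOT or ERPIN.

# Base pairs accepted in a stem (Watson-Crick plus G-U wobble, an assumption).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def find_hairpins(seq, stem_len=4, loop_min=3, loop_max=8,
                  loop_pattern=None, max_mispairs=0):
    """Return (start, loop_len) for every stem-loop matching the descriptor.

    The descriptor combines a structural element (a stem of fixed length),
    a distance constraint (loop size range), an optional sequence constraint
    (required loop prefix) and a tolerance (allowed mispairs in the stem).
    """
    seq = seq.upper().replace("T", "U")  # accept DNA or RNA input
    n = len(seq)
    hits = []
    for start in range(n):
        for loop_len in range(loop_min, loop_max + 1):
            end = start + 2 * stem_len + loop_len
            if end > n:
                break  # longer loops cannot fit either
            five = seq[start:start + stem_len]
            loop = seq[start + stem_len:start + stem_len + loop_len]
            three = seq[end - stem_len:end]
            # Pair the 5' strand with the 3' strand read in reverse.
            mispairs = sum((a, b) not in PAIRS
                           for a, b in zip(five, reversed(three)))
            if mispairs > max_mispairs:
                continue
            if loop_pattern is not None and not loop.startswith(loop_pattern):
                continue
            hits.append((start, loop_len))
    return hits
```

For example, `find_hairpins("GGGCUUCGGCCC", loop_pattern="UUCG")` reports one stem-loop starting at position 0 with a 4-nt loop. Raising `max_mispairs` widens the descriptor and returns more candidates, which is the kind of trade-off between sensitivity and selectivity that a descriptor language exposes to the user.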

To better understand the limits of the software in both classes, we have compared them. We present here the results of an empirical study focusing on two types of structured molecules:

- Transfer RNAs, which all share the well-known "clover-leaf" structure. Two genome sequences (Escherichia coli K-12 and Saccharomyces cerevisiae), the related collections of tRNAs and Sprinzl's tRNA database (http://www.uni-bayreuth.de/departments/biochemie/trna/) were used for our tests.

- C/D box snoRNAs, which are involved in ribose methylation and contain one or two long (10-21 bp) stretches of exact complementarity to target RNAs [3]. The genome of Pyrococcus abyssi and the collection of related snoRNAs were used.
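The complementarity constraint mentioned for C/D box snoRNAs can be sketched as a search for the longest exact antiparallel duplex between a candidate guide and a target RNA. This is an illustrative sketch, not the method of any cited program: the function names are hypothetical, and "exact complementarity" is assumed here to mean Watson-Crick pairs only.

```python
# Hypothetical illustration only; Watson-Crick pairing is an assumption.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA (or DNA) string, as RNA."""
    return rna.upper().replace("T", "U").translate(COMPLEMENT)[::-1]


def longest_complementary_stretch(guide, target):
    """Return (length, guide_start, target_start) of the longest exact duplex.

    A stretch of the guide pairs with the target exactly when the reverse
    complement of the guide shares a common substring with the target, so
    the problem reduces to longest common substring (dynamic programming).
    """
    g = reverse_complement(guide)
    t = target.upper().replace("T", "U")
    best = (0, -1, -1)
    prev = [0] * (len(t) + 1)
    for i in range(1, len(g) + 1):
        cur = [0] * (len(t) + 1)
        for j in range(1, len(t) + 1):
            if g[i - 1] == t[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best[0]:
                    length = cur[j]
                    # Map the end position in the reverse complement
                    # back to a start position on the original guide.
                    best = (length, len(g) - i, j - length)
        prev = cur
    return best
```

A candidate would then be retained when the reported length falls within the 10-21 bp window quoted above; for instance, `longest_complementary_stretch("UUAGCUGAAUCCGUUU", "CCACGGAUUCAGCUCC")` finds a 12-bp duplex starting at position 2 on both sequences.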

We have evaluated the different methods in terms of sensitivity, selectivity and CPU time. Our results help to better characterize the tools by their expressive power (ability to describe a structured motif with more or less sensitivity) and by the performance of the implemented algorithms (exhaustiveness and computation time). Rules derived from this study will help direct users to appropriate tools depending on the RNA motif problem.
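As a hedged illustration of the first two evaluation criteria, the sketch below scores a set of predicted locations against a set of known genes. The abstract does not define its measures, so the definitions used here are assumptions: sensitivity is taken as TP / (TP + FN) and selectivity as TP / (TP + FP), with a known gene counted as found when any prediction overlaps it. The function names and the interval representation are hypothetical.

```python
# Hypothetical illustration only; the definitions of sensitivity and
# selectivity below are assumptions, not taken from the study.

def overlaps(a, b):
    """True if closed intervals a = (start, end) and b = (start, end) overlap."""
    return a[0] <= b[1] and b[0] <= a[1]


def evaluate(predicted, known):
    """Return (sensitivity, selectivity) for predicted vs. known intervals."""
    # True positives: known genes hit by at least one prediction.
    tp = sum(any(overlaps(p, k) for p in predicted) for k in known)
    # False negatives: known genes that no prediction overlaps.
    fn = len(known) - tp
    # False positives: predictions that overlap no known gene.
    fp = sum(not any(overlaps(p, k) for k in known) for p in predicted)
    sensitivity = tp / (tp + fn) if known else 0.0
    selectivity = tp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, selectivity
```

With three known genes and four predictions of which two overlap a known gene, this yields a sensitivity of 2/3 and a selectivity of 1/2. CPU time could be measured separately, for example by wrapping each tool invocation with `time.process_time()`.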
[1] Laferriere A, Gautheret D, and Cedergren R. An RNA pattern matching program with enhanced performance and portability. Comput Appl Biosci, 10(2):211-2, Apr 1994.
[2] Billoud B, Kontic M, and Viari A. Palingol: a declarative programming language to describe nucleic acids' secondary structures and to scan sequence database. Nucleic Acids Res, 24(8):1395-403, Apr 1996.
[3] Gaspin C, Cavaille J, Erauso G, and Bachellerie JP. Archaeal homologs of eukaryotic methylation guide small nucleolar RNAs: lessons from the Pyrococcus genomes. J Mol Biol, 297(4):895-906, Apr 2000.
[4] Gautheret D and Lambert A. Direct RNA motif definition and identification from multiple sequence alignments using secondary structure profiles. J Mol Biol, 313(5):1003-11, Nov 2001.
[5] El Mabrouk N and Lisacek F. Very fast identification of RNA motifs in genomic DNA. Application to tRNA search in the yeast genome. J Mol Biol, 264(1):46-55, Nov 1996.
[6] Salanoubat M, Genin S, Artiguenave F, Gouzy J, Mangenot S, Arlat M, Billault A, Brottier P, Camus JC, Cattolico L, Chandler M, Choisne N, Claudel-Renard C, Cunnac S, Demange N, Gaspin C, Lavie M, Moisan A, Robert C, Saurin W, Schiex T, Siguier P, Thebault P, Whalen M, Wincker P, Levy M, Weissenbach J, and Boucher CA. Genome sequence of the plant pathogen Ralstonia solanacearum. Nature, 415(6871):497-502, Jan 2002.
[7] Macke TJ, Ecker DJ, Gutell RR, Gautheret D, Case DA, and Sampath R. RNAMotif, an RNA secondary structure definition and search algorithm. Nucleic Acids Res, 29(22):4724-35, Nov 2001.
[8] Lowe TM and Eddy SR. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res, 25(5):955-64, Mar 1997.
[9] Lowe TM and Eddy SR. A computational screen for methylation guide snoRNAs in yeast. Science, 283(5405):1168-71, Feb 1999.