ECCB 2002 Poster

Title: Development of Annotation Tools in the W3H-Task System
P26
del Val, Coral; Arunachalam, Vinayagam; Glatting, Karl-Heinz; Suhai, Sándor

c.delval@dkfz.de
Deutsches Krebsforschungszentrum (DKFZ), Abt. Mol. Biophysik,

Research within the various genome projects is currently shifting from sequencing towards developing and improving approaches for genome annotation. The tools and resources for annotation are developing rapidly, and the scientific community is becoming increasingly reliant on this information for all aspects of biological research. The aim of a high-quality annotation is to identify the key genome features, in particular the genes and their products, in order to bridge the considerable gap between large-scale data collection and its interpretation. In this context we have developed several analysis tools for the annotation of genomes. These include cDNA mapping, semi-automatic analysis of EST sequences to support the search for functional annotations of novel transcript sequences (ESTSannotator), protein domain function assignment for uncharacterised protein sequences using the InterPro database (DomainSweep), protein secondary structure prediction (2Dsweep), and oligo design (PrimerSweep).
One of these tools is cDNA2Genome, an annotation tool that maps cDNAs to the human genome. cDNA2Genome reports the chromosomal and contig location of the cDNA. It extracts the genomic sequence where the cDNA is located and predicts the exons and introns in this region on both strands using tools such as GenScan, HMMgene, GeneID, GeneWise and Sim4. Additionally, the genomic sequence is used as input for different homology searches with the aim of gathering more evidence for the annotation process. These include comparisons against the mouse genomic database, against EST databases (our EST database contains all the unified expressed sequences in EMBL and GenBank, and their updates), against mRNA databases, UniGene, SWISS-PROT and cDNA databases.
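The region-extraction step described above can be illustrated with a minimal Python sketch. It is not the cDNA2Genome implementation: the function names, the 1-based inclusive coordinate convention and the optional flank size are assumptions made for illustration only. It shows how a mapped interval could be cut out of a contig and presented on both strands to downstream prediction programs.

```python
# Illustrative sketch only; helper names and coordinate conventions are
# assumptions and are not taken from cDNA2Genome or HUSAR.

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_region(contig: str, start: int, end: int, flank: int = 0) -> str:
    """Cut out the interval [start, end] (1-based, inclusive) from a contig.

    `flank` extra bases are added on each side and clipped to the contig
    boundaries, so that exons near the ends of the mapped cDNA are not lost.
    """
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    lo = max(start - 1 - flank, 0)
    hi = min(end + flank, len(contig))
    return contig[lo:hi]


def both_strands(region: str) -> dict:
    """Return the region as seen from the forward and the reverse strand."""
    return {"+": region, "-": reverse_complement(region)}


if __name__ == "__main__":
    contig = "TTTTACGTACGGAAAA"          # toy contig, hypothetical data
    region = extract_region(contig, 5, 11, flank=2)
    for strand, seq in both_strands(region).items():
        print(strand, seq)
```

Each strand returned by `both_strands` would then be passed to the gene prediction programs and the homology searches.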
The results of cDNA2Genome are summarized in a graphical output that displays the results of each homology search and each prediction program in a comprehensible way. The researcher has immediate access to the complete application outputs and database entries. Additionally, the relevant results are presented in a descriptive text summary. New releases of the annotation tasks in our system are foreseen to include the prediction of gene function and structural roles using the Gene Ontology database.
cDNA2Genome has been implemented under the W3H-Task-System (W3H-Task-System, Bioinformatics 2002). This framework allows the integration of applications and methods to create tailor-made analysis task flows, which can be used in high-throughput analysis. The meta-data approach of the W3H-Task-System allows the immediate integration of cDNA2Genome into the W2H web interface (Senger et al., 1998), which is the graphical WWW interface to HUSAR (Heidelberg Unix Sequence Analysis Resources) (http://www.w2h.dkfz-heidelberg.de). Availability: cDNA2Genome is available in the HUSAR system at the German EMBNET node.
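The idea of a task flow described by meta-data can be sketched in a few lines of Python. This is a hypothetical illustration, not the W3H-Task-System description format: the step names, the dictionary layout and the `run_flow` helper are all assumptions. Each step declares which steps it depends on, and a topological sort (here from the standard-library `graphlib` module, Python 3.9 or later) derives a valid execution order.

```python
# Hypothetical sketch of a declarative task flow; the step names and the
# dictionary layout are assumptions, not the W3H-Task-System format.
from graphlib import TopologicalSorter

# Each step maps to the list of steps whose results it needs.
TASK_FLOW = {
    "map_cdna": [],
    "extract_region": ["map_cdna"],
    "gene_prediction": ["extract_region"],
    "homology_search": ["extract_region"],
    "summary": ["gene_prediction", "homology_search"],
}


def execution_order(flow):
    """Return the steps in an order that respects all dependencies."""
    return list(TopologicalSorter(flow).static_order())


def run_flow(flow, actions):
    """Run every step, passing each one the results of its dependencies.

    `actions` maps a step name to a callable that takes a dict of
    dependency results and returns the result of that step.
    """
    results = {}
    for step in execution_order(flow):
        inputs = {dep: results[dep] for dep in flow[step]}
        results[step] = actions[step](inputs)
    return results


if __name__ == "__main__":
    # Placeholder actions that only record which inputs they received.
    actions = {
        name: (lambda inputs, name=name: f"{name}({', '.join(sorted(inputs))})")
        for name in TASK_FLOW
    }
    for step, result in run_flow(TASK_FLOW, actions).items():
        print(step, "->", result)
```

Because the flow is plain data, a new analysis step could be added by editing the description rather than the code that runs it, which is the kind of flexibility the meta-data approach is meant to provide.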
Contact: genome@dkfz.de; phone 06221/42-2349.
Home page: http://genome-dkfz-heidelberg.de
[1] Ernst, P., Glatting, K., Suhai, S. A task framework for the web interface W2H. Bioinformatics (in press).
[2] del Val, C., Bräuning, R., Glatting, K., Suhai, S. (2002) PATH: a task for the inference of phylogenies. Bioinformatics, 18, 646-647.
[3] Senger, M., Flores, T., Glatting, K., Ernst, P., Hotz-Wagenblatt, A., Suhai, S. (1998) W2H: WWW interface to the GCG sequence analysis package. Bioinformatics, 14, 452-7.