ECCB 2002 Poster

Title: Compact Gene Index: Clustering public protein coding sequences for microarray specific oligonucleotide design
P17
Brett, David; Weber, Jaqueline; Hoerster, Andrea; Bell, Robert; Drescher, Bernd

dbrett@mwgdna.com
MWG-Biotech AG, Ebersberg, Germany

The public sequence data representing a particular gene transcript are heterogeneous. There are normally multiple entries for the same mRNA, submitted from numerous sources, many of which represent partial or duplicate sequences. Within this collection of sequences there are polymorphic sites (SNPs), alternative splice forms and possibly sequencing errors. A number of projects have been initiated to produce reference consensus sequences for gene data, such as UniGene and RefSeq from the NCBI and Ensembl from the Sanger Centre. Here we present CodeSeq, a database of public sequence data that clusters the many differing sequences to produce a consensus alignment. The user can visualize all the sequence information in CodeSeq in a non-redundant view called the Compact Gene Index (CGX). The CGX can locate areas of discrepancy, polymorphic sites and possible alternative splice forms. The clustering is centered on the peptide coding sequence, and for each sequence within the cluster an appropriate start and stop site is located. The information in the CGX can be used directly by oligonucleotide design programs to avoid areas of conflict or to target specific SNPs or splice forms. The CGX can be used to design oligonucleotides for microarrays, real-time PCR primers, or siRNAs. In all three cases, locating regions of sequence agreement and polymorphic sites is critical to successful design. Oligonucleotide microarray design programs that use both Smith-Waterman and BLAST similarity searching algorithms can identify suboptimal oligonucleotide sequences containing cross-hybridizing domains, which could affect signal-to-noise ratios in microarray experiments. The CGX presents this information as a web-based table for each designed oligo.
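
The idea of locating regions of agreement and discrepancy within a cluster can be illustrated with a minimal sketch. It is not the CodeSeq or CGX implementation, which the abstract does not describe; it assumes the clustered sequences have already been aligned to equal length, with `-` marking positions that a partial entry does not cover. The function names `column_summary` and `agreement_windows` are hypothetical.

```python
from collections import Counter


def column_summary(aligned):
    """Return (consensus, conflict_positions) for pre-aligned sequences.

    Assumption: all sequences have the same length and '-' marks
    positions not covered by a partial entry.
    """
    if not aligned:
        raise ValueError("need at least one sequence")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to equal length")

    consensus = []
    conflicts = []
    for i in range(length):
        # Ignore uncovered positions so partial entries do not count as conflicts.
        bases = [seq[i].upper() for seq in aligned if seq[i] != "-"]
        if not bases:
            consensus.append("N")  # no entry covers this column
            continue
        counts = Counter(bases)
        consensus.append(counts.most_common(1)[0][0])
        if len(counts) > 1:
            # Disagreement: a candidate SNP, splice boundary or sequencing error.
            conflicts.append(i)
    return "".join(consensus), conflicts


def agreement_windows(length, conflicts, oligo_len):
    """Start positions of windows of oligo_len that contain no conflict column."""
    bad = set(conflicts)
    return [
        start
        for start in range(length - oligo_len + 1)
        if not any(pos in bad for pos in range(start, start + oligo_len))
    ]


# Toy cluster: two full-length entries and one partial entry.
cluster = ["ATGGCCAAGT", "ATGGCTAAGT", "--GGCCAAG-"]
consensus, conflicts = column_summary(cluster)
candidate_starts = agreement_windows(len(consensus), conflicts, oligo_len=4)
```

In this toy cluster the only disagreement is at index 5 (C versus T), so a design program that wants to avoid the site would restrict 4-mers to the conflict-free windows, while one targeting the SNP would instead choose windows that span it.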