ECCB 2002 Poster

Title: The Helmholtz Network for Bioinformatics - A User-friendly Integrated Bioinformatics Web Portal
P59
HNB Consortium

torsten.crass@eBiology.de
Research Group Bioinformatics, German Biotechnology Research Center (GBF), Mascheroder Weg 1a, 38124 Braunschweig, Germany

1. Introduction
The Helmholtz Network for Bioinformatics (HNB) is a joint venture of the Helmholtz Community of Research Centres, in particular four of its centres (GBF, Braunschweig; DKFZ, Heidelberg; GSF, Neuherberg/Munich; MDC, Berlin-Buch), and other German research institutes (Fraunhofer Institute for Algorithms and Scientific Computing, St. Augustin; Institute of Biochemistry, University of Cologne; MPI for Informatics, Saarbrücken; MPI for Molecular Genetics, Berlin; RZPD, Berlin). It aims to provide easy access to numerous high-quality bioinformatics resources on a single web site. To this end, a "Guided Solution Finder" has been developed: a web interface organised by problem rather than by tool, which simplifies the selection of the most appropriate tool(s) for the user's requirements. Currently, about 80 resources are available through the HNB portal.
Since complex biological problems often require the execution of more than just a single tool, mechanisms have been developed that allow for the integration of programs and databases into automated cascades of distributed programs (task-oriented approach). Additionally, each task-run's input and output data are registered in a central "virtual user space", allowing users to easily re-use their own data.
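The idea of a "virtual user space" can be illustrated with a minimal Python sketch. It is not the HNB implementation: the class UserSpace, its methods, and the two toy tasks are hypothetical names introduced here, and the registry is assumed to live in memory only. The sketch shows how recording each task run's input and output lets a later task re-use an earlier result.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List


class UserSpace:
    """Toy in-memory registry of task-run inputs and outputs, per user (hypothetical)."""

    def __init__(self) -> None:
        self._runs: Dict[str, List[Dict[str, Any]]] = defaultdict(list)

    def run(self, user: str, task_name: str,
            task: Callable[[Any], Any], data: Any) -> int:
        """Execute `task` on `data`, record both, and return the run's index."""
        output = task(data)
        self._runs[user].append(
            {"task": task_name, "input": data, "output": output}
        )
        return len(self._runs[user]) - 1

    def output_of(self, user: str, run_id: int) -> Any:
        """Look up the stored output of an earlier run so it can be re-used."""
        return self._runs[user][run_id]["output"]


# Two toy tasks standing in for real bioinformatics tools (assumptions).
def reverse_complement(seq: str) -> str:
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def gc_content(seq: str) -> float:
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    space = UserSpace()
    first = space.run("alice", "reverse_complement", reverse_complement, "ATGC")
    # The second task takes the stored output of the first run as its input.
    second = space.run("alice", "gc_content", gc_content,
                       space.output_of("alice", first))
    print(space.output_of("alice", second))
```

A real portal would presumably persist these records and tie them to registered or anonymous sessions; the sketch leaves that out.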
User certificates are provided through online registration for those HNB resources that are subject to certain restrictions of use (e.g. "academic only"); however, anonymous access is possible for most resources.

2. Available Resources
Apart from standard bioinformatics applications, like the popular HUSAR package (Senger et al., 1998) and SRS (Etzold et al., 1996), the HNB encompasses the following specific resources:

2.1 Genome analysis
The main focus of this HNB subsection is on prediction and analysis of regulatory regions in eukaryotic genomic sequences. For this purpose, tools developed at GBF and GSF, partly in cooperation with commercial partners, have been integrated into automated genomic annotation cascades.
The TF Scan task, for example, simultaneously submits a DNA sequence to the transcription factor (TF) binding site prediction programs PatSearch (Wingender et al., 1997) and MatInspector (Quandt et al., 1995) and then combines their output into a single result page, where all TF binding site hits are directly linked to the TRANSFAC database (Wingender et al., 2001). The more complex RegRegion Analysis task first uses PromoterInspector (Scherf et al., 2000) to search a DNA sequence for putative promoter regions and then calls TF Scan for each identified promoter candidate. Future work will encompass the integration of additional genomic annotation software, culminating in an automated gene expression prediction task.
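The structure of such a cascade can be sketched in Python. This is an illustration under stated assumptions, not the HNB code: the predictors below are toy exact-motif matchers and do not reproduce the interfaces or algorithms of PatSearch, MatInspector or PromoterInspector, and all names (Hit, motif_predictor, tf_scan, reg_region_analysis) are hypothetical. The sketch shows two things: merging the hits of several predictors run on the same sequence, and running that merged scan once per candidate region.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Hit:
    tool: str    # name of the predictor that reported the hit
    start: int   # 0-based start position in the submitted sequence
    end: int     # end position (exclusive)
    label: str   # e.g. a binding-site name


# A predictor is any callable that maps a DNA sequence to a list of hits.
Predictor = Callable[[str], List[Hit]]
RegionFinder = Callable[[str], List[Tuple[int, int]]]


def motif_predictor(tool: str, motif: str, label: str) -> Predictor:
    """Build a toy predictor that reports every exact occurrence of `motif`."""
    def predict(seq: str) -> List[Hit]:
        hits = []
        pos = seq.find(motif)
        while pos != -1:
            hits.append(Hit(tool, pos, pos + len(motif), label))
            pos = seq.find(motif, pos + 1)
        return hits
    return predict


def tf_scan(seq: str, predictors: List[Predictor]) -> List[Hit]:
    """Run all predictors on the same sequence and merge hits by position."""
    merged = [hit for predict in predictors for hit in predict(seq)]
    return sorted(merged, key=lambda h: (h.start, h.end, h.tool))


def reg_region_analysis(
    seq: str, find_regions: RegionFinder, predictors: List[Predictor]
) -> List[Tuple[Tuple[int, int], List[Hit]]]:
    """Find candidate regions first, then run tf_scan on each of them."""
    results = []
    for start, end in find_regions(seq):
        region_hits = tf_scan(seq[start:end], predictors)
        # Shift hit coordinates back into the frame of the full sequence.
        shifted = [
            Hit(h.tool, h.start + start, h.end + start, h.label)
            for h in region_hits
        ]
        results.append(((start, end), shifted))
    return results


if __name__ == "__main__":
    sequence = "GGGTATAAAGGGCCAATCC"
    tools = [
        motif_predictor("tool_a", "TATAAA", "TATA-box"),
        motif_predictor("tool_b", "CCAAT", "CAAT-box"),
    ]
    # Toy region finder: treat the whole sequence as one candidate region.
    for region, hits in reg_region_analysis(
        sequence, lambda s: [(0, len(s))], tools
    ):
        print(region, hits)
```

Treating each tool as a plain callable keeps the cascade independent of where the tool actually runs; in a distributed setting the callable could wrap a remote request instead of a local function.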

2.2 Protein sequence analysis
The focus of the HNB Protein Analysis subsection is on the prediction of protein features, taking into account the close relationship between protein structure prediction, protein family analysis and protein function prediction. Protein family analysis is performed by searching against the SYSTERS cluster set (Krause et al., 2002); protein function prediction focuses on domain structure prediction using the SMART database (Letunic et al., 2002); and protein structure prediction is performed using a threading algorithm (Alexandrov et al., 1996; Zien et al., 2000).
A combined summary of the results of all three tools mentioned above can be obtained by submitting a query sequence to a general Protein Analysis task, which refines the overall analysis by exchanging intermediate results between the otherwise stand-alone tools. The links provided on the summary output lead to the detailed results of each tool.

3. Conclusion
The HNB web portal greatly simplifies the handling of numerous valuable bioinformatics resources for both novices and experienced users. It does so by offering a problem-oriented user interface as well as automated tool cascades serving common analysis tasks.

References
[1] Alexandrov,N., Nussinov,R. and Zimmer,R. (1996) Fast protein fold recognition via sequence to structure alignment and contact capacity potentials. In Hunter,L. and Klein,T.E. (eds.), Pacific Symposium on Biocomputing '96, World Scientific Publishing Co. Pte. Ltd., Singapore, pp. 53-69.
[2] Etzold,T., Ulyanov,A. and Argos,P. (1996) SRS: Information retrieval system for molecular biology data banks. Methods Enzymol., 266, 114-128.
[3] Letunic,I., Goodstadt,L., Dickens,N.J., Doerks,T., Schultz,J., Mott,R., Ciccarelli,F., Copley,R.R., Ponting,C.P. and Bork,P. (2002) Recent improvements to the SMART domain-based sequence annotation resource. Nucleic Acids Res., 30, 242-244.
[4] Quandt,K., Frech,K., Karas,H., Wingender,E. and Werner,T. (1995) MatInd and MatInspector - New fast and versatile tools for detection of consensus matches in nucleotide sequence data. Nucleic Acids Res., 23, 4878-4884.
[5] Scherf,M., Klingenhoff,A. and Werner,T. (2000) Highly specific localization of promoter regions in large genomic sequences by PromoterInspector: A novel context-sensitive approach. J. Mol. Biol., 297, 599-606.
[6] Senger,M., Flores,T., Glatting,K., Ernst,P., Hotz-Wagenblatt,A. and Suhai,S. (1998) W2H: WWW interface to the GCG sequence analysis package. Bioinformatics, 14, 452-457.
[7] Sommer,I., Zien,A., von Öhsen,N., Zimmer,R. and Lengauer,T. (2002) Confidence Measures for Protein Fold Recognition. Bioinformatics, 18, 802-812.
[8] Wingender,E., Chen,X., Fricke,E., Geffers,R., Hehl,R., Liebich,I., Krull,M., Matys,V., Michael,H., Ohnhauser,R., Pruss,M., Schacherer,F., Thiele,S. and Urbach,S. (2001) The TRANSFAC system on gene expression regulation. Nucleic Acids Res., 29, 281-283.
[9] Wingender,E., Karas,H. and Knüppel,R. (1997) TRANSFAC database as a bridge between sequence data libraries and biological function. In Altman,R.B., Dunker,A.K., Hunter,L. and Klein,T.E. (eds.), Pacific Symposium on Biocomputing '97, World Scientific Publishing Co. Pte. Ltd., Singapore, pp. 477-485.
[10] Zien,A., Zimmer,R. and Lengauer,T. (2000) A simple iterative approach to parameter optimization. J. Comput. Biol., 7, 483-501.