Gailus-Durner, Valérie (1);Gößling, Frank (2);Crass, Torsten (2,*) - Guided Gene Regulation Data Analysis Within the Helmholtz Network for Bioinformatics

ECCB 2002 Poster

Title: Guided Gene Regulation Data Analysis Within the Helmholtz Network for Bioinformatics	P43
Gailus-Durner, Valérie (1); Gößling, Frank (2); Crass, Torsten (2,*) torsten.crass@eBiology.de; (1) Institute of Experimental Genetics, GSF Forschungszentrum für Umwelt und Gesundheit, Neuherberg; (2) Research Group Bioinformatics (AG BIN), Gesellschaft für Biotechnologische Forschung mbH (GBF), Braunschweig; (*) to whom correspondence should be addressed

Many of the currently available web interfaces for bioinformatics programs and databases suffer from severe shortcomings in user interaction, which prevent them from being useful to the majority of bioscientists. We have identified three main shortcomings of typical bioinformatics web interfaces: 1) low accessibility, because resources are scattered over numerous web sites; 2) low usability, due to unintuitive user interfaces; 3) low interoperability, requiring manual data reformatting whenever more than one tool is applied. These drawbacks are rooted in the common tool-centric view of user interaction, which leads to web forms that merely reproduce a program's command-line interface using HTML input elements. The Helmholtz Network for Bioinformatics (HNB), a joint venture of leading German bioinformatics research groups, aims to overcome these problems by focusing on problem- and task-oriented user interface design.
To help the user identify suitable resources, a so-called "Guided Solution Finder" (GSF) is provided: an explorer-like decision tree with simple questions as inner nodes, whose leaves link to the resource that HNB scientists consider most suitable for solving the problem characterised by the user's path through the tree. The user therefore does not need to know in advance which tool to use, but can concentrate entirely on describing the problem to be solved.
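The decision-tree idea behind the GSF can be illustrated with a minimal Python sketch. This is not the HNB implementation: the class names, the questions and the resource labels below are all hypothetical, since the abstract does not list the actual tree.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class Leaf:
    """A leaf names the resource recommended for the problem described so far."""
    resource: str


@dataclass
class Question:
    """An inner node: a simple question plus one subtree per possible answer."""
    text: str
    answers: Dict[str, Union["Question", Leaf]] = field(default_factory=dict)


def find_resource(node, path):
    """Follow the user's answers from the root and return the linked resource."""
    for answer in path:
        if isinstance(node, Leaf):
            raise ValueError("more answers given than questions asked")
        if answer not in node.answers:
            raise KeyError(f"unknown answer {answer!r} to {node.text!r}")
        node = node.answers[answer]
    if not isinstance(node, Leaf):
        raise ValueError(f"path ends at an open question: {node.text!r}")
    return node.resource


# Hypothetical tree; the real GSF questions are not given in the abstract.
TREE = Question(
    "What kind of data do you have?",
    {
        "DNA sequence": Question(
            "What do you want to find?",
            {
                "promoter regions": Leaf("promoter detection task"),
                "transcription factor binding sites": Leaf("binding site analysis task"),
            },
        ),
        "protein sequence": Leaf("protein analysis tasks"),
    },
)
```

Calling `find_resource(TREE, ["DNA sequence", "promoter regions"])` walks two questions and returns the label stored at the leaf, so the caller never has to name a tool directly.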
Since complex problems frequently require the interplay of more than one tool, HNB developers have combined stand-alone resources into automated tool cascades, distributed over several HNB servers if necessary. Such cascades have been implemented by the HNB's Gene Regulation and Protein Analysis subgroups; here we focus on the gene regulation part. Implementation was done in object-oriented Perl, which allows new tools to be integrated easily by subclassing a generic Task class. Using this framework, prototype tasks for the detection and analysis of regulatory DNA regions have been implemented, utilising PromoterInspector [2], PatSearch, MatInspector [1], the TRANSFAC database [3, 4] and the BioRS system as underlying resources. To spare the user from having to deal with program parameters, all tools have been preconfigured with suitable parameter values.
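The subclassing pattern described above might look roughly as follows. The original framework was written in object-oriented Perl; this Python sketch only mirrors the idea. Apart from the name `Task`, every class, parameter and the toy string-matching logic is an assumption made for illustration and does not reproduce any of the real tools.

```python
from abc import ABC, abstractmethod


class Task(ABC):
    """Generic task: subclasses wrap one tool or chain several subtasks."""

    # Preconfigured parameter values (hypothetical), so users need not set them.
    default_params = {}

    def __init__(self, **overrides):
        # Advanced users may override individual defaults.
        self.params = {**self.default_params, **overrides}

    @abstractmethod
    def run(self, data):
        """Consume input data and return result data."""


class FindPromoters(Task):
    """Toy stand-in for a promoter detection tool: exact motif search."""
    default_params = {"motif": "TATAAA"}

    def run(self, data):
        motif, seq = self.params["motif"], data["sequence"]
        hits = [i for i in range(len(seq) - len(motif) + 1)
                if seq.startswith(motif, i)]
        return {**data, "promoters": hits}


class ScanBindingSites(Task):
    """Toy stand-in for a binding-site scan upstream of each promoter hit."""
    default_params = {"site": "GGGCGG", "window": 20}

    def run(self, data):
        seq = data["sequence"]
        site, window = self.params["site"], self.params["window"]
        sites = []
        for start in data["promoters"]:
            pos = seq.find(site, max(0, start - window), start)
            if pos != -1:
                sites.append(pos)
        return {**data, "binding_sites": sites}


class Cascade(Task):
    """A task built from subtasks: each subtask's output feeds the next."""

    def __init__(self, *subtasks):
        super().__init__()
        self.subtasks = subtasks

    def run(self, data):
        for task in self.subtasks:
            data = task.run(data)
        return data
```

A cascade is then assembled and run like any single task, for example `Cascade(FindPromoters(), ScanBindingSites()).run({"sequence": "AAGGGCGGAAAATATAAAGG"})`. Adding a new tool means writing one more subclass with its own `default_params` and `run`.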
The standardised data model used for (SOAP-based) data transfer between gene regulation tasks and their subtasks is also based on a Perl class hierarchy. To avoid unnecessary network traffic, the result data of each tool call are stored locally as persistent Perl objects until they are requested by the caller.
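The "store locally, transfer only on request" idea can be sketched as below. This is an illustrative assumption about how such a scheme could work, not the HNB code: Python's `pickle` stands in for persistent Perl objects, the SOAP layer is not modelled, and the class name `ResultStore` and its methods are hypothetical.

```python
import pickle
import tempfile
import uuid
from pathlib import Path


class ResultStore:
    """Keeps each tool call's result on local disk and hands out a small handle."""

    def __init__(self, directory):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)

    def put(self, result):
        """Persist a result object locally and return its handle."""
        handle = uuid.uuid4().hex
        with open(self.directory / f"{handle}.pkl", "wb") as fh:
            pickle.dump(result, fh)
        return handle

    def fetch(self, handle):
        """Load the stored result; called only when the caller asks for it."""
        with open(self.directory / f"{handle}.pkl", "rb") as fh:
            return pickle.load(fh)


# The server that ran the tool keeps the data; only the handle travels.
store = ResultStore(tempfile.mkdtemp())
handle = store.put({"promoters": [12]})
```

Only the short handle needs to cross the network after a tool call; the full result is sent when the caller invokes something like `store.fetch(handle)`.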
The complex and distributed nature of both task runs and data storage, however, remains entirely transparent to the user, who interacts with the framework solely through a web interface called the "Virtual UserSpace" (VUS). In its "data view", the VUS lets the user survey all data entered as input or generated as task run output. Since one task's output may be re-used as another task's input, and any one data set can act as input for several task runs, the mapping between data and their corresponding tasks requires a second form of visualisation: in the "task view", the VUS displays a user's task runs, each unambiguously described by its input data, its output data and the parameter values used for calling the tools that make up the task.
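The many-to-many relation between data sets and task runs, which motivates the two views, can be modelled with a small sketch. All names here (`UserSpace`, `TaskRun`, the identifiers in the example) are hypothetical; the abstract does not describe the VUS data structures.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TaskRun:
    """One task run: described by its inputs, outputs and parameter values."""
    task: str
    inputs: List[str]
    outputs: List[str]
    params: Dict[str, object] = field(default_factory=dict)


class UserSpace:
    """Per-user registry of data sets and task runs."""

    def __init__(self):
        self.data = {}   # data id -> data type
        self.runs = []   # task runs in launch order

    def add_data(self, data_id, data_type):
        self.data[data_id] = data_type

    def record_run(self, run):
        self.runs.append(run)

    def data_view(self):
        """All data sets, whether entered by the user or produced by a run."""
        return sorted(self.data)

    def task_view(self):
        """All task runs with their inputs, outputs and parameters."""
        return list(self.runs)

    def runs_using(self, data_id):
        """Tasks that consumed a data set; one data set may feed several runs."""
        return [run.task for run in self.runs if data_id in run.inputs]

    def producer_of(self, data_id):
        """Task that generated a data set, or None if the user entered it."""
        for run in self.runs:
            if data_id in run.outputs:
                return run.task
        return None


vus = UserSpace()
vus.add_data("seq1", "dna sequence")
vus.add_data("prom1", "promoter regions")
vus.add_data("sites1", "binding sites")
vus.record_run(TaskRun("promoter detection", ["seq1"], ["prom1"]))
vus.record_run(TaskRun("binding site scan", ["seq1", "prom1"], ["sites1"]))
```

Here `seq1` is input to both runs while `prom1` is output of one run and input of the other, which is exactly the situation a flat data listing cannot express on its own.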
Typically, an HNB gene regulation session looks as follows. First, the user is guided by the GSF towards suitable HNB resources. After selecting a specific task, the user can enter his/her data or select any earlier data entry of matching data type for re-use. At this point, advanced users may wish to modify the parameter values passed to the tools used within the task, or choose a parameter set from a previous run for re-use. When the task is launched, VUS entries for the task run and its input data are generated; once the task has finished, its output is also registered in the VUS and the result data are displayed by a viewer script. If more than one viewer script is available for the current data type, the user may now choose a different visualisation. Additionally, the user can choose from a list of possible follow-up tasks for his/her data. Depending on the type of the previous task, some of the possible follow-ups may be marked as especially recommended, a feature that adds context-sensitive help to the interface. In general, the seamless integration of guided tool finding, data analysis and data visualisation achieved by the HNB allows users to explore the information content of biological data experimentally, and will be appreciated especially by inexperienced users.
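The follow-up step of such a session, where candidate tasks are chosen by data type and some are flagged as recommended depending on the previous task, could be sketched like this. The task names, data types and both lookup tables are invented for illustration only.

```python
# Hypothetical registry: which data type each task consumes.
TASK_INPUT_TYPE = {
    "binding site scan": "promoter regions",
    "promoter comparison": "promoter regions",
    "promoter detection": "dna sequence",
}

# Hypothetical recommendations, keyed by the task that just finished.
RECOMMENDED_AFTER = {
    "promoter detection": {"binding site scan"},
}


def follow_ups(previous_task, output_type):
    """Return (task, recommended) pairs for tasks accepting the output type."""
    recommended = RECOMMENDED_AFTER.get(previous_task, set())
    return [
        (task, task in recommended)
        for task, input_type in sorted(TASK_INPUT_TYPE.items())
        if input_type == output_type
    ]
```

After a promoter detection run, `follow_ups("promoter detection", "promoter regions")` lists both tasks that accept promoter regions and marks only the binding site scan as recommended.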

[1] Quandt,K., Frech,K., Karas,H., Wingender,E. and Werner,T. (1995) MatInd and MatInspector: new fast and versatile tools for detection of consensus matches in nucleotide sequence data. Nucleic Acids Res., 23, 4878-4884.
[2] Scherf,M., Klingenhoff,A. and Werner,T. (2000) Highly specific localization of promoter regions in large genomic sequences by PromoterInspector: A novel context-sensitive approach. J. Mol. Biol., 297, 599-606.
[3] Wingender,E., Chen,X., Fricke,E., Geffers,R., Hehl,R., Liebich,I., Krull,M., Matys,V., Michael,H., Ohnhauser,R., Pruss,M., Schacherer,F., Thiele,S. and Urbach,S. (2001) The TRANSFAC system on gene expression regulation. Nucleic Acids Res., 29, 281-283.
[4] Wingender,E., Karas,H. and Knüppel,R. (1997) TRANSFAC database as a bridge between sequence data libraries and biological function. In Altman,R.B., Dunker,A.K., Hunter,L. and Klein,T.E. (eds.), Pacific Symposium on Biocomputing '97, World Scientific Publishing Co. Pte. Ltd., Singapore, pp. 477-485.