Repsilber, Dirk;Kim, Jan T. - Inferring Coarse-Grained Models of Regulatory Gene Networks from Dynamic Expression Data

ECCB 2002 Poster

Title: Inferring Coarse-Grained Models of Regulatory Gene Networks from Dynamic Expression Data	P133
Repsilber, Dirk; Kim, Jan T. Dirk.Repsilber@ebc.uu.se, kim@inb.uni-luebeck.de Dept. of Molecular Evolution, University of Uppsala, Sweden, and Institute for Neuro- and Bioinformatics, University of Lübeck, Germany

Regulatory gene networks are the central mechanism that realizes phenotypic processes and traits on the basis of information stored in the genome. Inferring and understanding these networks is therefore a central focus of bioinformatics today. Microarray technology provides access to dynamic expression data. However, even with the unprecedented amounts of expression data now becoming available, regulatory network inference remains a difficult problem. It can be seen as a reverse engineering challenge: given a set of data describing inputs and corresponding outputs of a system (i.e. various conditions and the corresponding expression dynamics), determine the underlying system structure (i.e. the regulatory network).

However, regulatory networks defy direct, "brute-force" reconstruction from microarray data: the number of networks that can be constructed with a given number of genes and regulatory interactions grows exponentially with network size, so even the most massive amounts of data are insufficient to single out one network. Moreover, expression data obtained through microarray analysis are noisy and biased. Both issues can be addressed by considering coarse-grained network models which use only a few discrete activation states for a gene instead of a continuous level of activation. With such a coarse-grained model, the search space becomes computationally manageable (even though, in principle, it is still subject to exponential growth with system size). With a sufficiently coarse-grained discretisation, most data points may be mapped to the right discrete state despite substantial noise in the activation level measurements. On the other hand, important questions need to be addressed when discrete modelling approaches are used:

- Can the discrete network model represent all properties of the true network? What are the limitations?

- How does the method of data discretisation affect the results of discrete network modelling?
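
The second question can be illustrated with a minimal sketch. It assumes a two-state (off/on) discretisation; the function names, the thresholds and the example time series are hypothetical and are not taken from transsys or HypoNet. The sketch shows that a coarse discretisation can absorb moderate measurement noise, while two plausible discretisation methods can still assign different discrete states to the same data.

```python
from statistics import median

def discretise_fixed(levels, threshold):
    """Map each activation level to 0 (off) or 1 (on) using a fixed threshold."""
    return [1 if x >= threshold else 0 for x in levels]

def discretise_median(levels):
    """Map each level to 0 or 1 using the median of the series as threshold."""
    m = median(levels)
    return [1 if x >= m else 0 for x in levels]

# Hypothetical expression time series for one gene (arbitrary units).
series = [0.1, 0.3, 0.9, 1.2, 1.1, 0.4]

# Hypothetical measurement noise added to each time point.
noise = [0.05, -0.05, 0.05, -0.05, 0.05, -0.05]
noisy = [x + e for x, e in zip(series, noise)]

fixed = discretise_fixed(series, threshold=1.0)
fixed_noisy = discretise_fixed(noisy, threshold=1.0)
by_median = discretise_median(series)

print("fixed threshold:        ", fixed)
print("fixed threshold, noisy: ", fixed_noisy)
print("median threshold:       ", by_median)
```

In this example the noisy and the noise-free series yield the same discrete states under the fixed threshold, whereas the median-based method classifies the third time point differently, so the data passed on to reverse engineering depend on the discretisation method.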

In this contribution, a framework for systematically investigating these issues is introduced and initial results are presented. The regulatory network simulator transsys [1] is used for simulating the dynamics of the regulatory network to be reconstructed. The transsys software package has been extended by modules that simulate the process of microarray data generation. These simulated data are subjected to discretisation. Then, the HypoNet algorithm [2] for reverse engineering regulatory networks is applied to the discretised data. Finally, the results of reverse engineering are assessed by comparing them to the transsys program representing the true network. This framework, depicted in Fig. 1, is unique because it assays a heterologous reconstruction scenario, i.e. a case where the process generating the data operates with continuous activation levels and thus belongs to a different class of processes than the model used for reconstruction.
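
The final assessment step, comparing a reconstructed network with the true one, might be sketched as follows. The abstract does not specify how this comparison is carried out, so the representation of a network as a set of signed edges and the function `compare_topologies` are assumptions made purely for illustration.

```python
def compare_topologies(true_edges, inferred_edges):
    """Compare two networks given as collections of signed regulatory edges.

    Each edge is a tuple (regulator, target, sign), where sign is +1 for
    activation and -1 for repression. This encoding is an assumption made
    for this sketch, not the format used by transsys or HypoNet.
    """
    true_edges = set(true_edges)
    inferred_edges = set(inferred_edges)
    return {
        "correct": len(true_edges & inferred_edges),   # edges found correctly
        "missing": len(true_edges - inferred_edges),   # true edges not found
        "spurious": len(inferred_edges - true_edges),  # inferred but not true
        "identical": true_edges == inferred_edges,     # exact topology match
    }

# Hypothetical two-gene example: A activates B, B represses A.
true_net = {("A", "B", +1), ("B", "A", -1)}
# Hypothetical reconstruction that misses the repression and adds a self-loop.
inferred = {("A", "B", +1), ("B", "B", +1)}

result = compare_topologies(true_net, inferred)
print(result)
```

A comparison of this kind separates topological correctness from fitness: two reconstructions can reproduce the training data equally well and still differ in which edges they contain.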

The transsys programs representing the true network can be assembled based on empirical data, or alternatively, they can be generated automatically at random. As a criterion for generating transsys programs with realistic properties, the sum of absolute values of up- or down-regulation, called regulatory strength, is employed. The parameters for transsys network generation are chosen such that the regulatory strength distribution of the transsys program resembles the distribution observed in empirical data [3].
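
As a rough illustration of the regulatory strength criterion, the sketch below sums the absolute values of up- or down-regulation for each gene. It assumes that regulation is expressed as log2 expression ratios across conditions and that the sum is taken per gene; the abstract does not state either detail, and the data are hypothetical.

```python
def regulatory_strength(log_ratios):
    """Sum of absolute values of up- or down-regulation for one gene.

    Assumes the input is a list of log2 expression ratios, one per
    condition (an assumption of this sketch).
    """
    return sum(abs(r) for r in log_ratios)

# Hypothetical log2 ratios: one list of conditions per gene.
data = {
    "gene1": [1.0, -2.0, 0.5],     # strongly regulated
    "gene2": [0.0, 0.25, -0.25],   # weakly regulated
}

strengths = {gene: regulatory_strength(v) for gene, v in data.items()}
print(strengths)
```

The distribution of such values over all genes of a generated network could then be compared with the distribution computed from empirical data.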

Initial results of network reconstruction with HypoNet reveal various difficulties encountered in coarse-grained reverse engineering. Fig. 2 shows a transsys network with just two genes. 50 independent HypoNet runs were performed on discretised data generated with this transsys network. When HypoNet is run with data generated with a discrete network model, perfect phenotypic reconstruction (fitness 100%) is typically achieved, i.e. a network capable of reproducing the training data identically is found. Within the heterologous reconstruction scenario presented here, a maximal fitness value of 84% is reached. This is due to sensitive dependence on initial conditions: different transsys system states may map to the same discretised system state, but the discretised images of their subsequent states may diverge. Such processes cannot be captured by the discrete network model currently used by HypoNet and hence are an example of the critical aspects of coarse-grained reverse engineering that we seek to investigate with our approach.
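
Why such divergence caps the achievable fitness can be seen in a small sketch. It assumes that fitness is the fraction of observed state transitions a deterministic discrete network reproduces; this is a simplification for illustration and not HypoNet's actual fitness function, and the trajectory and threshold are hypothetical. A deterministic model can assign only one successor to each discrete state, so whenever the same discrete state is followed by different successors in the data, at most the most frequent one can be matched.

```python
from collections import Counter, defaultdict

def best_deterministic_fit(discrete_states):
    """Upper bound on the fraction of transitions that any deterministic
    discrete model can reproduce for a sequence of discretised states."""
    successors = defaultdict(Counter)
    for current, following in zip(discrete_states, discrete_states[1:]):
        successors[current][following] += 1
    total = len(discrete_states) - 1
    # For each state, only its most frequent successor can be reproduced.
    reproducible = sum(max(c.values()) for c in successors.values())
    return reproducible / total

# Hypothetical continuous activation levels of two genes over six time steps.
trajectory = [
    (0.2, 0.9), (0.4, 0.7), (0.6, 0.4),
    (0.8, 0.2), (0.4, 0.3), (0.3, 0.6),
]

# Two-state discretisation with a hypothetical threshold of 0.5.
discretised = [tuple(int(x >= 0.5) for x in state) for state in trajectory]

bound = best_deterministic_fit(discretised)
print(discretised)
print(bound)
```

Here the first two continuous states both map to the discrete state (0, 1), yet their successors map to (0, 1) and (1, 0) respectively, so no deterministic discrete model can reproduce both transitions and the bound stays below 100%.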

Figures 3a to 3d display results of independent HypoNet reconstruction runs. All four network reconstructions have the same fitness value of 84%. However, only one reconstructed network (Fig. 3a [r003]) exhibits the correct topology. This suggests that a sizeable subset of networks with fitness 84% is present within the search space of discrete networks. It thus appears that the impossibility of perfectly reproducing the discretised transsys dynamics within the space of discrete networks searched by HypoNet results in a serious loss of specificity in reconstruction. The framework presented here allows systematic investigation of this phenomenon, and such investigations are underway.

(For the figures see web representation of the poster abstract)

[1] J.T. Kim (2001) "transsys: A Generic Formalism for Modelling Regulatory Networks in Morphogenesis." In J. Kelemen and P. Sosik (eds): Advances in Artificial Life, 242-251. Springer Verlag, Berlin, Heidelberg.
[2] D. Repsilber, H. Liljenstroem and S.G.E. Andersson (2002) "Reverse Engineering of Regulatory Networks: Simulation Studies on a Genetic Algorithm Approach for Ranking Hypotheses." To appear in BioSystems.
[3] A.P. Gasch, P.T. Spellman, C.M. Kao, O. Carmel-Harel, M. Eisen, G. Storz, D. Botstein and P.O. Brown (2000) "Genomic Expression Programs in the Response of Yeast Cells to Environmental Changes." Molecular Biology of the Cell 11: 4241-4257.