ECCB 2002 Poster sorted by: Author | Number

Next | Previous poster (in order of the view you have selected)

Title: Discovery of protein complexes in the network of protein interactions
P155
Spirin, Victor; Zhao, Dacheng; Mirny, Leonid

spirin@mit.edu, zhao@mit.edu, leonid@mit.edu
Massachusetts Institute of Technology

Understanding the structure and dynamics of biological networks is one of the fundamental problems in systems biology. Recently several groups have studied experimentally derived network [1,2] of protein interactions revealing its scale-free architecture [3,4] and unusual two-body correlations [5]. In this study we focused on multi-body interactions in protein network and discovered clusters of proteins that have many interactions between themselves. Such clusters have simple biological interpretation - they correspond to protein complexes.

While performing most biochemical functions in a cell, proteins do not work alone. Instead they often become parts of more complicated molecular machines - protein complexes. Such complexes can be as small as a heterodimer of two proteins or as big and complicated as a ribosome that consists of dozens of different proteins. Cataloging and understanding protein complexes is an important step towards revealing the cellular regulatory network. The new challenge of proteomics research is to reveal components and structures of protein complexes [6].

Here we present a discovery of protein complexes in the network of protein interactions. Taken together, protein interactions constitute a graph, where vertices represent proteins and edges - interactions between the proteins. The key idea of our method is that a protein complex should have many interactions between its proteins and hence constitute a highly-connected subgraph in the graph of protein interactions.

We used three different methods to search for highly-connected sets of proteins in the network of protein interactions. The first method identified all complete graphs (cliques) in the network. The second and third methods identified highly-connected subgraphs that are not necessarily cliques. The second method uses uper-paramagnetic Clustering, an algorithm developed by Domany and coworkers to cluster objects in a non-metric space of an arbitrary dimension [7]. As a third method, we developed a novel Monte-Carlo optimization technique to identify a highly-connected subgraphs in an arbitrary network. We estimated statistical significance of discovered complexes by a rigorous statistical test and by a random re-wiring of the graph.

Using MIPS database [8], we constructed the network of protein interactions in S.cerevisiae that included interactions found by large-scale two-hybrid experiment [1,2] and interactions reported in other biological literature. Our analysis predicted more than 50 non-overlapping protein complexes that have from 4 to 24 proteins each.

Strikingly, most of the proteins in these complexes have consistent functional annotation revealing the function of the whole complex. Comparison of predicted complexes with the known ones [8-11] shows very good agreement both in terms coverage and specificity of our predictions. We also predicted several novel complexes and pathways.

Computational prediction of protein complexes can efficiently complement experimental proteomics efforts. Our computational techniques can identify complexes of all sizes, including transient complexes and complexes of low stoichiometry, overcoming limitations of experimental purification of protein complexes [11]. Although our technique relies on experimentally derived interactions, the multi-body nature of discovered complexes makes our algorithm robust to the high rate of false-positives in experimental data. Our results (i) suggest several testable biological hypotheses and (ii) reveal an essential multi-body component in the structure of the protein interaction network.
[1] Ito, T. et al. A comprehensive analysis to explore the yeast protein interactome. Proc. Natl. Acad. Sci. USA 98, 4569-4574 (2001).
[2] Uetz, P. et al. A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature 403, 623-627 (2000).
[3] Albert, R., Jeong, H., and Barabasi, A.L., Error and attack tolerance of complex networks. Nature 406, 378-382 (2000).
[4] Wagner, A. and Fell, D.A., The small world inside large metabolic networks. Proc R Soc Lond B Biol Sci 268, 1803-1810 (2001).
[5] Maslov, S., and Sneppen, K., Specificity and stability in topology of protein networks. Science 296, 910-913 (2002).
[6] Abbott, A., The society of proteins. Nature 417, 894-896 (2002).
[7] Blatt, M., Wiseman, S., Domany, E., Superparamagnetic clustering of data. Physical Review Letters 76, 3251-3254 (1996).
[8] Mewes, H.W., et. al., MIPS: a database for genomes and protein sequences. Nucleic Acids Res, 30(1), 31-34 (2002).
[9] Bader, G.D., et. al., BIND-The Biomolecular Interaction Network Database. Nucleic Acid Res., 29(1), 242-245 (2001).
[10] Ho, Y. et al. Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature 415, 180-183
[11] Gavin, A.-C. et al. Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature 415, 141-147 (2002).