ECCB 2002 Poster sorted by: Author | Number

Next | Previous poster (in order of the view you have selected)

Title: Protein Structure Prediction In The Face-Centered-Cubic Lattice
P179
Will, Sebastian; Backofen, Rolf

will@inf.uni-jena.de
Chair of Bioinformatics, Institute of Computer Science, Friedrich-Schiller Universität Jena

We present the first algorithm for exact protein structure prediction in the FCC-HP-model. FCC-HP is a simplified protein model on the face-centered-cubic lattice (FCC) that models the main force of protein folding, namely the hydrophobic force (as Dill's HP-model, Dill. Biochemistry 1985).

It is possible to employ structure prediction in the FCC-HP-model for structure prediction of real proteins in hierarchical approaches. Equally important, the model is interesting for theoretical investigation of the protein folding process. For both application areas, using the FCC instead of the cubic lattice is a significant advance, since FCC is much more accurate in modeling real proteins (Park and Levitt. JMB 1995) and lacks the parity problem of the cubic lattice (Agarwala et.al. JCB 1997).

The prediction problem for the FCC-HP-model is best viewed as a combinatorial optimization problem. Given a chain (i.e. a sequence) of H- and P-monomers one aims to find structures that maximize the number of contacts between H-monomers. Here, a structure is defined as a self-avoiding walk on the lattice and a contact is formed by every
pair of monomers in unit distance.

We base our approach on the construction of hydrophobic cores. Optimal hydrophobic cores are maximally compact sets of points in the FCC. Note that maximal compactness is equivalent to maximally many contacts. For every optimal structure of an HP-sequence, all the H-monomers lie within a maximally compact core for this sequence and all P-monomers lie outside of this core.

The whole algorithm divides into three steps, which are described separately elsewhere. The first two steps are dedicated to the construction of hydrophobic cores and independent of concrete sequences, where the last step actually predicts structures for given sequences. Hence, the first two steps are precomputation steps, which
are performed only once per core size.

Step 1. Compute upper bounds on the number of contacts in certain classes of cores. In each class, all cores are compatible to common parameters. The parameters, called layer parameter sequences, restrict the distribution of the monomers to the layers of the lattice. The computation is done by a dynamic programming algorithm (Backofen and Will. CPM 2001).

Step 2. Compute all cores of a given size by a constraint-based symmetry excluding search (SES, Backofen and Will. CP 1999), which is strongly constrained by the layer parameter sequences computed in the first step. (Will. PSB 2002)

Step 3. Thread an HP-sequence to given cores in order to get optimal FCC-HP-structures for the sequence. Again, this is done by constraint-based symmetry-excluding search. Here, we have to find a self-avoiding walk of the sequence object to the constraints introduced by the hydrophobic core. (Backofen and Will. CP 2001)

The presented algorithm is the first non-trivial exact one for FCC-HP. We tested the algorithm successfully up to sequence length of 180, which is far beyond the capabilities of previous approaches (i.e., even heuristic ones, like simulated annealing or genetic algorithms. Note that naive full enumeration is impracticable for chain lengths greater than about 15.) This shows, it is now possible to exactly predict optimal FCC-HP structures of reasonable sizes compared to real proteins. As expressed by the term exactly, we actually prove the global optimality of the predicted structures in contrast to heuristic methods. Nevertheless, the algorithm predicts structures, based on precomputed cores, usually rather fast ranging from seconds to minutes.
[1] R. Agarwala, S. Batzoglou, V. Dancik, S. E. Decatur, S., Hannenhalli, M. Farach, S. Muthukrishnan, and S. Skiena. Local rules for protein folding on a triangular lattice and generalized hydrophobicity in the HP model. Journal of Computational Biology, 4(3):275 96, 1997.
[2] R. Backofen and S. Will. Excluding symmetries in constraint-based search. Journal of Constraints. In press.
[3] R. Backofen and S. Will. Fast, constraint-based threading of HP-sequences to hydrophobic cores. In Proceedings of 7th International Conference on Principle and Practice of Constraint Programming (CP 2001), LNCS, Berlin, 2001.
[4] R. Backofen and S. Will. Optimally compact finite sphere packings hydrophobic cores in the FCC. In Proceedings of the 12th Annual Symposium on Combinatorial Pattern Matching (CPM2001), LNCS, Berlin, 2001.
[5] K. A. Dill. Theory of the folding and stability of globular proteins. Biochemistry 24(6):1501-1509, 1985.
[6] B. H. Park and M. Levitt. The complexity and accuracy of discrete state models of protein structure. Journal of Molecular Biology, 249(2):493 507, 1995.
[7] S. Will. Constraint-based hydrophobic core construction for protein structure prediction in the face-centered-cubic lattice. In PacificSymposium on Biocomputing (PSB 2002), 2002.