Researchers in protein structure have proposed many ways of
classifying the residues of a protein into discrete classes, based on
the structure of the protein around each residue. Each such
classification defines an alphabet of local-structure classes. For
example, there are many slightly different definitions of secondary
structure.
In this talk we will look at a way of quantifying the usefulness of
these alphabets for predicting protein structure from sequence.
We consider four criteria: conservation, predictability, usefulness
in fold-recognition, and usefulness in alignment.
Our initial study uses only backbone-geometry alphabets (we're still
working on solvent-accessibility alphabets). All the alphabets we
examined make similar, but not identical, improvements in
fold-recognition. However, the best alphabets for fold-recognition
are not the same as the best for alignment.