ECCB 2002 Poster

Title: A comparative study of Amino Acids binary codes
P41
Fu, Huaiguo; Bouita, Belkhacem; Mephu Nguifo, Engelbert

fu@cril.univ-artois.fr, mephu@cril.univ-artois.fr
CRIL - CNRS FRE2499, Université d'Artois

We study four kinds of binary codes for amino acids (AA). Two codes are based on physico-chemical properties [3, 4]; the other two are generated with artificial intelligence (AI) methods, one from protein structures and alignments [2] and one from the Dayhoff matrix [6]. To assess the overall significance of each binary code, we use a hierarchical clustering method to group the AA into clusters. Each cluster is then examined against biochemical properties to explain the similarity between the AA it contains. To validate this examination, a decision-tree-based machine learning system is used to characterize the AA clusters obtained with each binary code. The experiments show that one of the AI-based codes yields clusters with significant biochemical properties. Consequently, it appears that even if the attributes of binary codes generated with AI methods do not individually correspond to a biochemical property, they can be significant when taken as a whole. Conversely, binary codes based on physico-chemical properties can be insignificant when taken as a whole.
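As an illustration of the clustering step, here is a minimal sketch that groups amino acids by the Hamming distance between their binary codes, using average-linkage agglomerative clustering. The bit vectors in `TOY_CODES`, the choice of Hamming distance, and the average linkage are all assumptions made for this example; the poster does not state which distance or linkage was used, and the toy code is not one of the four codes compared here.

```python
from itertools import combinations

# Hypothetical toy code, NOT one of the four codes studied in the poster.
# Assumed bit order: hydrophobic, aromatic, charged, positively charged.
TOY_CODES = {
    "A": (1, 0, 0, 0),
    "V": (1, 0, 0, 0),
    "L": (1, 0, 0, 0),
    "F": (1, 1, 0, 0),
    "W": (1, 1, 0, 0),
    "D": (0, 0, 1, 0),
    "E": (0, 0, 1, 0),
    "K": (0, 0, 1, 1),
}


def hamming(u, v):
    """Number of bit positions at which two codes differ."""
    return sum(a != b for a, b in zip(u, v))


def average_linkage(c1, c2, codes):
    """Mean Hamming distance over all pairs drawn from two clusters."""
    total = sum(hamming(codes[a], codes[b]) for a in c1 for b in c2)
    return total / (len(c1) * len(c2))


def agglomerate(codes, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[aa] for aa in sorted(codes)]
    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_linkage(clusters[p[0]], clusters[p[1]], codes),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    # Sort members and clusters so the result is easy to compare.
    return sorted(sorted(c) for c in clusters)


if __name__ == "__main__":
    for cluster in agglomerate(TOY_CODES, 2):
        print(cluster)
```

With this toy code, cutting the hierarchy at two clusters separates the hydrophobic residues from the charged ones, which is the kind of biochemical reading of a cluster that the abstract describes.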
This work could make it possible to take the biochemical properties of AA into account when binary codes are used to redescribe protein primary sequences for the protein folding problem [5].
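To make "redescribing a primary sequence" concrete, the sketch below replaces each residue of a sequence with its bit vector and concatenates the results. The function name `encode`, the toy code table, and the flat-tuple output format are assumptions for illustration only; they are not taken from the cited work.

```python
# Hypothetical toy code, NOT one of the four codes studied in the poster.
# Assumed bit order: hydrophobic, aromatic, charged, positively charged.
TOY_CODES = {
    "A": (1, 0, 0, 0),
    "F": (1, 1, 0, 0),
    "D": (0, 0, 1, 0),
    "K": (0, 0, 1, 1),
}


def encode(sequence, codes):
    """Redescribe a primary sequence as one flat tuple of bits."""
    bits = []
    for residue in sequence.upper():
        if residue not in codes:
            # Fail loudly rather than guess a code for an unknown residue.
            raise KeyError(f"no binary code for residue {residue!r}")
        bits.extend(codes[residue])
    return tuple(bits)


if __name__ == "__main__":
    print(encode("AKD", TOY_CODES))
```

A sequence of n residues becomes a vector of n times the code length, so each bit position that carries a biochemical meaning in the code keeps that meaning in the encoded sequence.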
[1] Dayhoff M.O., 1972, Atlas of Protein Sequence and Structure.
[2] De la Maza, 1994, Generate, Test and Explain: Synthesizing Regularity Exposing Attributes in Large Protein Databases. In 27th Hawaii International Conference on System Sciences (HICSS), 123-129.
[3] Sallantin J., Marlière P. & Saurin W., 1984, Point Curie, 141-153.
[4] Taylor W.R., 1986, The classification of amino acid conservation. Journal of Theoretical Biology, 119:205-221.
[5] Landès C. et al., 1996, The Gene 2 of the Sigma Rhabdovirus Genome encodes the P Protein, the Gene 3 encodes a protein related to the reverse transcriptases of retroelements. Virology, 215:123-142.
[6] Gracy J. & Mephu Nguifo E., 1994, Technical Report LIRMM.
[7] Fu H., 2001, Intelligence Artificielle et codage de séquences de protéines. Mémoire de DEA, CRIL, Université de Lille 1.