ECCB 2002 Poster

Title: Evolutionary analysis of protein domain families
P18
Bru, Catherine; Kahn, Daniel

cbru@toulouse.inra.fr, dkahn@toulouse.inra.fr
INRA-CNRS, Toulouse, France

Protein domains are considered the fundamental units of protein evolution. In the present work we analyse the taxonomic distribution of protein domain families in order to infer their most likely evolutionary scenarios, allowing for domain loss and horizontal transfer. The analysis assumes that the species tree is known and uses the taxonomy available from the NCBI (Wheeler et al., 2000).
We have derived a probabilistic model for the evolution of protein domains using Bayesian trees (Jensen, 1996). Parameters for this model were estimated from a set of 61 protein domain families for which detailed taxonomic analysis was available from the literature. The model is being applied systematically to all protein domain families found in complete genomes, as obtained from the ProDom-CG database (Corpet et al., 2000). The number of possible evolutionary scenarios grows exponentially with the number of nodes in the taxonomy tree. To cope with this complexity, we have developed a dynamic programming algorithm that infers the most probable evolutionary scenario for each domain family. Each scenario includes the identification of a putative origin for the domain family, the inference of horizontal transfer events, and the recognition of involutive domain losses. This will be illustrated on a few specific domain families.
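The abstract does not give the model or the algorithm in detail, so the following is only a minimal sketch of how a dynamic programme can avoid enumerating every scenario. It assumes a deliberately simplified two-state model (domain present or absent at each taxonomy node) in which a single hypothetical gain probability stands in for both origin and horizontal transfer. The parameter values P_GAIN, P_LOSS and P_ROOT_PRESENT, and the function names, are illustrative assumptions and are not taken from the poster. The sketch uses a standard max-product recursion: one bottom-up pass scores each subtree for each state of its root, and one top-down pass reads back the best assignment, so the cost is linear in the number of nodes rather than exponential.

```python
import math

# Hypothetical parameters, chosen only for illustration (not from the poster).
P_GAIN = 0.01          # absent in parent -> present in child (origin or transfer)
P_LOSS = 0.05          # present in parent -> absent in child (domain loss)
P_ROOT_PRESENT = 0.5   # prior probability that the domain is present at the root


def transition(parent_state, child_state):
    """Probability of the child state given the parent state (1 = present)."""
    if parent_state == 1:
        return P_LOSS if child_state == 0 else 1.0 - P_LOSS
    return P_GAIN if child_state == 1 else 1.0 - P_GAIN


def best_scenario(tree, root, observed):
    """Return (log-probability, {node: state}) of the most probable scenario.

    tree maps each internal node to a list of its children;
    observed maps each leaf to 0 (domain absent) or 1 (domain present).
    """
    score = {}   # score[node][s]: best log-prob of the subtree if node is in state s
    choice = {}  # choice[(node, s)]: best state of each child given node is in state s

    def up(node):
        children = tree.get(node, [])
        if not children:
            # A leaf must agree with the observed taxonomic distribution.
            score[node] = [0.0 if observed[node] == s else -math.inf for s in (0, 1)]
            return
        for child in children:
            up(child)
        score[node] = [0.0, 0.0]
        for s in (0, 1):
            picks = {}
            total = 0.0
            for child in children:
                # Children are independent given the parent state, so each
                # child's best state can be chosen separately.
                best, t = max(
                    (math.log(transition(s, t)) + score[child][t], t) for t in (0, 1)
                )
                total += best
                picks[child] = t
            score[node][s] = total
            choice[(node, s)] = picks

    def down(node, s, states):
        states[node] = s
        for child, t in choice.get((node, s), {}).items():
            down(child, t, states)

    up(root)
    prior = (1.0 - P_ROOT_PRESENT, P_ROOT_PRESENT)
    best, s = max((math.log(prior[s]) + score[root][s], s) for s in (0, 1))
    states = {}
    down(root, s, states)
    return best, states


def events(tree, states):
    """List (parent, child, kind) for every gain or loss in a scenario."""
    found = []
    for parent, children in tree.items():
        for child in children:
            if states[parent] != states[child]:
                kind = "gain" if states[child] == 1 else "loss"
                found.append((parent, child, kind))
    return found
```

On a four-leaf toy tree in which the domain is observed in both leaves of one clade and in neither leaf of the other, these illustrative parameters favour presence at the root followed by a single loss on the branch leading to the second clade. Distinguishing an origin from a horizontal transfer, as the poster describes, would need a richer state space than this two-state sketch provides.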
[1] Corpet, F., Servant, F., Gouzy, J. & Kahn, D. (2000). ProDom and ProDom-CG: tools for protein domain analysis and whole genome comparisons. Nucleic Acids Res 28, 267-269.
[2] Jensen, F.V. (1996). An Introduction to Bayesian Networks. UCL Press.
[3] Wheeler, D.L., Chappey, C., Lash, A.E., Leipe, D.D., Madden, T.L., Schuler, G.D., Tatusova, T.A. & Rapp, B.A. (2000). Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 28, 10-14.