ECCB 2002 Poster

Title: Trying to reconstruct the history of gene families
P99
Marangoni, Roberto; Pisanti, Nadia; Ferragina, Paolo; Frangioni, Antonio; Luccio, Fabrizio

marangon@di.unipi.it
Dept. of Informatics, University of Pisa, Corso Italia 40, 56125 Pisa, Italy

The genes of a genome can be clustered into families. Genes belonging to the same family often show a high degree of sequence homology and similar, but not identical, biological functions: they are called paralogs. It is generally accepted that the number of paralogs in a family increases during evolution through the duplication and subsequent modification of pre-existing genes. Several studies have tried to infer the history of these duplication events and to identify the ancestor and its derivatives, often by applying common algorithms for phylogenetic trees.
We present a method, called PaTre (Paralogy Trees), that tries to reconstruct the history of a family under the hypothesis that all of its genes are the result of several iterated duplication processes. Its output is a directed tree in which each node represents a gene of the family and each directed arc represents the relationship between template ("matrix") and copy in one duplication event.
The method takes into account some biological observations concerning duplication events, in particular:
- newer genes are often shorter than older genes;
- insertions after duplications are very rare events.
The method takes as input the sequences of all the genes belonging to a family. For each pair of genes, it computes an asymmetric "distance" (the Transformation Distance, which reflects the biological observations stated above) that evaluates the cost of assuming one sequence to be the template (the "matrix") and the other to be the copy in a hypothetical duplication event. The output of this first step is a directed graph with a weighted arc in each direction between every pair of genes. From this graph, PaTre extracts the LSA (Lightest Spanning Arborescence), which is the desired directed paralogy tree.
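The two steps above can be illustrated with a minimal sketch. It is not the PaTre implementation: the function directed_cost is a hypothetical stand-in for the Transformation Distance (a weighted edit cost in which insertions are assumed to cost three times as much as deletions or substitutions), the arborescence is found with the standard Chu-Liu/Edmonds procedure, and the root is chosen by simply trying every gene. All function names, weights and sequences are assumptions made for illustration.

```python
from itertools import permutations


def directed_cost(template, copy, sub=1, dele=1, ins=3):
    """Toy asymmetric cost of turning `template` into `copy` (assumed weights)."""
    prev = [j * ins for j in range(len(copy) + 1)]
    for i, a in enumerate(template, 1):
        cur = [i * dele]
        for j, b in enumerate(copy, 1):
            cur.append(min(prev[j] + dele,                 # drop a template symbol
                           cur[j - 1] + ins,               # insert a symbol (penalised)
                           prev[j - 1] + (a != b) * sub))  # match or substitute
        prev = cur
    return prev[-1]


def build_arcs(family):
    """Complete directed graph: one weighted arc per ordered pair of genes."""
    return {(u, v): directed_cost(family[u], family[v])
            for u, v in permutations(family, 2)}


def min_arborescence(nodes, arcs, root):
    """Chu-Liu/Edmonds; returns {child: parent}, or None if none exists."""
    # 1. Cheapest incoming arc for every non-root node.
    best = {}
    for (u, v), w in arcs.items():
        if v != root and (v not in best or w < arcs[(best[v], v)]):
            best[v] = u
    if any(v not in best for v in nodes if v != root):
        return None
    # 2. Look for a cycle among the chosen arcs.
    cycle = None
    for start in best:
        path, v = [], start
        while v in best and v not in path:
            path.append(v)
            v = best[v]
        if v in path:
            cycle = path[path.index(v):]
            break
    if cycle is None:
        return best
    # 3. Contract the cycle into a single supernode and re-weight entering arcs.
    cyc, c = set(cycle), ("cycle", tuple(cycle))
    new_arcs, origin = {}, {}
    for (u, v), w in arcs.items():
        if u in cyc and v in cyc:
            continue
        if v in cyc:
            key, nw = (u, c), w - arcs[(best[v], v)]
        elif u in cyc:
            key, nw = (c, v), w
        else:
            key, nw = (u, v), w
        if key not in new_arcs or nw < new_arcs[key]:
            new_arcs[key], origin[key] = nw, (u, v)
    sub_tree = min_arborescence([n for n in nodes if n not in cyc] + [c],
                                new_arcs, root)
    if sub_tree is None:
        return None
    # 4. Expand: keep the cycle arcs, then let the entering arc break the cycle.
    parent = {v: best[v] for v in cycle}
    for child, par in sub_tree.items():
        u, v = origin[(par, child)]
        parent[v] = u
    return parent


def paralogy_tree(family):
    """Try every gene as ancestor; return (root, {child: parent}, total cost)."""
    arcs = build_arcs(family)
    result = None
    for root in family:
        parent = min_arborescence(list(family), arcs, root)
        total = sum(arcs[(p, ch)] for ch, p in parent.items())
        if result is None or total < result[2]:
            result = (root, parent, total)
    return result
```

For the three made-up sequences A = "ACGTACGTAC", B = "ACGTACGT" and C = "ACGTCGT", calling paralogy_tree({"A": ..., "B": ..., "C": ...}) selects A as the root with arcs from A to B and from B to C, because under the assumed weights shortening a sequence is cheaper than lengthening it. Trying every root keeps the sketch short; adding a virtual root connected to all genes would avoid the repeated runs.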
PaTre has been applied to study the history of several gene families in lower and higher organisms. The results are compared with those obtained using standard phylogenetic methods, and the biological reliability of the two approaches is discussed.