Max-Planck-Institut für Informatik

MPI-INF or MPI-SWS or Local Campus Event Calendar

What and Who
Title: Clustering by common friends finds locally significant proteins mediating modules
Speaker: Dr. Bill Andreopoulos
Coming from: Biotechnologisches Zentrum, TU-Dresden, Germany
Event Type: Talk
Visibility: D1, D3, D4, D5, SWS, RG1, MMCI
Level: Public Audience
Date, Time and Location
Date: Wednesday, 16 September 2009
Duration: 45 Minutes
Building: E1 4
A challenge in applying density-based clustering algorithms to categorical datasets is that the 'cube' of attribute values has no ordering defined. We propose the HIERDENC algorithm, which builds a hierarchy representing the underlying cluster structure of a categorical dataset. HIERDENC keeps the number of user-specified input parameters to a minimum, is insensitive to the order in which objects are input, and can handle outliers. We also propose an indexing scheme for HIERDENC that enables scalable clustering of large datasets.
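
To illustrate what "density" can mean when attribute values have no ordering, the sketch below counts the objects that lie within a given Hamming distance of a chosen object. This is only an assumption-laden illustration and not the HIERDENC algorithm presented in the talk: the choice of Hamming distance, the function names `hamming` and `radius_density`, and the toy data are all hypothetical.

```python
from typing import Sequence


def hamming(a: Sequence[str], b: Sequence[str]) -> int:
    """Number of attributes on which two categorical objects differ."""
    if len(a) != len(b):
        raise ValueError("objects must have the same number of attributes")
    return sum(x != y for x, y in zip(a, b))


def radius_density(center: Sequence[str],
                   objects: Sequence[Sequence[str]],
                   radius: int) -> int:
    """Count the objects within the given Hamming radius of `center`.

    Assumption: density is taken to be a plain count, which needs no
    ordering of the attribute values, only equality tests.
    """
    return sum(1 for obj in objects if hamming(center, obj) <= radius)


# Toy categorical dataset (hypothetical): colour, size, shape.
objects = [
    ("red", "small", "round"),
    ("red", "small", "square"),
    ("red", "large", "round"),
    ("blue", "large", "square"),
]

# Density around the first object as the radius grows from 0 to 3.
counts = [radius_density(objects[0], objects, r) for r in range(4)]
print(counts)
```

Growing the radius step by step and watching how the count changes is one simple way to see how nested, hierarchy-like regions could emerge around a dense seed object.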

We present a simplification of HIERDENC, called the MULIC algorithm for multi-layered clustering. We apply this layered clustering algorithm to protein interaction networks, grouping proteins by the similarity of their direct neighborhoods. We identify locally significant proteins, called mediators, which link different clusters. Clusters and mediators are organized in hierarchies, in which clusters are mediated by, and act as mediators for, other clusters. We compare the clusters and mediators to known yeast complexes and find agreement with a precision of 71% and a recall of 61%.
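
The idea of grouping proteins by the similarity of their direct neighborhoods ("common friends") can be sketched as follows. This is a minimal illustration and not the MULIC algorithm itself: the Jaccard index as the similarity measure, the rule that a mediator is a protein whose neighbors fall into at least two clusters, and all names and data below are assumptions made for the example.

```python
from typing import Dict, Iterable, List, Set, Tuple


def neighborhoods(edges: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build the direct neighborhood of every protein from an edge list."""
    nbrs: Dict[str, Set[str]] = {}
    for u, v in edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    return nbrs


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of common neighbors among all neighbors of two proteins."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_mediators(nbrs: Dict[str, Set[str]],
                   cluster_of: Dict[str, int]) -> List[str]:
    """Proteins whose neighbors belong to at least two different clusters.

    Assumption: this is a simplified stand-in for the notion of a
    mediator; proteins without a cluster label are ignored when counting.
    """
    result = []
    for protein, neighbors in nbrs.items():
        touched = {cluster_of[n] for n in neighbors if n in cluster_of}
        if len(touched) >= 2:
            result.append(protein)
    return sorted(result)


# Hypothetical toy network: two triangles linked through protein "M".
edges = [
    ("A", "B"), ("A", "C"), ("B", "C"),
    ("D", "E"), ("D", "F"), ("E", "F"),
    ("M", "A"), ("M", "B"), ("M", "D"), ("M", "E"),
]
nbrs = neighborhoods(edges)
cluster_of = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1, "F": 1}

same_cluster = jaccard(nbrs["A"], nbrs["B"])   # share C and M
cross_cluster = jaccard(nbrs["A"], nbrs["D"])  # share only M
mediators = find_mediators(nbrs, cluster_of)
print(same_cluster, cross_cluster, mediators)
```

In this toy network, proteins in the same triangle share more neighbors than proteins in different triangles, and the only protein touching both groups is reported as the mediator.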

In other applications, we use HIERDENC to form triangles by complementing protein interaction networks with structural information. Triangles have ten-fold higher overlap with known yeast complexes than bicliques do. Moreover, we visualize protein interaction networks; we measure success by the percentage of edges collapsed into bicliques, which is as high as 90%. We have also developed a large search database of biomedical images and text captions.

As ongoing work, we are applying HIERDENC to cluster large datasets of Force-Distance curves representing protein unfolding pathways.
Name(s): Conny Liegl
EMail: not disclosed on the web
Video Broadcast
Video Broadcast: No
Tags, Category, Keywords and additional notes
Attachments, File(s):
  • Conny Liegl, 09/03/2009 11:49 AM -- Created document.