Clustering algorithms generate a partitioning of the data into groups, or clusters, such that data objects assigned to a common cluster are as similar as possible and data objects assigned to different clusters differ as much as possible. By performing a cluster analysis, the user can ideally gain an overview of the major characteristics of a data set without any previous knowledge. In practice, however, performing a cluster analysis is often not easy, since most clustering algorithms require numerous input parameters. Without background knowledge on the data, it is often difficult to find a suitable parameterization, and parameters frequently need to be adjusted in a time-consuming trial-and-error procedure with no guarantee that a useful parameterization will be found. Outliers and noise points in real-world data further complicate the search for a suitable parameterization.
In this talk, I will discuss some novel approaches that are important milestones on the way to parameter-free clustering. The basic idea of these techniques is to relate clustering to data compression: a good clustering summarizes the major characteristics of the data and thus allows the data to be compressed effectively. Based on this principle, also known as the Minimum Description Length (MDL) principle, the algorithm RIC (Robust Information-theoretic Clustering) introduces a quality criterion for clustering in order to improve an arbitrary initial clustering, for example an imperfect clustering obtained with an inappropriate parameterization. In addition, RIC provides effective and efficient algorithms for identifying noise points and outliers. The algorithm OCI (Outlier-robust Clustering using Independent Components) is a standalone algorithm for parameter-free clustering. OCI relies on a very general cluster notion supported by the Exponential Power Distribution and Independent Component Analysis, and provides effective clustering of non-Gaussian data.
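The compression-based quality criterion can be illustrated with a minimal sketch. The code below scores a clustering by a two-part MDL cost: the bits needed to encode the model parameters of each cluster plus the bits needed to encode the data points under a per-cluster Gaussian. This is a hypothetical simplification for one-dimensional data; RIC's actual model selection and coding scheme are considerably richer.

```python
import math

def description_length(clusters):
    """Two-part MDL cost of a 1-D clustering (toy sketch):
    model cost per cluster plus data cost, i.e. the negative
    log-likelihood of each point under its cluster's Gaussian."""
    total = 0.0
    n = sum(len(c) for c in clusters)
    for pts in clusters:
        m = len(pts)
        mean = sum(pts) / m
        var = sum((x - mean) ** 2 for x in pts) / m or 1e-9  # guard degenerate clusters
        # data cost: -log2 of the Gaussian density at each point
        for x in pts:
            p = math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)
            total += -math.log2(max(p, 1e-300))
        # model cost: two parameters (mean, variance), BIC-style log2(n) bits
        total += math.log2(n)
    return total

# A clustering that matches the data's true structure compresses better,
# i.e. yields a smaller description length:
good = [[1.0, 1.1, 0.9], [10.0, 10.2, 9.8]]
bad = [[1.0, 1.1, 10.0], [0.9, 10.2, 9.8]]
print(description_length(good) < description_length(bad))
```

In this spirit, a criterion of this form can compare alternative clusterings of the same data without user-supplied parameters: the clustering with the smaller description length is preferred.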
A brief survey of my further research areas, including semi-supervised and supervised learning, concludes this talk.