MPI-INF Logo
Campus Event Calendar

Event Entry

What and Who

A Simple Coreset Construction for Clustering

Christian Sohler
Heinz-Nixdorf-Institut Paderborn
Talk
AG 1, AG 2, AG 3, AG 4, AG 5  
AG Audience
English

Date, Time and Location

Friday, 27 January 2006
13:30
45 Minutes
46.1 - MPII
024
Saarbrücken

Abstract

In this talk we introduce a simple coreset construction for clustering problems in the Euclidean space like k-median

and k-means clustering. In k-median clustering we are given a set of point P in the Euclidean space R^d and the goal
is to find k centers such that the sum of distances over all input points to the nearest center is minimized. A
coreset is a small weighted set of points such that for every set C of k centers, the (weighted) cost for the
coreset points is approximately the same as the cost for P.

Using this construction we develop a polylog-space semi-dynamic (insertion-only) data structure that maintains a
(1+epsilon)-approximation
to the k-median and k-means clustering. Using the same construction we can also obtain a polylog-space dynamic data
structure that maintains a (1+epsilon)-approximation for these problems.

Finally, we show that one can use the coreset construction to obtain a fast implementation of a k-means clustering
algorithm for low dimensional input instances.

Contact

Benjamin Doerr
-104
--email hidden
passcode not visible
logged in users only

Benjamin Doerr, 01/16/2006 16:06
Benjamin Doerr, 12/18/2005 20:18 -- Created document.