Campus Event Calendar

Event Entry

What and Who

Smoothed Analysis of k-Means Clustering

Bodo Manthey
Max-Planck-Institut für Informatik - D1
AG1 Mittagsseminar (own work)
AG 1, AG 3, AG 5, SWS, AG 2, AG 4, RG1, MMCI  
AG Audience
English

Date, Time and Location

Tuesday, 2 December 2008
14:00
45 Minutes
E1 4
024
Saarbrücken

Abstract

The k-means method is one of the most popular algorithms for clustering
high-dimensional data into a given number k of classes. However, its good
performance in practice stands in stark contrast to its theoretical
performance: in the worst case, its running time can be exponential. The goal
of smoothed analysis is to explain the good practical performance of
algorithms with poor worst-case performance.
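
For concreteness, here is a minimal Python sketch of the k-means method
(Lloyd's algorithm); the random initialization and the stopping test are
illustrative assumptions, not part of the analysis.

    import numpy as np

    def kmeans(points, k, rng=None):
        # The k-means method alternates two steps until the clustering
        # stabilizes. The uniform random initialization below is an
        # illustrative assumption.
        rng = np.random.default_rng() if rng is None else rng
        centers = points[rng.choice(len(points), size=k, replace=False)]
        while True:
            # Assignment step: each point joins its nearest center
            # under squared Euclidean distance.
            dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            labels = dists.argmin(axis=1)
            # Update step: each center moves to the mean of its cluster
            # (empty clusters keep their old center in this sketch).
            new_centers = np.array([points[labels == j].mean(axis=0)
                                    if np.any(labels == j) else centers[j]
                                    for j in range(k)])
            # Terminate once no center moves; the running time in question
            # is the number of iterations of this loop.
            if np.allclose(new_centers, centers):
                return labels, centers
            centers = new_centers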

We present a smoothed analysis of the k-means method: an adversary specifies a
point set, and then each point is perturbed by Gaussian noise with standard
deviation \sigma. We prove two bounds on the smoothed running time of the
k-means algorithm. Our first bound is polynomial in n^{\sqrt{k}} and 1/\sigma,
where n is the number of points. Our second bound is k^{kd} poly(n, 1/\sigma),
where d is the dimension of the instance; this yields a polynomial bound if
both k and d are small compared to n.
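
As a sketch of this input model (all names are illustrative): the smoothed
running time is the expected number of k-means iterations on the perturbed
instance, maximized over the adversary's choice of points.

    import numpy as np

    def smoothed_instance(adversarial_points, sigma, rng=None):
        # Smoothed-analysis model: the adversary fixes the point set, then
        # every coordinate receives independent N(0, sigma^2) noise.
        rng = np.random.default_rng() if rng is None else rng
        return adversarial_points + rng.normal(scale=sigma,
                                               size=adversarial_points.shape)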

Finally, we generalize the analysis of k-means to a large class of other
distance measures. The exponential lower bound for squared Euclidean distances
(Arthur, Vassilvitskii 2006) can be transferred to most Bregman divergences,
and the smoothed analysis can be adapted to (almost) arbitrary Bregman
divergences. Examples of Bregman divergences are the Kullback-Leibler
divergence and Mahalanobis distances; the latter include squared Euclidean
distances.
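
A short sketch of these two examples, using the general form
D_\phi(x, y) = \phi(x) - \phi(y) - <\nabla\phi(y), x - y> of a Bregman
divergence (function names are illustrative):

    import numpy as np

    def mahalanobis(x, y, A):
        # Bregman divergence of \phi(z) = z^T A z for positive definite A;
        # A = identity matrix gives the squared Euclidean distance.
        diff = x - y
        return diff @ A @ diff

    def kullback_leibler(x, y):
        # Bregman divergence of the negative entropy \phi(z) = sum_i z_i log z_i,
        # restricted to probability vectors (x and y each sum to 1).
        return float(np.sum(x * np.log(x / y)))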

Based on joint work with Heiko Röglin.

Contact

Khaled Elbassioni
