Proceedings Article, Paper
@InProceedings
Beitrag in Tagungsband, Workshop


Show entries of:

this year (2019) | last year (2018) | two years ago (2017) | Notes URL

Action:

login to update

Options:








Author, Editor

Author(s):

Chang, Kevin L.
Kannan, Ravi

dblp
dblp

Not MPG Author(s):

Kannan, Ravi

Editor(s):





BibTeX cite key*:

Chang2006

Title, Booktitle

Title*:

The Space Complexity of Pass-Efficient Algorithms for Clustering


census-soda.ps (423.74 KB)

Booktitle*:

Proceedings of the Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA'06

Event, URLs

URL of the conference:

http://www.siam.org/meetings/da06/

URL for downloading the paper:


Event Address*:

Miami, USA

Language:

English

Event Date*
(no longer used):


Organization:


Event Start Date:

21 February 2007

Event End Date:

21 February 2007

Publisher

Name*:

ACM / Siam

URL:


Address*:

New York, USA

Type:


Vol, No, Year, pp.

Series:


Volume:


Number:


Month:

January

Pages:

1157-1166

Year*:

2006

VG Wort Pages:


ISBN/ISSN:

0-89871-605-5

Sequence Number:


DOI:




Note, Abstract, ©

Note:

This is the conference version. Full version under submission to journal.

(LaTeX) Abstract:

We present multiple pass streaming algorithms for a basic clustering problem for massive data sets. If our algorithm is allotted 2l passes, it will produce an approximation with error at most ε using Õ(k3/ε2/l) bits of memory, the most critical resource for streaming computation. We demonstrate that this tradeoff between passes and memory allotted is intrinsic to the problem and model of computation by proving lower bounds on the memory requirements of any l pass randomized algorithm that are nearly matched by our upper bounds. To the best of our knowledge, this is the first time nearly matching bounds have been proved for such an exponential tradeoff for randomized computation.In this problem, we are given a set of n points drawn randomly according to a mixture of k uniform distributions and wish to approximate the density function of the mixture. The points are placed in a datastream (possibly in adversarial order), which may only be read sequentially by the algorithm. We argue that this models, among others, the datastream produced by a national census of the incomes of all citizens.

Keywords:

algorithms, streaming, massive data set, clustering, statistics

Copyright Message:

Copyright 2006 ACM. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.


Download
Access Level:

Public

Correlation

MPG Unit:

Max-Planck-Institut für Informatik



MPG Subunit:

Algorithms and Complexity Group

Appearance:

MPII WWW Server, MPII FTP Server, MPG publications list, university publications list, working group publication list, Fachbeirat, VG Wort



BibTeX Entry:

@INPROCEEDINGS{Chang2006,
AUTHOR = {Chang, Kevin L. and Kannan, Ravi},
TITLE = {The Space Complexity of Pass-Efficient Algorithms for Clustering},
BOOKTITLE = {Proceedings of the Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA'06},
PUBLISHER = {ACM / Siam},
YEAR = {2006},
PAGES = {1157--1166},
ADDRESS = {Miami, USA},
MONTH = {January},
ISBN = {0-89871-605-5},
NOTE = {This is the conference version. Full version under submission to journal.},
}


Entry last modified by Christine Kiesel, 04/30/2007
Show details for Edit History (please click the blue arrow to see the details)Edit History (please click the blue arrow to see the details)
Hide details for Edit History (please click the blue arrow to see the details)Edit History (please click the blue arrow to see the details)

Editor(s)
Kevin Chang
Created
02/21/2007 12:08:17 PM
Revisions
4.
3.
2.
1.
0.
Editor(s)
Christine Kiesel
Regina Kraemer
Christine Kiesel
Kevin Chang
Kevin Chang
Edit Dates
30.04.2007 18:26:08
04/16/2007 10:03:37 AM
10.03.2007 07:58:18
03/01/2007 05:12:38 PM
02/21/2007 12:08:17 PM
Show details for Attachment SectionAttachment Section
Hide details for Attachment SectionAttachment Section

View attachments here:


File Attachment Icon
census-soda.ps