Proceedings Article, Paper @InProceedings Beitrag in Tagungsband, Workshop

 Show entries of: this year (2019) | last year (2018) | two years ago (2017) | Notes URL
 Action: login to update Options: Goto entry point

 Author, Editor
 Author(s): Chang, Kevin dblp
 Editor(s):
 BibTeX cite key*: Chang2007

 Title, Booktitle
 Title*: Multiple pass streaming algorithms for learning mixtures of distributions in $R^d$ Booktitle*: Algorithmic Learning Theory, 18th International Conference, Proceedings

 Event, URLs
 URL of the conference: URL for downloading the paper: Event Address*: Sendai, Japan Language: English Event Date* (no longer used): Organization: Event Start Date: 1 October 2007 Event End Date: 4 October 2007

 Publisher
 Name*: Springer URL: http://www.springer.com Address*: Berlin Type: Extended Abstract

 Vol, No, Year, pp.
 Series: Lecture Notes in Artifical intelligence
 Volume: 4754 Number: Month: Pages: 16 Year*: 2007 VG Wort Pages: ISBN/ISSN: Sequence Number: DOI:

 Note: This is an extended abstract. Full version has been submitted to a journal. (LaTeX) Abstract: We present a multiple pass streaming algorithm for learning the density function of a mixture of $k$ uniform distributions over rectangles (cells) in $\reals^d$, for any $d>0$. Our learning model is: samples drawn according to the mixture are placed in {\it arbitrary order} in a data stream that may only be accessed sequentially by an algorithm with a very limited random access memory space. Our algorithm makes $2\ell+1$ passes, for any $\ell>0$, and requires memory at most $\tilde O(\epsilon^{-2/\ell}k^2d^4+(2k)^d)$. This exhibits a strong memory-space tradeoff: a few more passes significantly lowers its memory requirements, thus trading one of the two most important resources in streaming computation for the other. Chang and Kannan \cite{chang06} first considered this problem for $d=1, 2$. Our learning algorithm is especially appropriate for situations where massive data sets of samples are available, but practical computation with such large inputs requires very restricted models of computation. Keywords: streaming algorithms, massive data sets, clustering, statistics Download Access Level: Intranet

 Correlation
 MPG Unit: Max-Planck-Institut für Informatik MPG Subunit: Algorithms and Complexity Group MPG Subsubunit: Advanced Models of Computation Audience: experts only Appearance: MPII WWW Server, MPII FTP Server, MPG publications list, university publications list, working group publication list, Fachbeirat, VG Wort

BibTeX Entry:

@INPROCEEDINGS{Chang2007,
AUTHOR = {Chang, Kevin},
TITLE = {Multiple pass streaming algorithms for learning mixtures of distributions in $R^d$},
BOOKTITLE = {Algorithmic Learning Theory, 18th International Conference, Proceedings},
PUBLISHER = {Springer},
YEAR = {2007},
TYPE = {Extended Abstract},
VOLUME = {4754},
PAGES = {16},
SERIES = {Lecture Notes in Artifical intelligence},