Proceedings Article, Paper @InProceedings Beitrag in Tagungsband, Workshop

 Show entries of: this year (2020) | last year (2019) | two years ago (2018) | Notes URL
 Action: login to update Options: Library locked Goto entry point

 Author, Editor
 Author(s): Jez, Artur dblp
 Editor(s): Kulikov, Alexander Kuznetsov, Sergei Pevzner, Pavel dblp dblp dblp Not MPII Editor(s): Kulikov, Alexander Kuznetsov, Sergei Pevzner, Pavel
 BibTeX cite key*: Jez2014CPM

 Title, Booktitle
 Title*: A really Simple Approximation of Smallest Grammar Booktitle*: 25th Annual Symposium on Combinatorial Pattern Matching (CPM 2014)

 Event, URLs
 URL of the conference: http://cpm2014.hse.ru/ URL for downloading the paper: Event Address*: Moscow, Russia Language: English Event Date* (no longer used): Organization: Event Start Date: 16 June 2014 Event End Date: 18 June 2014

 Publisher
 Name*: Springer URL: www.springer.com Address*: Berlin Type: Extended abstract

 Vol, No, Year, pp.
 Series: Lecture Notes in Computer Science
 Volume: 8486 Number: Month: June Pages: 182-191 Year*: 2014 VG Wort Pages: ISBN/ISSN: Sequence Number: DOI: 10.1007/978-3-319-07566-2_19

 (LaTeX) Abstract: In this paper we present a \emph{really} simple linear-time algorithm constructing a context-free grammar of size $O(g \log (N/g))$ for the input string, where $N$ is the size of the input string and $g$ the size of the optimal grammar generating this string. The algorithm works for arbitrary size alphabets, but the running time is linear assuming that the alphabet $\Sigma$ of the input string can be identified with numbers from $\{1,\ldots , N^c \}$ for some constant $c$. Algorithms with such an approximation guarantee and running time are known, however all of them were non-trivial and their analyses were involved. The here presented algorithm computes the LZ77 factorisation and transforms it in phases to a grammar. In each phase it maintains an LZ77-like factorisation of the word with at most $\ell$ factors as well as additional $O(\ell)$ letters, where $\ell$ was the size of the original LZ77 factorisation. In one phase in a greedy way (by a left-to-right sweep and a help of the factorisation) we choose a set of pairs of consecutive letters to be replaced with new symbols, i.e. nonterminals of the constructed grammar. We choose at least 2/3 of the letters in the word and there are $O(\ell)$ many different pairs among them. Hence there are $O(\log N)$ phases, each of them introduces $O(\ell)$ nonterminals to a grammar. A more precise analysis yields a bound $O(\ell \log(N/\ell))$. As $\ell \leq g$, this yields the desired bound $O(g \log(N/g))$. Keywords: Grammar-based compression, Construction of the smallest grammar, SLP, compression Download Access Level: Internal

 Correlation
 MPG Unit: Max-Planck-Institut für Informatik MPG Subunit: Algorithms and Complexity Group Audience: experts only Appearance: MPII WWW Server, MPII FTP Server, MPG publications list, university publications list, working group publication list, Fachbeirat, VG Wort

BibTeX Entry:

@INPROCEEDINGS{Jez2014CPM,
AUTHOR = {Jez, Artur},
EDITOR = {Kulikov, Alexander and Kuznetsov, Sergei and Pevzner, Pavel},
TITLE = {A really Simple Approximation of Smallest Grammar},
BOOKTITLE = {25th Annual Symposium on Combinatorial Pattern Matching (CPM 2014)},
PUBLISHER = {Springer},
YEAR = {2014},
TYPE = {Extended abstract},
VOLUME = {8486},
PAGES = {182--191},
SERIES = {Lecture Notes in Computer Science},