Max-Planck-Institut für Informatik
max planck institut
mpii logo Minerva of the Max Planck Society

MPI-INF or MPI-SWS or Local Campus Event Calendar

<< Previous Entry Next Entry >> New Event Entry Edit this Entry Login to DB (to update, delete)
What and Who
Title:A really simple approximation of a smallest grammar
Speaker:Artur Jeż
coming from:Max-Planck-Institut für Informatik - D1
Speakers Bio:
Event Type:AG1 Mittagsseminar (own work)
Visibility:D1, D2, D3, D4, D5, RG1, SWS, MMCI
We use this to send out email in the morning.
Level:AG Audience
Date, Time and Location
Date:Thursday, 10 April 2014
Duration:30 Minutes
Building:E1 4
In this talk I will present a really simple linear-time algorithm constructing a context-free grammar of size O(g log (N/g)) for the input string, where N is the size of the input string and g the size of the optimal grammar generating this string. The algorithm works for arbitrary size alphabets, but the running time is linear assuming that the alphabet \Sigma of the input string can be identified with numbers from {1,... , N^c } for some constant c. Algorithms with such an approximation guarantee and running time are known, however all of them were non-trivial and their analyses were involved. The here presented algorithm computes the LZ77 factorisation and transforms it in phases to a grammar. In each phase it maintains an LZ77-like factorisation of the word with at most l factors as well as additional O(l) letters, where l was the size of the original LZ77 factorisation. In one phase in a greedy way (by a left-to-right sweep and a help of the factorisation) we choose a set of pairs of consecutive letters to be replaced with new symbols, i.e. nonterminals of the constructed grammar. We choose at least 2/3 of the letters in the word and there are O(l) many different pairs among them. Hence there are O(log N) phases, each of them introduces O(l) nonterminals to a grammar. A more precise analysis yields a bound O(l log(N/l)). As l \leq g, this yields the desired bound O(g log(N/g)).
Name(s):Artur Jez
Video Broadcast
Video Broadcast:NoTo Location:
Tags, Category, Keywords and additional notes
Keywords:Grammar-based compression; Construction of the smallest grammar; SLP; compression
Attachments, File(s):
  • Artur Jez, 03/27/2014 01:12 PM -- Created document.