MPI-INF Logo
Campus Event Calendar

Event Entry

New for: D2, D3

What and Who

PhD Application Talk: Ranged maximal pairs queries using suffix trees and relational model

Ahmed El-Roby
PhD Application Talk
AG 1, AG 2, AG 3, AG 4, AG 5, SWS, RG1, MMCI  
MPI Audience
English

Date, Time and Location

Monday, 25 July 2011
09:00
120 Minutes
E1 4
024
Saarbrücken

Abstract

Finding repetitive structures in string sequences is an important problem in various fields. In biological applications, it is clear why it might be of interest to find repetitive structures in genomic sequences for instance. Also in other fields, a problem in time series field can be transformed into a string mining problem. We particularly are interested in finding maximal pairs and maximal repeats in large strings. A maximal repeat is the substring that occurs more than once in input string where it cannot be extended to left or right in both occurrences and still be similar. The starting position of both occurrences is called a maximal pair. In this paper we are interested in finding all maximal pairs within a specific range of the input string. To adopt the ranged queries, we make modifications to original algorithm to find maximal pairs using suffix trees. We introduce multiple variants of the suffix tree (ranged suffix tree, bloom suffix tree, filter suffix tree) to facilitate finding all pairs in a query range. We also introduce a new model (relational model) that solves the same problem using relational algebra and database management systems features. The new relational model significantly outperforms original algorithm and proposed modifications. It also is carefully designed to exploit parallel environments. The relational model can solve a problem that takes one processor almost 12 hours to finish in less than one minute using 4096 processors on a Blue Gene/P architecture supercomputer.

Contact

IMPRS-CS
-1803
--email hidden
passcode not visible
logged in users only

Tags, Category, Keywords and additional notes

Please note: The talks will take place in random order!

Heike Przybyl, 07/21/2011 12:29
Heike Przybyl, 07/21/2011 12:10 -- Created document.