MPI-INF Logo
Campus Event Calendar

Event Entry

New for: D3, D4

What and Who

Verbumculus and the Detection of Unusual Substrings

Alberto Apostolico
Padova/Purdue University
AG1 Mittagsseminar
AG 1, AG 2, AG 3, AG 4  
AG Audience
English

Date, Time and Location

Friday, 28 May 99
13:30
-- Not specified --
MPI
024
Saarbrücken

Abstract

Verbumculus and the Detection of Unusual Substrings

Alberto Apostolico
Purdue University and Universita' di Padova

The identification of strings that are, by some measure, redundant or rare
in the context of larger sequences is variously pursued in order to
compress data, unveil structure, infer minimal or compact descriptions,
extract and classify features, etc. In the emerging field of bio-sequence
analysis, unusually frequent or rare strings have been variously implicated
in biological functions and mechanisms. In these domains,
tables for storing the number of occurrences in a string of substrings
of (or up to) a given length need to be computed, stored, and tested against
expectations. This talk concentrates on the computational cost involved
in these tasks within some basic probabilistic models.
We discuss algorithms and a software tool called Verbumculus for detecting
and displaying all substrings of a given sequence
of $n$ symbols that deviate from expected frequency. The main theoretical
result is that under several accepted probabilistic models
the search for such over- or underrepresented words may be restricted
to $O(n)$ candidates.

Joint work with M. Bock, S. Lonardi, X.Xu

Contact

Roberto Solis-Oba
9325 116
--email hidden
passcode not visible
logged in users only