Campus Event Calendar: Tomasz Kociumaka (02/07/2024 in E1 5/002)

Campus Event Calendar

Campus Event Calendar:
- All Upcoming:
  - only for D1
  - only for D2
  - only for INET
  - only for D4
  - only for D5
  - only for D6
  - only for RG1
  - Mailing Lists
  - by Speaker
  - by Type
  - by Category
  - by Title
  - Calendar
  - RSS Feed
- History of Events:

Event Entry

What and Who

New Tools for Text Indexing and Beyond: Substring Complexity and String Synchronizing Sets

Tomasz Kociumaka

Max-Planck-Institut für Informatik - D1

Joint Lecture Series

AG 1, AG 2, AG 3, INET, AG 4, AG 5, D6, SWS, RG1, MMCI

Public Audience

English

Note: We use this to send email in the morning.

Date, Time and Location

Wednesday, 7 February 2024

12:15

60 Minutes

E1 5

002

Saarbrücken

Abstract

In recent decades, the volume of sequential data processed in genomics and other disciplines has exploded. This trend escalates the challenge of designing text indexes – concise data structures that allow answering various queries efficiently. For example, pattern-matching queries ask if the text (a long string modeling the dataset) contains an occurrence of a given pattern (a short string provided at query time). The suffix array is a classic index for pattern matching and many other queries, but modern applications demand space-efficient alternatives like the FM index. Still, even these data structures struggle with massive highly repetitive datasets, such as human genome collections. This scenario requires indexes whose size is comparable to the best text compression methods.

The talk will introduce two tools originating from my text indexing research. Substring complexity is a well-behaved measure of text repetitiveness that aids in comparing the performance of compression algorithms. String synchronizing sets enable consistent selection of small subsets of text substrings. Recently, these tools became the core of the asymptotically smallest compressed text index with suffix-array functionality. Furthermore, they find applications beyond text indexing: in combinatorics on words, quantum string algorithms, and more.

Contact

Jennifer Müller

+49 681 9325 2900

--email hidden

Virtual Meeting Details

System used:

Zoom

Meeting URL:

https://zoom.us/j/99715655535

Meeting ID:

997 1565 5535

Passcode:

passcode not visible

Code Visible for:

logged in users only

Jennifer Müller, 10/04/2023 10:44 -- Created document.

Imprint / Impressum | Data Protection / Datenschutzhinweis