Campus Event Calendar

Event Entry

What and Who

Computational Methods for Comparison and Exploration of Event Sequences

Dr. Jefrey Lijffijt
Aalto University, Finland
Talk

Jefrey Lijffijt is a postdoctoral researcher at Aalto University, Finland. He obtained his doctoral degree from the same university in December 2013. His dissertation introduces and reviews methods for the analysis of event sequences, motivated primarily by the analysis of natural language corpora. His thesis received the “Best doctoral dissertation of 2013” award from the Aalto University School of Science. His research focuses mainly on mining interesting and surprising patterns in sequential data, transactional databases, and graphs. More generally, he is interested in pattern mining, text mining, data randomization, and statistical significance testing methods. More info: http://users.ics.aalto.fi/lijffijt/
AG 5, RG1, MMCI  
AG Audience
English

Date, Time and Location

Wednesday, 12 February 2014
14:00
60 Minutes
E1 4
433
Saarbrücken

Abstract

Many types of data, e.g., natural language texts, biological sequences, or sensor data, contain sequential structure. Analyzing this structure is useful for several purposes: discovering recurring patterns, detecting that the data consist of several homogeneous parts, or finding parts that are surprising compared to the rest of the data. The main question addressed in my doctoral dissertation is how to identify local and global patterns in event sequences. In this talk, I will give a brief outline of some computational problems studied in my thesis and review one of them in depth: mining subsequences with surprising event counts, which can be used, for example, to find parts of a text where a word is surprisingly frequent. We introduce a method to find all fixed-length subsequences of a long data sequence in which the count of an event differs significantly from what is expected. The main difficulty is that the considered subsequences overlap and are therefore dependent, which raises the question of how to compute efficiently what is expected. I will briefly present a case study in which the method is applied to the novel “Pride and Prejudice” by Jane Austen.
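
To make the problem setting concrete, the following Python sketch illustrates a naive baseline, not the method presented in the talk. It slides a fixed-length window over a token sequence, counts one event per window, and flags windows whose count is unusually high under a simple binomial model. The function names (window_counts, upper_tail, surprising_windows), the binomial null model, and the threshold alpha are all assumptions made for illustration. The sketch tests each window in isolation, so it ignores exactly the dependence between overlapping windows that the abstract identifies as the main difficulty, and it only looks for surprisingly frequent events rather than any significant deviation.

```python
import math


def window_counts(tokens, event, width):
    """Count occurrences of `event` in every window of `width` tokens."""
    if width <= 0:
        raise ValueError("width must be positive")
    if width > len(tokens):
        return []
    count = sum(1 for t in tokens[:width] if t == event)
    counts = [count]
    # Slide the window one step at a time, updating the count incrementally.
    for end in range(width, len(tokens)):
        count += (tokens[end] == event) - (tokens[end - width] == event)
        counts.append(count)
    return counts


def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed in log space."""
    if k <= 0:
        return 1.0
    if k > n or p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    total = 0.0
    for i in range(k, n + 1):
        log_term = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log1p(-p)
        )
        total += math.exp(log_term)
    return min(total, 1.0)


def surprising_windows(tokens, event, width, alpha):
    """Return (start, count, p_value) for windows with p_value <= alpha.

    Assumption: the null model treats tokens as independent draws with the
    global event frequency. No correction is made for multiple testing or
    for the overlap between neighbouring windows.
    """
    if not tokens:
        return []
    p = tokens.count(event) / len(tokens)
    cache = {}  # p-value depends only on the count, so compute each once
    hits = []
    for start, count in enumerate(window_counts(tokens, event, width)):
        if count not in cache:
            cache[count] = upper_tail(count, width, p)
        if cache[count] <= alpha:
            hits.append((start, count, cache[count]))
    return hits
```

For a text, tokens would be the list of words, event the word of interest, and width the window length in words. Because every window is tested separately, the reported p-values are only a rough guide; accounting for the overlap between windows is the part this sketch leaves out.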

Contact

Petra Schaaf
5000

Petra Schaaf, 01/27/2014 10:56 -- Created document.