Max-Planck-Institut für Informatik

MPI-INF or MPI-SWS or Local Campus Event Calendar

What and Who
Title: Matrix Factorization over Dioids and its Applications in Data Mining
Speaker: Sanjar Karaev
Coming from: Max-Planck-Institut für Informatik - D5
Speaker's Bio:
Event Type: Promotionskolloquium
Visibility: D1, D2, D3, INET, D4, D5, SWS, RG1, MMCI
Level: Public Audience
Date, Time and Location
Date: Wednesday, 10 July 2019
Duration: 60 Minutes
Building: E1 5
While classic matrix factorization methods, such as NMF and SVD, are known to be highly effective at finding latent patterns in data, they are limited by the underlying algebraic structure. In particular, it is often difficult to distinguish heavily overlapping patterns because they interfere with each other. To deal with this problem, we study matrix factorization over algebraic structures known as dioids, which are characterized by the idempotency of addition (a + a = a). Idempotency ensures that at every data point only a single pattern contributes, which makes the patterns easier to distinguish. In this thesis, we consider different types of dioids, ranging from continuous (subtropical and tropical algebras) to discrete (Boolean algebra).
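
As a minimal illustration of the idempotency property, the following Python sketch models three dioids as (addition, multiplication) pairs. The operator definitions are assumptions based on common usage (Boolean: OR/AND; subtropical: max/times over the nonnegative reals; tropical: max/plus); the abstract itself does not spell them out. The names `DIOIDS`, `samples`, and `is_idempotent` are hypothetical.

```python
import operator

# Each dioid is modeled as a pair (addition, multiplication).
# These definitions follow common usage and are an assumption here.
DIOIDS = {
    "boolean": (operator.or_, operator.and_),   # values in {False, True}
    "subtropical": (max, operator.mul),         # nonnegative reals
    "tropical": (max, operator.add),            # reals together with -infinity
}

def is_idempotent(add, values):
    """Check that a + a == a holds for every sample value."""
    return all(add(a, a) == a for a in values)

# A few sample values per dioid (illustrative only, not a proof).
samples = {
    "boolean": [False, True],
    "subtropical": [0.0, 0.5, 3.0],
    "tropical": [float("-inf"), -1.0, 2.0],
}

for name, (add, _mul) in DIOIDS.items():
    print(name, is_idempotent(add, samples[name]))

# Ordinary addition fails the check: 3.0 + 3.0 != 3.0.
print("standard", is_idempotent(operator.add, [3.0]))
```

The check only samples a few values; it is meant to make the definition concrete rather than to verify the algebraic axioms.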

The Boolean algebra is perhaps the best-known dioid, and there exist Boolean matrix factorization algorithms that produce low reconstruction errors. In this thesis, however, a different objective function is used: the description length of the data, which enables us to obtain more compact and highly interpretable results.
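
To make the Boolean case concrete, here is a small sketch of a Boolean matrix product together with the conventional reconstruction error (the number of disagreeing entries). It does not implement the description-length objective used in the thesis, whose encoding is not given in this abstract. The toy matrices and the function names `boolean_product` and `reconstruction_error` are hypothetical.

```python
def boolean_product(B, C):
    """Boolean matrix product: entry (i, j) is OR over k of (B[i][k] AND C[k][j])."""
    return [
        [int(any(B[i][k] and C[k][j] for k in range(len(C))))
         for j in range(len(C[0]))]
        for i in range(len(B))
    ]

def reconstruction_error(A, R):
    """Count the entries where the data A and the reconstruction R disagree."""
    return sum(a != r for row_a, row_r in zip(A, R) for a, r in zip(row_a, row_r))

# Toy data: two rectangular patterns that overlap in the center entry.
A = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
]
B = [[1, 0], [1, 1], [0, 1]]   # which rows use which pattern
C = [[1, 1, 0], [0, 1, 1]]     # which columns each pattern covers

R = boolean_product(B, C)
print(reconstruction_error(A, R))
```

Because 1 OR 1 = 1, the overlapping center entry is reconstructed as 1, whereas the ordinary matrix product of the same factors would give 2 there.
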
The tropical and subtropical algebras are much less known in the data mining field: while they find applications in areas such as job scheduling and discrete event systems, they are virtually unknown in the context of data analysis. We will use them to obtain idempotent nonnegative factorizations that are similar to NMF but are better at separating the most prominent features of the data.
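
The contrast with NMF-style reconstruction can be sketched as follows, assuming that the subtropical product replaces summation with the maximum (max-times). The factor matrices are made-up toy values, and `standard_product` and `max_times_product` are hypothetical helper names, not functions from the thesis.

```python
def standard_product(B, C):
    """Ordinary matrix product: contributions of all patterns are summed."""
    return [
        [sum(B[i][k] * C[k][j] for k in range(len(C)))
         for j in range(len(C[0]))]
        for i in range(len(B))
    ]

def max_times_product(B, C):
    """Max-times product: only the largest single contribution is kept."""
    return [
        [max(B[i][k] * C[k][j] for k in range(len(C)))
         for j in range(len(C[0]))]
        for i in range(len(B))
    ]

# Two nonnegative patterns that overlap in the middle row and column.
B = [[1.0, 0.0], [1.0, 2.0], [0.0, 2.0]]
C = [[3.0, 3.0, 0.0], [0.0, 2.0, 2.0]]

print(standard_product(B, C))    # overlapping entry mixes both patterns
print(max_times_product(B, C))   # overlapping entry keeps the stronger pattern
```

In this toy example the two products differ only in the overlapping entry: the ordinary product adds both contributions (3 + 4 = 7), while the max-times product keeps the dominant one (4).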

Name(s): Melanie Hans
Email: not disclosed on the web
Video Broadcast
Video Broadcast: No
Tags, Category, Keywords and additional notes
Attachments, File(s):
  • Melanie Hans, 07/05/2019 02:15 PM -- Created document.