MPI-INF Logo
Campus Event Calendar

Event Entry

What and Who

Matrix Factorization over Dioids and its Applications in Data Mining

Sanjar Karaev
MMCI
Promotionskolloquium
AG 1, AG 2, AG 3, INET, AG 4, AG 5, SWS, RG1, MMCI  
Public Audience
English

Date, Time and Location

Wednesday, 10 July 2019
10:30
60 Minutes
E1 5
029
Saarbrücken

Abstract

While classic matrix factorization methods, such as NMF and SVD, are known to be highly effective at finding latent patterns in data, they are limited by the underlying algebraic structure. In particular, it is often difficult to distinguish heavily overlapping patterns because they interfere with each other. To deal with this problem, we study matrix factorization over algebraic structures known as dioids, that are characterized by the idempotency of addition (a + a = a). Idempotency ensures that at every data point only a single pattern contributes, which makes it easier to distinguish them. In this thesis, we consider different types of dioids, that range from continuous (subtropical and tropical algebras) to discrete (Boolean algebra).

The Boolean algebra is, perhaps, the most well known dioid, and there exist Boolean matrix factorization algorithms that produce low reconstruction errors. In this thesis, however, a different objective function is used -- the description length of the data, which enables us to obtain more compact and highly interpretable results.
The tropical and subtropical algebras are much less known in the data mining field. While they find applications in areas such as job scheduling and discrete event systems, they are virtually unknown in the context of data analysis. We will use them to obtain idempotent nonnegative factorizations that are similar to NMF, but are better at separating most prominent features of the data.

Contact

Melanie Hans
5000
--email hidden
passcode not visible
logged in users only

Melanie Hans, 07/05/2019 14:15 -- Created document.