Max-Planck-Institut für Informatik
max planck institut
informatik
mpii logo Minerva of the Max Planck Society
 

MPI-INF or MPI-SWS or Local Campus Event Calendar

<< Previous Entry Next Entry >> New Event Entry Edit this Entry Login to DB (to update, delete)
What and Who
Title:Algorithms for Shared-Memory Matrix Completion and Maximum Inner Product Search
Speaker:Christina Teflioudi
coming from:Max-Planck-Institut für Informatik - D5
Speakers Bio:
Event Type:Promotionskolloquium
Visibility:D1, D2, D3, D4, D5, SWS, RG1, MMCI
We use this to send out email in the morning.
Level:Public Audience
Language:English
Date, Time and Location
Date:Thursday, 14 April 2016
Time:16:00
Duration:60 Minutes
Location:Saarbrücken
Building:E1 4
Room:0.24
Abstract
In this thesis, we propose efficient and scalable algorithms for shared-memory matrix factorization and maximum

inner product search. Matrix factorization is a popular tool in the data mining community due to its ability to
quantify the interactions between different types of entities. It typically maps the (potentially noisy) original
representations of the entities into a lower dimensional space, where the "true" structure of the dataset is revealed.
Inner products of the new vector representations are then used to measure the interactions between different
entities. The strongest of these interactions are usually of particular interest in applications and can be retrieved
by solving a maximum inner product search problem.

For large real-life  problem instances of matrix factorization and maximum inner product search, efficient and
scalable methods are necessary. We first study large-scale matrix factorization in a shared-memory setting and
we propose a cache-aware, parallel method that avoids fine-grained synchronization or locking. In more detail,
our approach partitions the initial large problem into small, cache-fitting sub-problems that can be solved
independently using stochastic gradient descent. Due to the low cache-miss rate and the absence of any locking
or synchronization, our method achieves superior performance in terms of speed (up to 60% faster) and scalability
than previously proposed techniques.

We then proceed with investigating the problem of maximum inner product search and design a cache-friendly
framework that allows for both exact and approximate search. Our approach reduces the original maximum inner
product search problem into a set of smaller cosine similarity search problems that can be solved using existing
cosine similarity search techniques or our novel algorithms tailored for use within our framework. Experimental
results show that our approach is multiple orders of magnitude faster than naive search, consistently faster than
alternative methods for exact search, and achieves better quality-speedup tradeoff (up to 3.9x faster for similar
recall levels) than state-of-the-art approximate techniques.

Contact
Name(s):Daniela Alessi
Phone:5000
EMail:--email address not disclosed on the web
Video Broadcast
Video Broadcast:NoTo Location:
Tags, Category, Keywords and additional notes
Note:
Attachments, File(s):

Created by:Daniela Alessi/MPI-INF, 04/11/2016 02:14 PMLast modified by:halma/MPII/DE, 09/24/2017 12:00 AM
  • Daniela Alessi, 04/11/2016 02:20 PM -- Created document.