Campus Event Calendar

Event Entry

What and Who

Sampling algorithms and coresets for Lp regression

Petros Drineas
Rensselaer Polytechnic Institute, New York, USA
AG1 Mittagsseminar (own work)

Petros Drineas is an assistant professor in the Computer Science Department at Rensselaer Polytechnic Institute. His main research interests are the design and analysis of randomized and approximation algorithms for linear algebraic problems, and their applications to information retrieval and data mining.
AG 1, AG 3, AG 5, RG2, AG 2, AG 4, RG1, SWS  
AG Audience
English

Date, Time and Location

Friday, 24 August 2007
13:30
45 Minutes
E1 4
023
Saarbrücken

Abstract

L2 regression, also known as the least squares approximation problem, and more generally Lp regression, are fundamental problems with numerous applications in mathematics and statistical data analysis. Recall that the over-constrained Lp regression problem takes as input an n × d matrix A, where n > d, an n-vector b, and a number p, and returns as output a d-vector x such that ||Ax - b||_p is minimized. Several randomized algorithms for this problem are described, each of which provides a relative-error approximation to the exact solution. The first algorithm applies novel "subspace sampling" probabilities to randomly sample constraints and thus construct a coreset for the L2 regression problem. The second algorithm (devised by T. Sarlos) applies a random projection and uses a "fast" version of the Johnson-Lindenstrauss lemma to obtain an efficient approximation to the L2 regression problem. The third algorithm generalizes the concepts of a "well-conditioned" basis and of "subspace-preserving sampling" to efficiently obtain a coreset for Lp regression, for all p in [1, ∞).
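The two L2 ideas above can be illustrated in a few lines of NumPy. This is only a minimal sketch, not the algorithms as presented in the talk: the sampling probabilities here are leverage scores (one concrete instance of subspace sampling), the coreset size r is chosen arbitrarily for illustration, and a dense Gaussian matrix stands in for the "fast" Johnson-Lindenstrauss transform, which would instead use a randomized Hadamard transform to make the projection cheap to apply.

```python
import numpy as np

rng = np.random.default_rng(0)

# Over-constrained instance: n >> d.
n, d = 2000, 10
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

# Exact L2 (least squares) solution, minimizing ||Ax - b||_2.
x_exact = np.linalg.lstsq(A, b, rcond=None)[0]

# --- Sketch 1: sampling a coreset with leverage-score probabilities. ---
# The leverage score of row i is the squared norm of row i of U, where
# A = U S V^T is the thin SVD.  Sampling and rescaling rows of (A, b)
# with these probabilities keeps the sampled problem close to the full one.
U, _, _ = np.linalg.svd(A, full_matrices=False)
lev = np.sum(U**2, axis=1)
p = lev / lev.sum()
r = 200                                    # coreset size (illustrative choice)
idx = rng.choice(n, size=r, replace=True, p=p)
w = 1.0 / np.sqrt(r * p[idx])              # standard rescaling weights
x_sample = np.linalg.lstsq(w[:, None] * A[idx], w * b[idx], rcond=None)[0]

# --- Sketch 2: random projection. ---
# A dense Gaussian projection as a stand-in for the fast JL transform.
S = rng.standard_normal((r, n)) / np.sqrt(r)
x_proj = np.linalg.lstsq(S @ A, S @ b, rcond=None)[0]

def resid(x):
    """Residual of a candidate solution on the FULL problem."""
    return np.linalg.norm(A @ x - b)

print(f"exact:      ||Ax-b||_2 = {resid(x_exact):.4f}")
print(f"sampling:   ||Ax-b||_2 = {resid(x_sample):.4f}")
print(f"projection: ||Ax-b||_2 = {resid(x_proj):.4f}")
```

Both approximate solutions are evaluated on the full residual ||Ax - b||_2; with enough sampled rows (or projection dimensions), they land within a small relative factor of the optimum, which is the relative-error guarantee the abstract refers to.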

Contact

Kevin Chang

Tags, Category, Keywords and additional notes

algorithms, massive data sets, data mining

Kevin Chang, 08/01/2007 18:56 -- Created document.