MPI-INF Logo
Campus Event Calendar

Event Entry

What and Who

Clustering to Minimize Cluster-Aware Norm Objectives

Martin Herold
Max-Planck-Institut für Informatik - D1
AG1 Mittagsseminar (own work)
AG 1  
AG Audience
English

Date, Time and Location

Friday, 20 December 2024
13:00
30 Minutes
E1 4
Rotunda, D1
Saarbrücken

Abstract

We initiate the study of the following general clustering problem. We seek to partition a given set $P$ of data points into $k$ clusters by finding a set $X$ of $k$ centers and assigning each data point to one of the centers. The cost of a cluster, represented by a center $x\in X$, is a monotone, symmetric norm $f$ (called inner norm) of the vector of distances of points assigned to $x$. The goal is to minimize a norm $g$ (called outer norm) of the vector of cluster costs. This problem, which we call \NCCS{f}{g}, generalizes many fundamental clustering problems such as $k$-Center (i.e., $(\mathcal{L}_{\infty},\mathcal{L}_{\infty})$-Clustering), $k$-Median (i.e., $(\mathcal{L}_{1},\mathcal{L}_{1})$-Clustering), Min-Sum of Radii (i.e., $(\mathcal{L}_{\infty},\mathcal{L}_{1})$-Clustering), and Min-Load $k$-Clustering (i.e., $(\mathcal{L}_{1},\mathcal{L}_{\infty})$-Clustering). A recent line of

research (Byrka et al. [STOC'18], Chakrabarty, Swamy [ICALP'18, STOC'19], and Abbasi et al. [FOCS'23]) studies norm objectives that are oblivious to the cluster structure such as $k$-Median and $k$-Center. In contrast, our problem models cluster-aware objectives including Min-Sum of Radii and Min-Load $k$-Clustering.

Our main results are as follows. First, we design a constant-factor approximation algorithm for $(\textsf{top}_\ell,\mathcal{L}_{1})$-Clustering where the inner norm ($\textsf{top}_\ell$) sums over the $\ell$ largest distances. This unifies (up to constant factors) the best known results for $k$-Median and Min-Sum of Radii. Second, we design a constant-factor approximation\ for $(\mathcal{L}_{\infty},\textsf{Ord})$-Clustering where the outer
norm is a convex combination of $\textsf{top}_\ell$ norms (ordered weighted norm). This generalizes known results for $k$-Center and Min-Sum of Radii. Obtaining a constant-factor approximation for more general settings that include
$(\mathcal{L}_{1},\mathcal{L}_{\infty})$-Clustering (Min-Load $k$-Clustering) seems challenging because even an $o(k)$-approximation is unknown for this problem. We can still use our two main results to obtain first (although non-constant) approximations for these problems including general monotone, symmetric norms.

Our algorithm for $(\textsf{top}_\ell,\mathcal{L}_{1})$-Clustering relies on a reduction to a novel generalization of $k$-Median, which we call Ball $k$-Median. In this problem, we aim at selecting $k$ balls (rather than $k$ centers) and pay for connecting the points to these balls as well as for the (scaled) radii of the balls. To obtain a constant-factor approximation for this problem we unify various algorithmic techniques originally designed for the cluster-oblivious
$k$-Median objective (Jain and Vazirani [JACM 2001], Li and Svensson [STOC'13]) and for the cluster-aware MinßSum of Radii Objective (Charikar and Panigrahi [STOC'01] and Ahmadian and Swamy [ICALP'16]).

Contact

Nidhi Rathi
+49 681 9325 1134
--email hidden
passcode not visible

Nidhi Rathi, 12/19/2024 21:31 -- Created document.