New for: D1
terings of high-dimensional Euclidean inputs. It is easy to show that there is
no efficient implementation of these algorithms in high dimensional Euclidean
space since it implicitly requires to solve the closest pair problem, a notoriously
difficult problem. However, how fast can this algorithm be implemented if we
allow approximation? More precisely: this algorithm successively merges the
clusters that are at inducing the least sum-of-square error. We ask whether
one could obtain a significant running-time improvement if the algorithm can
merge gamma-approximate closest clusters (namely, clusters that are at distance sum-
of-square error at most gamma times the distance of the closest clusters). We show
that one can indeed take advantage of the relaxation and compute the approxi-
mate hierarchical clustering tree using gamma-approximate nearest neighbor queries.
This leads to an algorithm running in time Otilde(dn) + n^(1+O(gamma)) for d-dimensional
Euclidean space. We then provide experiments showing that this algorithm
performs as well as the non-approximate version for classic classification tasks
while achieving a significant speed-up.
--------------------
Join Zoom Meeting
Meeting ID: 527 278 8807
Note: for people outside D1 interested in listening to this talk, please contact Sándor Kisfaludi-Bak at skisfalu@mpi-inf.mpg.de for the password.