Recovery guarantees for exemplar-based clustering

Abhinav Nellore, Rachel Ward

Research output: Contribution to journal › Article › peer-review

23 Scopus citations

Abstract

For a certain class of distributions, we prove that the linear programming relaxation of k-medoids clustering (a variant of k-means clustering where means are replaced by exemplars from within the dataset) distinguishes points drawn from nonoverlapping balls with high probability once the number of points drawn and the separation distance between any two balls are sufficiently large. Our results hold in the nontrivial regime where the separation distance is small enough that points drawn from different balls may be closer to each other than points drawn from the same ball; in this case, clustering by thresholding pairwise distances between points can fail. We also exhibit numerical evidence of high-probability recovery in a substantially more permissive regime.
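As a rough illustration of the kind of relaxation the abstract refers to, the sketch below solves a standard linear programming relaxation of k-medoids with SciPy. It is not taken from the paper: the function name `kmedoids_lp`, the use of plain Euclidean distances as the cost, and the toy data are all assumptions made for this example, and the paper's exact formulation may differ.

```python
import numpy as np
from scipy.optimize import linprog


def kmedoids_lp(points, k):
    """Solve an LP relaxation of k-medoids (illustrative sketch, not the paper's code).

    Variables: z[i, j] in [0, 1] is the weight with which point i is assigned
    to candidate exemplar j; y[j] in [0, 1] is the weight with which point j
    is chosen as an exemplar.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)

    # Pairwise Euclidean distances (assumed cost; the paper may use another).
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))

    # Variable vector: z flattened row-major (n*n entries), then y (n entries).
    num_vars = n * n + n
    cost = np.concatenate([dist.ravel(), np.zeros(n)])

    # Equalities: each point is fully assigned, and exemplar weights sum to k.
    A_eq = np.zeros((n + 1, num_vars))
    for i in range(n):
        A_eq[i, i * n:(i + 1) * n] = 1.0
    A_eq[n, n * n:] = 1.0
    b_eq = np.concatenate([np.ones(n), [float(k)]])

    # Inequalities: z[i, j] <= y[j], written as z[i, j] - y[j] <= 0.
    A_ub = np.zeros((n * n, num_vars))
    for i in range(n):
        for j in range(n):
            A_ub[i * n + j, i * n + j] = 1.0
            A_ub[i * n + j, n * n + j] = -1.0
    b_ub = np.zeros(n * n)

    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=(0, 1), method="highs")
    z = res.x[:n * n].reshape(n, n)
    y = res.x[n * n:]
    return z, y, res


# Toy data (hypothetical): two small, well-separated groups of three points.
toy_points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
              (10.0, 10.0), (10.1, 10.0), (10.0, 10.1)]
z, y, res = kmedoids_lp(toy_points, k=2)
print(np.round(y, 3))
```

When the solution `y` comes back with entries that are all 0 or 1, the relaxation has recovered an exact k-medoids clustering without any rounding step; each row of `z` then indicates the exemplar to which that point is assigned. The dense constraint matrices are only suitable for small inputs.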

Original language: English (US)
Pages (from-to): 165-180
Number of pages: 16
Journal: Information and Computation
Volume: 245
DOIs
State: Published - Dec 1 2015
Externally published: Yes

Keywords

  • Exact recovery
  • Linear programming
  • Separated balls
  • k-Medoids

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Information Systems
  • Computer Science Applications
  • Computational Theory and Mathematics
