---
title: Approximate Cluster Recovery from Noisy Labels
abstract: 'Designing algorithms for machine learning problems targeting beyond
  worst-case analysis and, in particular, analyzing the effect of side-information
  on the complexity of such problems is a very important line of research with
  many practical applications. In this paper we study the classic $k$-means
  clustering problem in the presence of noisy labels. In this problem, in
  addition to a set of points and parameter $k$, we receive cluster labels of
  each point generated by either an adversarial or a random perturbation of the
  optimal solution. Our main goal is to formally study the effect of this extra
  information on the complexity of the $k$-means problem. In particular, in the
  context of random perturbations, we give an efficient algorithm that finds a
  clustering of cost within a factor $1+o(1)$ of the optimum even when the label
  of each point is perturbed with a large probability (think 99%). In contrast,
  we show that the side-information with adversarial perturbations is as hard
  as the original problem even if only a small $\epsilon$ fraction of the labels
  are perturbed. We complement this negative result by giving a simple algorithm
  in the case when the adversary is only allowed to perturb an $\epsilon$
  fraction of the labels per \emph{each cluster}.'
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: gamlath22a
month: 0
tex_title: Approximate Cluster Recovery from Noisy Labels
firstpage: 1463
lastpage: 1509
page: 1463-1509
order: 1463
cycles: false
bibtex_author: Gamlath, Buddhima and Lattanzi, Silvio and Norouzi-Fard, Ashkan
  and Svensson, Ola
author:
- given: Buddhima
  family: Gamlath
- given: Silvio
  family: Lattanzi
- given: Ashkan
  family: Norouzi-Fard
- given: Ola
  family: Svensson
date: 2022-06-28
address:
container-title: Proceedings of Thirty Fifth Conference on Learning Theory
volume: '178'
genre: inproceedings
issued:
  date-parts:
  - 2022
  - 6
  - 28
pdf:
extras: []
---