2019-06-25-ji19a.md
---
abstract: 'Gradient descent, when applied to the task of logistic regression, outputs iterates which are biased to follow a unique ray defined by the data. The direction of this ray is the maximum margin predictor of a maximal linearly separable subset of the data; the gradient descent iterates converge to this ray in direction at the rate $\cO(\nicefrac{\ln \ln t}{\ln t})$. The ray does not pass through the origin in general, and its offset is the bounded global optimum of the risk over the remaining data; gradient descent recovers this offset at a rate $\cO(\nicefrac{(\ln t)^2}{\sqrt{t}})$.'
section: contributed
title: The implicit bias of gradient descent on nonseparable data
layout: inproceedings
series: Proceedings of Machine Learning Research
id: ji19a
month: 0
tex_title: The implicit bias of gradient descent on nonseparable data
firstpage: 1772
lastpage: 1798
page: 1772-1798
order: 1772
cycles: false
bibtex_author: Ji, Ziwei and Telgarsky, Matus
author:
- given: Ziwei
  family: Ji
- given: Matus
  family: Telgarsky
date: 2019-06-25
address:
publisher: PMLR
container-title: Proceedings of the Thirty-Second Conference on Learning Theory
volume: '99'
genre: inproceedings
issued:
  date-parts:
  - 2019
  - 6
  - 25
pdf:
extras:
---
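
The decomposition described in the abstract can be observed numerically. Below is a minimal NumPy sketch, illustrative only and not the authors' code; the toy dataset, step size, and iteration counts are assumptions chosen for demonstration. The nonseparable examples are confined to the x-axis, so the iterates should grow without bound along the ray direction $(0, 1)$ while the first coordinate settles at a bounded offset determined by the remaining data.

```python
# Illustrative sketch of the phenomenon in the abstract (not the paper's code):
# gradient descent on logistic regression over data whose nonseparable part
# lies on the x-axis. The iterate norm grows, its direction approaches (0, 1),
# and the first coordinate converges to a bounded offset.
import numpy as np

# Maximal linearly separable subset: correctly classified by u = (0, 1).
X_sep = np.array([[0.0, 1.0], [1.0, 2.0], [0.0, -1.0]])
y_sep = np.array([1.0, 1.0, -1.0])
# Remaining data: on the x-axis with conflicting labels, so no direction
# attains a positive margin on all of it.
X_ns = np.array([[1.0, 0.0], [2.0, 0.0], [-1.0, 0.0]])
y_ns = np.array([1.0, -1.0, 1.0])

X = np.vstack([X_sep, X_ns])
y = np.concatenate([y_sep, y_ns])

def grad_logistic_risk(w):
    """Gradient of the empirical risk (1/n) * sum_i ln(1 + exp(-y_i <x_i, w>))."""
    margins = y * (X @ w)
    weights = 1.0 / (1.0 + np.exp(margins))  # sigmoid(-margin) per example
    return -(X * (weights * y)[:, None]).mean(axis=0)

w = np.zeros(2)
eta = 0.5  # constant step size (assumption; small enough for the smooth logistic risk)
for t in range(1, 100_001):
    w = w - eta * grad_logistic_risk(w)
    if t in (10, 100, 1_000, 10_000, 100_000):
        direction = w / np.linalg.norm(w)
        print(f"t={t:>6}  ||w||={np.linalg.norm(w):7.3f}  "
              f"direction=({direction[0]:+.3f}, {direction[1]:+.3f})  w1={w[0]:+.4f}")
```

On this toy data the offset coordinate $w_1$ should stop moving after a few thousand steps while $\|w\|$ keeps growing along $(0, 1)$, consistent with the two rates quoted in the abstract: direction convergence at $\cO(\nicefrac{\ln \ln t}{\ln t})$ and offset recovery at $\cO(\nicefrac{(\ln t)^2}{\sqrt{t}})$.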