Left: exit-time rates, comparing measurements against our analytical formulae. Right: comparison between simulations, ODE integration and SDE integration, all starting from the same initial conditions.
This study explores the sample complexity for two-layer neural networks to learn a single-index target function under Stochastic Gradient Descent (SGD), focusing on the challenging regime where many flat directions are present at initialization.
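As an illustration of the setting, here is a minimal, self-contained sketch of online SGD for a two-layer network learning a hard single-index target (one whose link function has vanishing low-order Hermite coefficients, so the correlation with the teacher is flat at initialization). This is not the repository's implementation; all names and hyperparameters below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d, p = 100, 4                        # input dimension, hidden width

# Single-index teacher direction and a "hard" link function:
# He3(z) = z^3 - 3z has zero first and second Hermite coefficients,
# so the gradient along the teacher direction is flat at initialization.
w_star = rng.standard_normal(d)
w_star /= np.linalg.norm(w_star)

def target(x):
    z = x @ w_star
    return z**3 - 3.0 * z

# Two-layer student: trainable first layer, fixed second layer.
W = rng.standard_normal((p, d)) / np.sqrt(d)
a = np.ones(p) / p

lr = 0.01
for _ in range(5000):
    x = rng.standard_normal(d)       # fresh sample each step: online SGD
    pre = W @ x
    err = a @ np.tanh(pre) - target(x)
    # gradient of the per-sample loss 0.5 * err**2 w.r.t. W
    W -= lr * np.outer(err * a * (1.0 - np.tanh(pre) ** 2), x)

# Overlap of each student neuron with the teacher direction
overlap = np.abs(W @ w_star) / np.linalg.norm(W, axis=1)
```

Tracking `overlap` over time is one way to measure the exit time from the flat region, in the spirit of the notebooks below.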
- `committee_learning/`: Python package containing all the code, both for simulation and for ODE integration.
- `how_to_simulate.ipynb`: notebook with an example of SGD dynamics simulation, and ODE & SDE integration.
- `how_to_measure_exit_time.ipynb`: notebook with an example of exit time measurement.
- `computation-database/`: folder for previously generated data.
- `mathematica/`: Mathematica scripts for computing the explicit ODEs.
```shell
# Clone the repo (with submodules!)
git clone --recurse-submodules https://github.com/IdePHICS/EscapingMediocrity
cd EscapingMediocrity/

# Install Python requirements
pip install -r requirements.txt

# Install the committee_learning package (it requires g++)
pip install -e committee_learning/
```
Luca Arnaboldi, Florent Krzakala, Bruno Loureiro, Ludovic Stephan. *Escaping mediocrity: how two-layer networks learn hard single-index models with SGD*, 2023. https://arxiv.org/abs/2305.18502