Reducing memory footprint #213

sethaxen · 2024-10-23T20:35:18Z

Summary

Pathfinder stores nearly all intermediate computations for inspection purposes, which for high-dimensional targets can result in an excessive memory footprint. This issue proposes a non-breaking refactor that would significantly reduce this footprint.

Background

(Multi-path) Pathfinder runs in 2 phases:

Phase 1: runs L-BFGS, storing the trace of positions and gradients. Worst-case storage requirements are O(nruns * maxiters * dim).
Phase 2: computes the inverse Hessian approximation at each iteration, draws ndraws_elbo draws per iteration to estimate the ELBO, draws ndraws_per_run draws from the ELBO-maximizing MvNormal approximation, and then stores ndraws draws. Worst-case storage requirements are O(nruns * maxiters * dim * (2 * history_length + ndraws_elbo) + nruns * ndraws_per_run * dim + ndraws * dim).

Proposal

These phases can be interleaved. In the optimization callback, we could store just the current state, as well as the ELBO-maximizing multivariate normal approximation found so far. This would reduce worst-case storage requirements to O(nruns * dim * (2 * history_length + ndraws_elbo + ndraws_per_run) + ndraws * dim), effectively eliminating maxiters from each of the previous expressions. With default settings, this could potentially reduce the memory footprint by roughly 1,000-fold.
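To make the comparison concrete, here is a minimal sketch that evaluates the worst-case expressions above as raw element counts. It is written in Python purely as a calculator and is not part of the package; the function names and the example settings are assumptions chosen for illustration, not Pathfinder's actual defaults.

```python
def two_phase_elements(nruns, maxiters, dim, history_length,
                       ndraws_elbo, ndraws_per_run, ndraws):
    """Worst-case number of stored elements with separate phases 1 and 2."""
    # Phase 1: trace of positions and gradients (constant factors dropped,
    # as in the O(...) expression).
    phase1 = nruns * maxiters * dim
    # Phase 2: per-iteration fits and ELBO draws, per-run draws, final draws.
    phase2 = (nruns * maxiters * dim * (2 * history_length + ndraws_elbo)
              + nruns * ndraws_per_run * dim
              + ndraws * dim)
    return phase1 + phase2


def interleaved_elements(nruns, dim, history_length,
                         ndraws_elbo, ndraws_per_run, ndraws):
    """Worst-case number of stored elements when only the current state
    and the best approximation so far are kept."""
    return (nruns * dim * (2 * history_length + ndraws_elbo + ndraws_per_run)
            + ndraws * dim)


# Illustrative settings only (assumed, not the package defaults).
settings = dict(nruns=8, dim=10_000, history_length=6,
                ndraws_elbo=5, ndraws_per_run=5, ndraws=40)

before = two_phase_elements(maxiters=1_000, **settings)
after = interleaved_elements(**settings)
print(f"two-phase:   {before:,} elements")
print(f"interleaved: {after:,} elements")
print(f"reduction:   {before / after:.0f}x")
```

The exact reduction depends on the settings: it grows with maxiters and shrinks as ndraws_per_run and ndraws become large relative to the per-iteration terms.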

I think it's possible to make these changes in a non-breaking way. The plan is to introduce a keyword argument save_trace=true, which could be set to false to:

avoid storing draws in ELBOEstimates,
avoid storing all fit_distributions, and
avoid storing OptimizationTraces.

In the future we may consider the breaking change of defaulting to save_trace=false.

Concretely, we would introduce internal structs with names like LBFGSState and PathfinderState and update these in-place within OptimizationCallback. The biggest changes would be to refactor many of the utility functions to mutate provided storage.
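The callback pattern described above can be sketched as follows. This is a language-agnostic illustration in Python, not the package's Julia code: `PathfinderState` here only borrows the proposed name, and `make_callback`, `fit_and_score`, and the callback signature are hypothetical. The sketch also keeps object references rather than mutating preallocated buffers, which the actual refactor would likely do.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Tuple


@dataclass
class PathfinderState:
    """Hypothetical running state kept per run instead of a full trace."""
    iteration: int = -1
    best_iteration: int = -1
    best_elbo: float = float("-inf")
    best_fit: Any = None
    # Only populated when save_trace is true.
    trace: Optional[List[Tuple[int, Any, float]]] = None


def make_callback(fit_and_score: Callable[[Any, Any], Tuple[Any, float]],
                  save_trace: bool = True):
    """Build an optimizer callback that updates a single state object.

    fit_and_score(x, grad) is an assumed helper returning a fitted
    approximation and its ELBO estimate for the current iterate.
    """
    state = PathfinderState(trace=[] if save_trace else None)

    def callback(iteration, x, grad):
        fit, elbo = fit_and_score(x, grad)
        state.iteration = iteration
        # Keep only the ELBO-maximizing approximation seen so far.
        if elbo > state.best_elbo:
            state.best_elbo = elbo
            state.best_fit = fit
            state.best_iteration = iteration
        # Optionally retain everything for inspection.
        if state.trace is not None:
            state.trace.append((iteration, fit, elbo))

    return state, callback
```

With `save_trace=False`, memory use per run no longer grows with the number of iterations, since each new fit replaces the previous one unless it is the best so far.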

sethaxen linked a pull request Oct 25, 2024 that will close this issue

Reduce memory footprint #218

Draft
