Caching NuGet (Windows) Python installations: Doc guidance or option in GitHub Action? #1527

Open
smheidrich opened this issue Jun 18, 2023 · 0 comments

smheidrich commented Jun 18, 2023

Description

Preamble

In general, I think a lot of build time could be saved if cibuildwheel had better built-in functionality to make caching easier or automatic. But before I create a generic ticket that goes nowhere because its scope is too large, let me start with something specific that might turn out to be low-hanging fruit:

Feature request

I think it would be nice if either

  1. cibuildwheel's GitHub Action came with optional caching (via actions/cache) of the NuGet Python installations it performs on Windows, or
  2. there was more guidance in the documentation on how to implement such caching oneself, ideally as copy-pasteable as the normal example CI configs (or maybe even part of them?).

In either case, it would of course be even better to extend this to other CI services (though I guess only (2) applies to them).

There is precedent for (1) in other popular actions, e.g. setup-python comes with automatic caching of dependencies.

Example of impact on build time

For small builds on GitHub Actions' Windows runners, cibuildwheel will spend a significant fraction of the total build time installing different Python versions from NuGet. Since cibuildwheel already has logic to avoid re-downloading available Python versions, it's not that difficult for users to cache them when using GitHub Actions. I tried this here for a dummy repo building the same sample project that cibuildwheel builds during its own tests. Comparing the runtimes to those without caching, we see that this shaves off around 4 minutes on average:

| Run # | Build time w/o cache (minutes) | Build time w/ cache (minutes) |
|---|---|---|
| 1 | 9.5 | 7.7 |
| 2 | 12.4 | 8.9 |
| 3 | 14.7 | 7.35 |
| Average | 12.2 | 8.0 |

Runs with and without the cache were alternated to more or less rule out that the difference stems from one group happening to run at a time when the runners performed better or worse than they did for the other group.

This is despite GitHub Actions' cache being infamously slow to restore on Windows (~2 minutes in the example above), so perhaps more could be gained on other CI services.
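For illustration, here is a minimal sketch of what such user-side caching could look like in a GitHub Actions workflow. It is not a copy of the CI config linked below, and several details are assumptions: that cibuildwheel keeps its NuGet Python installations under the directory given by the `CIBW_CACHE_PATH` environment variable, the directory `C:\cibw-cache`, the cache key, and the pinned action versions are all placeholders to adapt.

```yaml
# Sketch only: paths, cache key, and action versions are assumptions.
name: Build wheels (Windows, cached Python installations)

on: [push, pull_request]

jobs:
  build_wheels:
    runs-on: windows-latest
    env:
      # Assumption: cibuildwheel stores the tools it downloads (including
      # the NuGet Python installations) under CIBW_CACHE_PATH.
      CIBW_CACHE_PATH: 'C:\cibw-cache'
    steps:
      - uses: actions/checkout@v4

      # Restore the cache directory before cibuildwheel runs; actions/cache
      # saves it again at the end of the job if the key was not found.
      - uses: actions/cache@v4
        with:
          path: 'C:\cibw-cache'
          # Hypothetical key: bump the suffix whenever the cibuildwheel
          # version (and thus the set of Python versions) changes.
          key: cibw-cache-${{ runner.os }}-v1

      # Pin whichever cibuildwheel release you actually use.
      - uses: pypa/cibuildwheel@v2.13.1

      - uses: actions/upload-artifact@v4
        with:
          path: ./wheelhouse/*.whl
```

A static key like this never invalidates on its own, so a stale cache just means cibuildwheel downloads whatever is missing, as in an uncached run.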

Other ideas (future)

There are many other avenues for better caching support or documentation. Virtualenvs could be cached as well, to hopefully shave even more off the build times above. Caching Python installations on macOS would be a bit more difficult since they use the system installer, but maybe it would be possible by allowing users to choose an alternative installer (Homebrew?). And of course there could be better support for allowing users to cache their own dependencies and additional build tools (which I've only accomplished via some crazy hacks so far). But there should be separate tickets for all of those (and maybe an "overview ticket" on caching in general, if you agree that this is something that should be worked on?).

What makes caching Python installations specifically attractive is that they are completely managed by cibuildwheel and should be unaffected by anything the user's build logic does, so it should be fairly safe (in contrast to, say, caching virtualenvs).

Thoughts?

Build log

https://github.com/smheidrich/cibuildwheel-cache-nuget/actions/runs/5304286464/jobs/9603413251

CI config

https://github.com/smheidrich/cibuildwheel-cache-nuget/blob/d41d7835d558a3f28096b763396a651f448e6a11/.github/workflows/build.yml
