Improve CI performance for Windows #2600

jchadwick-buf · 2023-11-17T17:31:21Z

Sometimes the existing workflow runs quickly - sometimes it runs incredibly slowly.

This seems to come down to a disk I/O issue. The cache tarball downloads very quickly, but then takes minutes to untar. It's unclear why. I suspect there is something in particular wrong with the home directory in the Windows runner. In addition, go test itself seems to have unreliable performance characteristics.

Merely moving the Go cache and path into the GitHub Actions _temp dir seems to vastly improve consistency. I ran the workflow multiple times to collect some timings:

Attempt #1: 3m 2s, no cache
Attempt #2: 3m 1s, cached
Attempt #3: 2m 45s, cached
Attempt #4: 4m 1s, cached
Attempt #5: 2m 44s, cached
Attempt #6: 2m 45s, cached

The cache restore times are reliably counted in seconds. There is still some run-to-run variance, but pathological behavior seems to have disappeared, both for go test and cache restore. While this doesn't bring performance quite up to par with the other workflows, it does seem to help quite a bit.
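
For reference, here is a minimal sketch of the kind of change described above, assuming the variables are set in the job-level env block. Only the GOPATH value appears in this PR's diff; the GOCACHE path and the runs-on label are illustrative assumptions, not copied from the workflow.

```yaml
jobs:
  test:
    runs-on: windows-latest  # assumption: the actual runner label may differ
    env:
      # Value shown in this PR's diff.
      GOPATH: 'd:\a\_temp\go\work'
      # Hypothetical path: moves the Go build/test cache out of the home directory.
      GOCACHE: 'd:\a\_temp\go\cache'
```

With both variables set, go test should write build and test results under GOCACHE and downloaded modules under GOPATH, so neither ends up in the home directory.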

I'm really hoping I didn't miss anything here. Any reason why I may have tricked myself into thinking I solved the problem? :)

doriable

This seems reasonable to me, just added a nit :) Thanks for looking into this!

jchadwick-buf · 2023-11-17T18:32:00Z

(Re-requesting your reviews to make sure I handled your comments satisfactorily.)

pkwarren · 2023-11-17T18:34:09Z

.github/workflows/windows.yaml

@@ -7,23 +7,25 @@ jobs:
  test:
    env:
      DOWNLOAD_CACHE: 'd:\downloadcache'
+      GOPATH: 'd:\a\_temp\go\work'


I think what we're after is putting both the Go build/test cache and the module cache in this directory. In that case I'd rather we use GOMODCACHE instead of GOPATH here (just in case anything else sneaks in under GOPATH).

Alright, sounds good. Since this basically involves breaking the cache again, I'm going to go ahead and take a shot at making the setup-go@v4 path work; these are the two directories that setup-go@v4 caches, so it ought to be at least similar.
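
A rough sketch of what that combination might look like, assuming GOCACHE and GOMODCACHE are set at the job level so that setup-go@v4 picks them up when it resolves the directories to cache. The two paths, the go-version value, the runs-on label, and the test command are all hypothetical placeholders rather than the contents of the real workflow.

```yaml
jobs:
  test:
    runs-on: windows-latest  # assumption: the actual runner label may differ
    env:
      # Hypothetical paths: build/test cache and module cache, set
      # individually instead of relocating everything via GOPATH.
      GOCACHE: 'd:\a\_temp\go\cache'
      GOMODCACHE: 'd:\a\_temp\go\mod'
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v4
        with:
          go-version: '1.21.x'  # assumption: a floating patch version
          cache: true           # cache the two directories set above
      - run: go test ./...      # placeholder for the project's real test step
```

Setting the two variables individually keeps anything else that lands under GOPATH (for example, binaries installed into GOPATH\bin) out of the relocated directory.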

pkwarren

Looks good - just one small suggestion.

jchadwick-buf · 2023-11-17T19:01:46Z

OK, now with setup-go@v4 caching. Hopefully this will be a lot more pleasing to people.

Performance so far:

Attempt #1: 4m 12s, uncached
Attempt #2: 3m 9s, uncached (go1.21.4 was released between these two runs, so the cache from the first run did not hit)
Attempt #3: 3m 8s, cached
Attempt #4: 2m 43s, cached
Attempt #5: 3m 12s, cached

OK, still looks good. Nice.

One last thing would be to check whether we really need the silly _temp dir. @pkwarren has been instrumental in finding underlying issues relating to this problem and found actions/setup-go#393, which suggests that it's not that the temp directory is fast, it's that C: is slow. So in theory, we can move the cache dirs to just D: and it will still be fast.

jchadwick-buf · 2023-11-17T19:32:47Z

And the verdict is in: no weird _temp directory needed.

For more information on what's going on, see this issue. Although it's not said explicitly there, I have a hunch about what the problem might be. The C: drive obviously needs to contain the operating system and basically all of the runner image defaults, whereas D: is just a fresh, new drive. Internally, C: is probably created from a snapshot, and when a disk is created from a snapshot in a VM, slow performance is pretty typical. In fact, even the Azure documentation references this. It's not clear that this is exactly how GitHub Actions works internally, but suffice it to say that the constraint that the C: drive must contain all of the runner image contents probably makes it a lot slower due to needing copy-on-write behavior.
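
As an illustration only, the simplified layout could look roughly like the fragment below, which belongs under a job definition. The directory names are hypothetical (this conversation does not quote the final paths), and the extra step is just one way to confirm in the job log which locations Go actually resolved.

```yaml
env:
  # Hypothetical paths directly on D:, with no _temp directory involved.
  GOCACHE: 'd:\gocache'
  GOMODCACHE: 'd:\gomodcache'
steps:
  - name: Show effective Go cache locations
    # Prints the resolved value of each variable, one per line.
    run: go env GOCACHE GOMODCACHE
```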

So now, and hopefully for the last time...

Attempt #1: 5m 15s, uncached
Attempt #2: 3m 43s, cached
Attempt #3: 2m 51s, cached
Attempt #4: 3m 40s, uncached (unknown reason; maybe we're near the cache limit)
Attempt #5: 2m 55s, cached

Phew. This one was really tricky and had a lot of unexpected twists and turns. Thanks @pkwarren for the help figuring things out, and I hope everyone feels I addressed their comments sufficiently. I'll leave this open for a few more minutes to make sure nobody else has anything to say.

Improve cache restore time for Windows

4d834ca

jchadwick-buf requested review from pkwarren and doriable November 17, 2023 17:31

pkwarren reviewed Nov 17, 2023

doriable reviewed Nov 17, 2023

jchadwick-buf added 2 commits November 17, 2023 13:08

Try using env instead of GITHUB_ENV script.

57c1601

Add comment regarding the performance kludge

997769b

jchadwick-buf requested review from pkwarren, doriable and emcfarlane November 17, 2023 18:31

pkwarren reviewed Nov 17, 2023

May as well try to use setup-go@v4 cache

3355953

pkwarren approved these changes Nov 17, 2023

One last cleanup

c00cb9c

pkwarren approved these changes Nov 17, 2023

jchadwick-buf merged commit 11089b9 into main Nov 17, 2023
7 checks passed

jchadwick-buf deleted the jchadwick/faster-windows-ci branch November 17, 2023 19:39
