extremely slow Network and Disk IO on Windows agent compared to Ubuntu/Mac #3577
Comments
Hello, @jetersen. |
Well, the repro was super easy actually: https://github.com/jetersen/dotnet.restore.slow.github.action Locally on my machine the cache speeds up the restore to under 1 second. Starting dotnet restore also seems relatively slow when called through the pwsh shell 😰 |
The scary part is that the slowness is consistent at least according to the dotnet restore timings. On bigger projects this time scale goes into minutes and we would expect that the cache would speed up the restore significantly. |
Well, Linux and macOS do not have this issue. Ubuntu-20.04, no cache: 14s. This is more along the lines of what I would expect. There is no improvement between runs on Windows with the cache... |
In the log you can clearly see that the timescale on Windows does not make sense, but it is consistent.
|
@al-cheb Hope the repro is good enough. We were seeing similar behavior in our private repo on our unit test project, but also on some of our projects with more dependencies. We do restore our internal packages from an internal NuGet source, but that would not affect caching. |
@jetersen, Looks like it doesn't work only with
|
The good thing is that you now have a good repro case to fix it, and hopefully it won't reappear in the next dotnet release 😅 |
Hi @jetersen, dotnet 6.0 (preview version) also works properly. Looks like this bug affects only the latest patch version of 5.0 |
@vsafonkin odd, I guess someone else already caught the regression 😅 |
@vsafonkin @al-cheb I think we can close it as you mentioned it only affects the latest Thank you for investigating the issue. |
This is still happening with the latest version of .NET (6.0.x) -- I have projects consistently taking multiple minutes to restore a minimal set of dependencies. |
I have noticed the same thing recently. |
Just saw a 1 minute restore for very basic test packages: |
My latest test shows that it's not just
The build and test steps are close enough, but all of the actions that require some sort of network transfer are extremely slow. Is this a known issue with Windows environments, and if so, can something be added to documentation somewhere to indicate that the huge performance hit is expected? |
@SteveDesmond-ca nice comparison, I can definitely attest to this based on our private repos. |
Ya, okay, now the issue is no longer dotnet restore; networking is a significant issue in the test case I built: action: skip telemetry and other slow downs. Even in the update dependencies action, restoring new dependencies is slow on Windows, so the network seems to be an issue. Checking the output of the cache, which includes download speeds, something is off: Ubuntu
Windows
Sometimes ubuntu seems a lot faster at download:
Could this be fixed by tweaking TcpAckFrequency and TcpNoDelay on the Windows VM? Perhaps update the issue title to: slow network transfer on Windows agent. It is perhaps worth reconsidering #4424 to have DotNet 6.0 installed, as it is the latest LTS. |
I had cases when restore took a long time when nuget.org was not available to check the signatures. Setting env var NUGET_CERT_REVOCATION_MODE to offline solved that particular problem. Maybe it is the same issue? https://docs.microsoft.com/en-us/nuget/reference/errors-and-warnings/nu3028 |
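A minimal sketch of how the effect of this setting could be measured, assuming `dotnet` is on the PATH and the restore is run from the project directory. `NUGET_CERT_REVOCATION_MODE` is the variable named above; the helper `timed_run` and the constant `OFFLINE_REVOCATION` are hypothetical names introduced only for illustration.

```python
import os
import subprocess
import time


def timed_run(cmd, extra_env=None):
    """Run cmd with extra environment variables; return (seconds, exit code)."""
    # Start from the current environment so PATH and friends are preserved.
    env = dict(os.environ)
    env.update(extra_env or {})
    start = time.perf_counter()
    completed = subprocess.run(cmd, env=env, check=False)
    return time.perf_counter() - start, completed.returncode


# Environment override suggested in the comment above (hypothetical constant name).
OFFLINE_REVOCATION = {"NUGET_CERT_REVOCATION_MODE": "offline"}
```

Comparing `timed_run(["dotnet", "restore"])` against `timed_run(["dotnet", "restore"], OFFLINE_REVOCATION)` on the same runner would show whether revocation checks account for a meaningful share of the restore time.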
@jetersen, we are still investigating it. Actually, I'm confused that downloading via curl has very similar performance. For example, a PowerShell script (download ~200 Mb 200 times):
|
@jetersen , @SteveDesmond-ca , Could you check a restore step with params?
|
@al-cheb that does not resolve the fact that downloading .NET 6 is also slower. |
Ubuntu agents have slightly higher IOPS disk performance configuration. We use the install-dotnet.ps1 script provided by the DotNet team for installation. The |
@al-cheb jetersen/dotnet.restore.slow.github.action@f342429 does indeed help with NuGet restore. I used a different approach to what you mentioned. |
@jetersen, as we see it, this issue is not related to the network configuration of the Windows runner. We have tested a Windows image from Azure Marketplace and got the same network performance. It looks like the problem comes from the .NET side, because we see performance degradation with newer versions of .NET (for example, restore on .NET 5.0 is slower than restore on .NET 3.1). Also there is an issue with precached NuGet packages placed on As @al-cheb mentions above, Ubuntu agents have slightly higher IOPS disk performance configuration. |
I think Of course there is still the issue with |
Yes, but to install the .NET SDK, the https://docs.microsoft.com/en-us/dotnet/core/tools/dotnet-install-script |
Closed, because there is nothing we can do from our side. |
Created actions/setup-dotnet#260 😓 |
We have I/O perf issues as well, and it's not related to .NET. For example, when using the Windows: It took Windows 428 seconds to restore 1337MB. Linux: Although Linux had to restore only 1074MB, it took only 34 seconds. As you can see, Windows was actually downloading files from the cache faster than Linux. However, I suspect that what took the longest was extracting them: to us, it seems to be more of a disk I/O issue than a network one. This is one big example, but we see Windows agents being much slower in other things that perform disk I/O, such as pulling/pushing Docker images. |
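One way to probe the disk I/O suspicion is a small-file write benchmark run on both runner types. The Python sketch below is only an illustration: the function name, file count, and file size are assumptions, and writing many small files only approximates what extracting a cache archive does.

```python
import os
import tempfile
import time


def small_file_write_rate(file_count=2000, file_size=4096, directory=None):
    """Write file_count files of file_size bytes; return files written per second."""
    payload = os.urandom(file_size)
    # The temporary directory is removed automatically when the block exits.
    with tempfile.TemporaryDirectory(dir=directory) as workdir:
        start = time.perf_counter()
        for index in range(file_count):
            path = os.path.join(workdir, f"file_{index}.bin")
            with open(path, "wb") as handle:
                handle.write(payload)
        elapsed = time.perf_counter() - start
    return file_count / elapsed
```

Running the same call on a Windows and an Ubuntu runner and comparing the two rates would give a rough, network-independent indication of how far apart the disks are for this access pattern.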
@ItalyPaleAle windows and ubuntu runners are different in terms of disk allocation and that may explain the issue. |
@miketimofeev the performance difference is very significant, however. I don't know what disks are used, but Linux agents processed 1000MB in 32s and Windows ones in 320s in our tests above, so 10x slower. (I am aware that it's not apples-to-apples as the files in the caches aren't exactly the same, but this difference should be significant regardless) |
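To put the cache restore figures quoted earlier in the thread side by side (1337 MB in 428 s on Windows, 1074 MB in 34 s on Linux), the effective end-to-end rates can be worked out as below. This is plain arithmetic on the reported numbers; the function name is made up for illustration, and the rates include extraction time, so they are not pure network throughput.

```python
def throughput_mb_per_s(megabytes, seconds):
    """Effective end-to-end rate for a cache restore, in MB per second."""
    return megabytes / seconds


# Figures reported in the thread for one cache restore on each platform.
windows = throughput_mb_per_s(1337, 428)
linux = throughput_mb_per_s(1074, 34)

print(f"Windows: {windows:.1f} MB/s")
print(f"Linux:   {linux:.1f} MB/s")
print(f"Linux is about {linux / windows:.1f}x faster")
```

The ratio comes out at roughly an order of magnitude, in line with the "10x slower" estimate above.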
This is very frustrating. NuGet downloads were slow on Windows, sometimes even timing out, so to make them faster and more reliable I wanted to use the cache, only to find out that the cache action is also very slow on Windows. |
Unfortunately, this is still a problem as of July/2023. The |
I've got a pipeline that I run against Linux, Mac and Windows to test cross platform-ness. The Linux and Mac ones take a few minutes. The windows one can take 7, 8 or 9 minutes. |
Yeah, Windows is still like 11 minutes slower than Linux for us. |
I also see similar timings: My build times are overall significantly slower in Windows. A |
Description
Actually we are seeing the same behavior on GitHub Actions: running a shell command for dotnet restore takes a very long time on Windows even when using actions/cache 😅
Originally posted by @jetersen in #1733 (comment)
I cannot replicate this locally, where a restore with a full NuGet cache takes less than a second.
Area for Triage:
.NET Core
Question, Bug, or Feature?:
Bug
Virtual environments affected
Image version
Expected behavior
actions/cache should speed up dotnet restore to take a few seconds
Actual behavior
Dotnet restore even with actions/cache takes well over 30 seconds.
Repro steps
https://github.com/jetersen/dotnet.restore.slow.github.action