Tests fail intermittently with different exceptions #2563

Closed
gao-artur opened this issue Mar 14, 2024 · 12 comments

Comments

@gao-artur

We have infrastructure for running integration tests against third-party dependencies running in k8s pods. As part of the test init process, we start different pods (MySQL, RabbitMQ, MongoDB, etc.) and then run our test suites against these pods. Sometimes, the initialization process fails with strange exceptions that, I believe, obscure the real reason for the failure. For example, we observed such behavior in the past when pods were killed because of OOM. Here are some examples of these exceptions:

Class Initialization method XXX.ClassInit threw exception. System.BadImageFormatException: Index not found. (0x80131124).

Initialization method XXX.TestInit threw exception. System.Security.SecurityException: Invalid assembly public key. (0x8013141E).

Class Initialization method XXX.ClassInit threw exception. System.BadImageFormatException: Could not load file or assembly '�#, Version=32002.1205.1024.18, Culture=neutral, PublicKeyToken=01c0628b004400c06293007701c1628b09883fe0628b004400e0629300770100638b00440000639300770104638b09c72020638b00440020639300770124638b09c72040638b00440040639300770160638b00440060639300770181638b09c720816313004400a1631300440041648b09c72060648b00'. Index not found. (0x80131124).

Assembly Initialization method XXX.AssemblyInit threw exception. System.InvalidOperationException: The DbContext of type '��.' cannot be pooled because it does not have a public constructor accepting a single parameter of type DbContextOptions or has more than one constructor.. Aborting test execution.

We executed tests with --blame-crash --blame-crash-dump-type full flags. But I wasn't able to debug the created dump. Let me know how I can send it to you if it's useful (~80MB).

Additional info:

.NET SDK v7.0.313

Microsoft.NET.Test.Sdk v17.6.3
MSTest.TestAdapter v3.1.1
MSTest.TestFramework v3.1.1

@Evangelink
Member

You mention .NET SDK 7. Should I assume that your test project targets .NET 7, or is it a .NET Framework (netfx) target? Would it be possible to get the diagnostic logs (VSTest) of both a working and a failing build?
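For readers who want to collect the same data, here is a minimal sketch of a run that combines the blame flags mentioned in the issue with a VSTest diagnostic log. The helper name run_tests_with_diagnostics, the log file name vstest-diag.log, and the way the log directory is passed in are assumptions made for illustration; only the --blame-crash, --blame-crash-dump-type, and --diag options belong to dotnet test.

```shell
# Hypothetical helper (not part of any tool): run the tests with crash dumps
# and a VSTest diagnostic log enabled.
#   $1 = solution or project to test
#   $2 = directory that will receive the diagnostic log
run_tests_with_diagnostics() {
  target="$1"
  log_dir="$2"

  # Make sure the log directory exists before the run starts.
  mkdir -p "$log_dir"

  # --blame-crash and --blame-crash-dump-type full are the flags quoted in the issue;
  # --diag asks the test platform to write its diagnostic log to the given file.
  dotnet test "$target" \
    --blame-crash \
    --blame-crash-dump-type full \
    --diag "$log_dir/vstest-diag.log"
}
```

A call such as run_tests_with_diagnostics XXX.sln ./diag-logs, done once for a passing run and once for a failing run, would give the two sets of logs to compare.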

@gao-artur
Author

All projects are targeting .NET 7.
Can I send you the logs via a private channel? I'd prefer not to share them publicly.

@Evangelink
Member

The best way is to report issues through Developer Community. I recommend looking at https://learn.microsoft.com/cpp/overview/how-to-report-a-problem-with-the-visual-cpp-toolset?view=msvc-170#to-create-a-problem-report-for-private-information, which explains well how to add private information.

@Evangelink
Member

Could you run some tests with our runner too? It's possible that the behavior would be different, since you would not be using the VSTest platform.

@gao-artur
Author

Sure. Will try next week.

@gao-artur
Author

Hi. I'm currently busy with other tasks and can't run the tests with the new runner, but I am very interested in finding the root cause of the failures. Please keep this issue open; I hope to provide more info during the next month.

@REscobar

REscobar commented Jun 5, 2024

I'm having the same issue when running tests during a Docker container build. It seems to be a race condition during assembly instrumentation, because when it's run in a single-core environment it completes successfully every time.

I'm forcing the build to run in "single core" mode by limiting the number of cores available to WSL to 1.
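One way to get such a single-core setup, assuming WSL 2, is the processors setting in the [wsl2] section of the global .wslconfig file. The sketch below only writes that fragment to a path you choose; the helper name write_single_core_wslconfig is hypothetical, and the thread does not say that this is exactly how the limit was applied.

```shell
# Hypothetical helper: write a .wslconfig that limits the WSL 2 VM to one
# logical processor.
#   $1 = file to write. On Windows the real file normally lives at
#        %UserProfile%\.wslconfig. Any existing content is overwritten,
#        so merge by hand if the file already has other settings.
write_single_core_wslconfig() {
  cat > "$1" <<'EOF'
[wsl2]
processors=1
EOF
}
```

After changing .wslconfig, WSL has to be restarted (for example with wsl --shutdown from Windows) before the new processor count takes effect.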

An additional exception that happens to me at random, in addition to all of those mentioned above, is:

#15 9.178 Data collector 'XPlat code coverage' message: [coverlet]Coverlet.Collector.Utilities.CoverletDataCollectorException: CoverletCoverageDataCollector: Failed to instrument modules
#15 9.178  ---> System.IO.IOException: The process cannot access the file '/app/build/XXXXX.Database.SqlServer.Migrations.pdb' because it is being used by another process.
#15 9.178    at Microsoft.Win32.SafeHandles.SafeFileHandle.Init(String path, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize)
#15 9.178    at Microsoft.Win32.SafeHandles.SafeFileHandle.Open(String fullPath, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize)
#15 9.178    at System.IO.FileSystem.CopyFile(String sourceFullPath, String destFullPath, Boolean overwrite)
#15 9.178    at System.IO.File.Copy(String sourceFileName, String destFileName, Boolean overwrite)
#15 9.178    at Coverlet.Core.Helpers.FileSystem.Copy(String sourceFileName, String destFileName, Boolean overwrite) in /_/src/coverlet.core/Helpers/FileSystem.cs:line 35
#15 9.178    at Coverlet.Core.Helpers.InstrumentationHelper.BackupOriginalModule(String module, String identifier) in /_/src/coverlet.core/Helpers/InstrumentationHelper.cs:line 250
#15 9.178    at Coverlet.Core.Coverage.PrepareModules() in /_/src/coverlet.core/Coverage.cs:line 130
#15 9.178    at Coverlet.Collector.DataCollection.CoverageWrapper.PrepareModules(Coverage coverage) in /_/src/coverlet.collector/DataCollection/CoverageWrapper.cs:line 71
#15 9.178    at Coverlet.Collector.DataCollection.CoverageManager.InstrumentModules() in /_/src/coverlet.collector/DataCollection/CoverageManager.cs:line 66
#15 9.178    --- End of inner exception stack trace ---
#15 9.178    at Coverlet.Collector.DataCollection.CoverageManager.InstrumentModules() in /_/src/coverlet.collector/DataCollection/CoverageManager.cs:line 70
#15 9.178    at Coverlet.Collector.DataCollection.CoverletCoverageCollector.OnSessionStart(Object sender, SessionStartEventArgs sessionStartEventArgs) in /_/src/coverlet.collector/DataCollection/CoverletCoverageCollector.cs:line 143.

All the projects target .NET 6.

The versions we are using are:
mcr.microsoft.com/dotnet/sdk:6.0-bookworm-slim
MSTest.TestAdapter 3.4.0
MSTest.TestFramework 3.4.0
coverlet.collector 6.0.2

@Evangelink
Member

@REscobar would you mind creating a bug in the coverlet repo about the exception you mentioned? From the stack trace, I don't see anything that would be coming from our side.

@REscobar

Thanks for your response. I will create the issue and mention this one in it. But as I said in my comment, I'm also experiencing the exact same issues reported above. The coverlet one is included only because I happen to be using it; if I were to use additional instruments, those would fail in strange ways too.

Notice that the call stack mentions that the files are being held by another process. If some kind of race condition were happening, it could cause all of the above issues: for example, loading assembly files before they finish copying to the output directory would cause the loader to read incomplete metadata.

@Evangelink
Member

Can you confirm you are using VSTest and not the MSTest runner?

@REscobar

This is the full command that I'm using:

dotnet test XXXXXX.sln -c Release -o /app/build --collect:"XPlat Code Coverage" --logger:"junit;LogFilePath=../../artifacts/{assembly}-test-result.xml;MethodFormat=Class;FailureBodyFormat=Verbose"

It's run during a Docker build. It works fine if I run it in WSL, on Windows, and also during the Docker build when the available cores are set to 1 in the WSL config. It fails if all the cores are available, and it also fails when building an image as part of GitLab CI.

@REscobar

REscobar commented Jul 19, 2024

My issue seems to be caused by coverlet. I "solved" my case by limiting the CPU count during the test run with the -maxcpucount:1 switch (link to docs). My builds now seem to have a 100% success rate after adding the switch.
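As a rough sketch of that workaround: -maxcpucount is an MSBuild switch that dotnet test passes through, and a value of 1 should keep MSBuild from working on the solution's projects in parallel. The wrapper name run_tests_single_node is hypothetical, and where exactly the switch was placed in the original command is an assumption.

```shell
# Hypothetical wrapper: run "dotnet test" with MSBuild limited to one node.
# All arguments (solution path, configuration, collectors, loggers) are
# forwarded unchanged; only -maxcpucount:1 is appended.
run_tests_single_node() {
  dotnet test "$@" -maxcpucount:1
}
```

With the command quoted earlier in the thread, a call could look like run_tests_single_node XXXXXX.sln -c Release -o /app/build --collect:"XPlat Code Coverage". The trade-off is a slower run, since projects are no longer handled concurrently.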
