[BUG] Getting into a rare and unrecoverable state after 5.0.21 #2526
Comments
@hosadoya You can downgrade to 5.0.17.
5.0.21 / .NET 7. This also happens to us about twice a week, in an application where many concurrent tasks perform CRUD operations on the LiteDB database. Downgrading is not an option, because 5.0.21 fixes what was (for us) a common occurrence of "empty page must be defined as empty type" errors. Regretfully I cannot reproduce the issue: the cause seems random, although it appears more prevalent when a background task inserts data into an audit-log table. I have not seen the initial "This transaction are invalid state" error, but I am not 100% certain it did not occur. I do, however, see a lot of "pages in memory store must be non-shared" errors starting all of a sudden. When this happens the entire LiteDB connection becomes unusable: at first we get only "pages in memory store must be non-shared", and eventually nothing but "Maximum number of transactions reached" errors. Thankfully no data seems to be corrupted, and restarting the app fixes the issue. Since there seems to be no solution yet, I am thinking about detecting this particular failure and destroying and recreating the LiteDB connection as a temporary workaround, but I would urge the LiteDB devs to take this one seriously. Is there any suggestion other than downgrading?
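The detect-and-recreate workaround mentioned above could be sketched roughly as follows. This is only a minimal sketch under stated assumptions: `RecreatingHandle<T>` is a hypothetical helper, not part of LiteDB, and matching on message text is a guess based on the errors quoted in this thread. It uses only the .NET base class library, so the wrapped type can be any `IDisposable`.

```csharp
using System;

// Hypothetical helper (not part of LiteDB): owns one disposable resource and
// replaces it when an operation fails with a known "poisoned state" message.
public sealed class RecreatingHandle<T> where T : class, IDisposable
{
    // Message fragments quoted in this thread. Matching on text is an assumption.
    private static readonly string[] FatalFragments =
    {
        "pages in memory store must be non-shared",
        "Maximum number of transactions reached",
    };

    private readonly Func<T> _factory;
    private readonly object _gate = new object();
    private T _current;

    public RecreatingHandle(Func<T> factory)
    {
        _factory = factory;
        _current = factory();
    }

    public TResult Run<TResult>(Func<T, TResult> operation)
    {
        T instance;
        lock (_gate) { instance = _current; }

        try
        {
            return operation(instance);
        }
        catch (Exception ex) when (IsFatal(ex))
        {
            lock (_gate)
            {
                // Only the first caller that notices the failure swaps the instance.
                if (ReferenceEquals(_current, instance))
                {
                    try { instance.Dispose(); } catch { /* instance is already broken */ }
                    _current = _factory();
                }
            }
            throw; // the caller decides whether to retry
        }
    }

    private static bool IsFatal(Exception ex)
    {
        foreach (var fragment in FatalFragments)
        {
            if (ex.Message.Contains(fragment, StringComparison.OrdinalIgnoreCase))
                return true;
        }
        return false;
    }
}
```

With LiteDB this might be created as `new RecreatingHandle<LiteDatabase>(() => new LiteDatabase("Filename=app.db"))`, where the file name is a placeholder. Note that disposing the old instance can still break operations that other tasks have in flight on it, so this only limits the damage and does not fix the underlying problem.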
I had this issue in my application as well. Before the error started, I had changed the way my worker threads obtain the Database instance, and in some places there was still a 'using' pattern around that instance. The "This transaction are invalid state" exception did not point me in that direction, but after some debugging and removing some worker threads I suddenly got the more informative 'Object disposed' exception. After fixing the issue, by no longer disposing my shared database instance, the original "This transaction are invalid state" exception did not return. Have a look at your app and check whether anything disposes a shared database instance in one of your worker threads.
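To make the pitfall described above concrete, here is a small illustration. `FakeDatabase` is a stand-in written for this example and is not the LiteDB API. It only mimics the one behaviour that matters here: failing once it has been disposed.

```csharp
using System;

// Stand-in for a shared database object. This is NOT the LiteDB API.
public sealed class FakeDatabase : IDisposable
{
    private bool _disposed;

    public void Insert(string row)
    {
        if (_disposed) throw new ObjectDisposedException(nameof(FakeDatabase));
        // A real database would write the row here.
    }

    public void Dispose() => _disposed = true;
}

public static class Workers
{
    // One instance shared by every worker for the life of the process.
    public static readonly FakeDatabase Shared = new FakeDatabase();

    public static void LeakyWorker()
    {
        // BUG: 'using' disposes the shared instance when this worker finishes,
        // so every other worker that still holds it starts failing.
        using (var db = Shared)
        {
            db.Insert("audit row");
        }
    }

    public static void SafeWorker()
    {
        // Borrow the instance. Only the application's shutdown path disposes it.
        Shared.Insert("audit row");
    }
}
```

After a single call to `LeakyWorker`, every later `SafeWorker` call throws `ObjectDisposedException`. A real shared instance may fail with a less obvious error, which would match the experience described in the comment above.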
Thanks for your two cents. Regretfully I am already doing that (sharing a single instance). I have not figured out the exact cause, but something is off. In the meantime I have been on a hunting spree and have settled on materializing all loops and never reusing a GetCollection result after an await in async code, which appears to have been a "solution" for us (the error happens less frequently). Regretfully there still seems to be a fundamental issue in LiteDB 5 when using tasks. The read-lock mechanism uses the ManagedThreadId as its key, which in the async/await world can be the same for multiple Task contexts, and this is where things sometimes go bad. It is extremely hard to solve on our side, because our app consists of a bunch of API controllers and relies heavily on async/await. I tried to fix this in a fork of LiteDB but gave up. I got the problem fixed by also taking the task context id into account and implementing an alternative to ReaderWriterLockSlim based on semaphores, with almost all tests passing, but tests seemed to fail at random when executed in parallel, so I must have missed something. I am not familiar enough with LiteDB's code base, and time constraints led me to give up for now. Note: there is even a disabled test case in LiteDB's code base referencing Parallel.ForEach, which also seems to be a root cause of these "random" occurrences where the entire LiteDB instance goes brrr. For now I have been forced to do a lame try/catch that looks for specific errors and destroys and recreates the LiteDB instance, which at least mitigates the issue a bit for us.
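The point about managed thread ids and tasks can be observed without LiteDB at all. The sketch below uses only the .NET base class library, and `ThreadIdDemo` is a name made up for this example. It starts several async operations and records the managed thread id each one sees before and after an await. It says nothing about LiteDB's internals. It only shows why a thread id is a shaky key for per-task state.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class ThreadIdDemo
{
    // Starts `taskCount` async operations and records which managed thread
    // each one was running on before and after an await.
    public static async Task<(int Before, int After)[]> ObserveAsync(int taskCount)
    {
        var tasks = Enumerable.Range(0, taskCount).Select(async _ =>
        {
            // Runs synchronously on the caller's thread up to the first await,
            // so all operations see the same id here.
            int before = Environment.CurrentManagedThreadId;

            // The continuation may resume on any thread-pool thread.
            await Task.Delay(10).ConfigureAwait(false);

            int after = Environment.CurrentManagedThreadId;
            return (Before: before, After: after);
        });

        return await Task.WhenAll(tasks);
    }

    public static async Task Main()
    {
        var samples = await ObserveAsync(20);
        int distinctIds = samples.SelectMany(s => new[] { s.Before, s.After }).Distinct().Count();
        int hopped = samples.Count(s => s.Before != s.After);
        Console.WriteLine($"20 operations saw {distinctIds} distinct thread ids; {hopped} resumed on a different thread.");
    }
}
```

The exact numbers vary from run to run. What stays constant is that many logical operations share one thread id before the await, and an operation is not guaranteed to keep its id across the await. Any state keyed by thread id, such as the lock bookkeeping described above, can therefore be attributed to the wrong operation.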
Version
Which LiteDB version/OS/.NET framework version are you using. (REQUIRED)
5.0.21/Windows/.net 6
Describe the bug
A clear and concise description of what the bug is.
After updating the NuGet package to version 5.0.21, we randomly get into an unrecoverable state. Here are some of the logs, in sequence:
The very first and only unique error:
After this getting 94 of those errors from different threads:
After that getting thousands of those errors:
This already happened 3 times. The only way to recover was to restart the process.
Code to Reproduce
Write a small snippet to isolate your bug and could be possible to our team test. (REQUIRED)
This seems to be related to the recent OnDispose change, as that is where the issue starts. There is no repro, because this is a very random error that happens a few times per week. All we have are the detailed logs shown above.
Expected behavior
A clear and concise description of what you expected to happen.
No error/corruption.
Screenshots/Stacktrace
If applicable, add screenshots/stacktrace
NA
Additional context
Add any other context about the problem here.
The app runs many (at most 20) concurrent tasks in parallel, which perform CRUD operations on the LiteDB database in varying ways.