
Deadlock in web sync agent #1282

Open
radulacatus opened this issue Dec 5, 2024 · 5 comments

@radulacatus

Setup:

  • DownloadOnly using web orchestrator
  • Server using SqlServer database:
    • Dotmim.Sync.Core 1.0.2
    • Dotmim.Sync.SqlServer.ChangeTracking 1.0.2
    • Dotmim.Sync.Web.Server 1.0.2
  • Client using SqlServer database:
    • Dotmim.Sync.Core 1.0.2
    • Dotmim.Sync.Sqlite 1.0.2
    • Dotmim.Sync.SqlServer.ChangeTracking 1.0.2
    • Dotmim.Sync.Web.Client 1.0.0

Our client runs in an on-premises network and we need to be resilient to internet downtime.
Calls to SynchronizeAsync are executed sequentially per scope inside a scheduled message handler.
The message handler has a timeout and cancels the cancellation token source if the sync call takes too long.

var progress = new SynchronousProgress<ProgressArgs>(args =>
    logger.LogDebug($"{args.ProgressPercentage:p}:  \t[{args.Source[..Math.Min(4, args.Source.Length)]}] {args.TypeName}: {args.Message}"));
var res = await syncAgent.SynchronizeAsync(scopeName, null, syncType, null, cancellationToken, progress);
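
Roughly, the surrounding handler logic looks like this (timeout value, names and logging are simplified for illustration, not our exact code):

// Simplified sketch of the scheduled handler around the call above.
// The token is cancelled if SynchronizeAsync does not finish within the timeout.
using var cts = new CancellationTokenSource(TimeSpan.FromMinutes(30)); // illustrative timeout
try
{
    var res = await syncAgent.SynchronizeAsync(scopeName, null, syncType, null, cts.Token, progress);
    logger.LogInformation("Sync of scope {Scope} finished: {Result}", scopeName, res);
}
catch (OperationCanceledException)
{
    // Expected when the timeout fires; the next scheduled run retries with a backoff.
    logger.LogWarning("Sync of scope {Scope} was cancelled after the timeout", scopeName);
}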

Most of the time everything works as expected, including cancelling the sync and retrying with a backoff.
Sometimes the progress gets stuck at 0%, cancelling the token doesn't interrupt anything, and the sync can only be unlocked by restarting the process.
This looks like a deadlock in the SyncAgent when using WebRemoteOrchestrator.
The same process does not hang when we use a LocalOrchestrator, but that is no longer an option for us.
We could not reproduce this issue in a POC, so it may also be related to the larger volume of data in our production environment.

We noticed that SynchronousProgress uses the SynchronizationContext.
Could it help if we stop using it?
Do you have any other suggestions?

@Mimetis
Owner

Mimetis commented Dec 5, 2024

Don't use SynchronousProgress if it's not a console application.
Prefer using your own IProgress object.
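
Something like this, for example (a rough, untested sketch):

// Rough sketch: a custom IProgress<ProgressArgs> that logs directly on the calling thread
// and does not capture a SynchronizationContext.
public class LoggingSyncProgress : IProgress<ProgressArgs>
{
    private readonly ILogger logger;

    public LoggingSyncProgress(ILogger logger) => this.logger = logger;

    public void Report(ProgressArgs args) =>
        logger.LogDebug("{Percentage:p}: [{Source}] {TypeName}: {Message}",
            args.ProgressPercentage, args.Source, args.TypeName, args.Message);
}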

@radulacatus
Author

We tried it out and used our own implementation of IProgress, both server-side and client-side.
The issue still reproduces.

@Mimetis
Owner

Mimetis commented Dec 18, 2024

Okay, can you share a simple application reproducing the error, for example in a GitHub repository?
I see that this error does not happen every time, so I guess it can be difficult to reproduce.

The idea is to have something I can work on.

See an example where a GitHub repo has been shared with a SQL script to create the environment: #1286 (comment)

@radulacatus
Author

I created a repo HERE.
Unfortunately, with a POC like this I was not able to reproduce the issue.

@BogdanGeorge

Another thing that may be worth mentioning: we create a singleton SyncAgent when we initialize our hosting service.
Over the entire lifecycle of the service we run multiple synchronizations, and all of them use the same SyncAgent with the same WebRemoteOrchestrator and the same HttpClient used by the web orchestrator, since there is no way to provide an HttpClientFactory. A simplified sketch of this setup is below.
Are the SyncAgent and WebRemoteOrchestrator meant to be used as singletons and reused over multiple synchronizations? Could this be the reason leading to the deadlock?
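
Roughly, the setup looks like this (connection string, service URI and DI registration are simplified placeholders, not our exact code):

// Simplified sketch of the singleton setup: one agent created at startup and reused for every sync.
var remoteOrchestrator = new WebRemoteOrchestrator("https://our-server.example/api/sync");
var clientProvider = new SqlSyncChangeTrackingProvider(clientConnectionString); // client-side SQL Server database
var syncAgent = new SyncAgent(clientProvider, remoteOrchestrator);              // created once at startup...
services.AddSingleton(syncAgent);                                               // ...and resolved by every scheduled sync,
                                                                                 // so the orchestrator's single HttpClient is reused too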
