-
Notifications
You must be signed in to change notification settings - Fork 747
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Poller.remove socket disposed fix #835
base: master
Are you sure you want to change the base?
Conversation
I think this is a breaking change, as you are changing a return type. |
I didn't realize Run was public. BUT, while technically breaking change from the standpoint of the public API changed, any code written to a void return type will be unaffected, since that code isn't expecting anything, there won't be a conflict in the return type. This is a much better option than leaving a thread race bug in, IMO. It is also much better practice to return Task from async methods to avoid this very scenario in the first place, and the caller ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks very much for working to improve this. Asynchrony is hard to get right, especially when retrofitting it to a synchronous API such as NetMQ's.
I am very concerned about the blocking waits here. These will definitely lead to deadlocks in user code. I feel like a better approach would be to introduce new asynchronous overloads, and potentially obsolete the current synchronous ones.
I think you might find this article interesting as it covers several things I've highlighted in review comments.
any code written to a void return type will be unaffected
It's worth distinguishing between source compatibility (which you're referring to) and binary compatibility (which a lot of people expect). An example of where this code change could break someone's application at run time is in a dependency graph where one package references NetMQ version N, and another references NetMQ version N+1. Only one version can exist at run time, meaning one of those two packages would experience MissingMethodException
. The end user would be confused and unable to fix the problem without patching the library packages they're using, which is a hard place to be. When writing public library code like NetMQ you must think in terms of binary compatibility, not source compatibility.
It is also much better practice to return Task from async methods
Agree 100%. For backwards compatibility and correctness, I think we have to leave the current methods as-is and introduce better behaved overloads that work as expected. Also, the .NET convention is that asynchronous methods end in Async
, so changing the existing non-Async
methods to return Task
is not idiomatic and may confuse people. NetMQPoller.RunAsync
violates this (which I think is my fault, sorry).
action(); | ||
|
||
t = FromResult<object>(null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than instantiate a new completed task for each invocation, cache an instance in a CompletedTask
field and reuse it. You can do that in an internal utility class if you like, maybe called Tasks
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This same pattern is also in existing overloaded method ContainsAsync
. Also, I'm not convinced it is a good idea to share TaskCompletionSource
instances among (potentially) multiple threads...?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A completed task is safe to share. Use Task.CompletedTask
for TFMs that support it, and cache your own (from a TaskCompletionSource<>
) for remaining TFMs.
The benefit of doing so is that this code path will not perform any heap allocations.
I haven't forgotten about this PR. Been busy with holidays and getting back to work. I will try to address the concerns raised here this week and issue an update. |
OK, so I pushed up several new commits to (hopefully) address the breaking API concerns, while still fixing the sync issues by adding |
/// <summary> | ||
/// Run an action on the Poller thread | ||
/// </summary> | ||
/// <param name="action">The action to run</param> | ||
[Obsolete("Queues the action on the poller's thread, but provides no sync mechanism. Please use RemoveAsync() to remove sockets or timers")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This message needs reviewing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure we should force people to use the new async APIs. What do you think @somdoron?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't force anyone to use the new API's, it just forces them to opt-out (with compiler ignores for release build).
It is strongly suggesting to use the new API's, but that is much better than the status-quo, IMHO: can't dispose sockets at all, just have to rely on GC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a sticking point? It can easily be removed, but is not my preference... But need clear direction from NetMq team as to the final yay or nay.
What is the preferred method to communicate this issue and change to application coders?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@drewnoakes So, I should interpret "I'm not sure..." as "I'm not approving this until it is removed" ?
Is moving this text to the summary block acceptable, or is this type of information put in the documentation web site only?
/// <summary> | ||
/// Run an action on the Poller thread | ||
/// </summary> | ||
/// <param name="action">The action to run</param> | ||
[Obsolete("Queues the action on the poller's thread, but provides no sync mechanism. Please use RemoveAsync() to remove sockets or timers")] | ||
public void Run([NotNull] Action action) | ||
{ | ||
if (!IsRunning || CanExecuteTaskInline) | ||
action(); | ||
else | ||
new Task(action).Start(this); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should use Task.Factory.StartNew
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is identical to the code in the original Run(Action)
method. See here.
Using Task.Factory.StartNew
requires several additional parameters in addition to specifying the scheduler:
Certainly possible, but I'm uncertain what the TaskCreationOptions
should be here, nor does the public API supply a CancellationToken
.
Adding a cancellationToken
parameter would break the existing API, and defining one internally requires creating it even though it won't be used to stop the task, unless an arbitrary timeout is assigned.
**updated to reference the original/master branch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@drewnoakes I need further clarification here. This request seems to require creating an API change? The current Task(action).Start(this)
ensures that the task is run on the poller thread, so I don't see an advantage to using Task.Factory.StartNew
.
Is there a compatibility issue with the current impl? If so, how should we handle the extra parameters? Creating and passing a CancellationToken
that is never used seems hacky.
@@ -225,7 +263,7 @@ public void Add([NotNull] NetMQTimer timer) | |||
throw new ArgumentNullException(nameof(timer)); | |||
CheckDisposed(); | |||
|
|||
Run(() => m_timers.Add(timer)); | |||
RunAsync(() => m_timers.Add(timer)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For completeness, shouldn't we have AddAsync
methods that return tasks? These sync methods could call into them and discard the Task
.
Is there a reason you added RemoveAsync
but not AddAsync
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't run into any problems with the current add implementations...
BUT since you ask, it is probably possible to do
poller.Add(socket);
await poller.RemoveAsync(socket);
socket.Dispose();
and still have issues? Maybe? It is possibly a race condition?
For some reason, I didn't get any notifications about the comments. I'll look into them over the next few days. |
Codecov Report
@@ Coverage Diff @@
## master #835 +/- ##
==========================================
+ Coverage 67.18% 67.65% +0.47%
==========================================
Files 123 124 +1
Lines 8000 8030 +30
Branches 1311 1314 +3
==========================================
+ Hits 5375 5433 +58
+ Misses 2096 2075 -21
+ Partials 529 522 -7
|
OK, so does this PR have a chance, or are you guys ignoring it. If the latter, I'll just fork and move on with my life. |
Any update to this branch and issue? Issue is closed but branch not merged? Thx |
A bunch of the feedback I left was not addressed. Even still, there's a non-trivial amount of work in reviewing this change for correctness and pandemic life isn't giving me much free time. |
Thanks for the feedback Drew. Good to know. :) Stay save, |
Is the only remaining issue the |
@jasells for me at least, all unresolved comments are still issues for me. In each case my original comment stands. |
This issue has been automatically marked as stale because it has not had activity for 365 days. It will be closed if no further activity occurs within 56 days. Thank you for your contributions. |
This should fix #834 and a few other similar potential bugs in other versions of
Remove()
, andRemoveAndDispose()
See a few other comments on a couple of the tests and about throwing
ArgumentException
fromRemove()
when socket is disposed. Those should be left to a future PR, though.