-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix deadlocks and improve no-timeout get() calls. #17
Conversation
@@ -574,6 +582,10 @@ def get(self, count=None, timeout=DEFAULT_TIMEOUT, as_string=False, **kw): | |||
if tmo > 0 and DEBUG != 0: | |||
logprint("got %s\n" % self.value.__str__()) | |||
|
|||
if "value" not in self.data: | |||
# If the timeout was None, there's a chance that the value hasn't arrived yet. | |||
return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Conceptual question: what happens if we have self.data and self.value from a previous get? Will this block of code be bypassed and will the function return a stale value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Granted, this isn't a reason not to do this- but if it's true, it should be documented
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Also, I think this is how pyca behaves right now if you follow up a successful get with a new timeout=None get, so it's not something in-scope of this PR to modify)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, you are right - if you do Pv.get(timeout=None)
after Pv.data
is already populated (via a previous get()
, or due to monitor callbacks), you'll get the most recently seen value, which might be quite old. This isn't new behavior to the PR.
I think this is actually the expected outcome with timeout=None
. The only element of surprise is that there's now a chance that you might get None
instead of a real value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks good functionally- I added some minor whitespace-fixing suggestions + and documentation food for thought. I'm also going to ping @mcb64 because he has more experience with this code than I do as a sanity check.
Also: did you re-run the unit tests? I think we had some on this repo, though our continuous integration chain is definitely busted right now on this one. |
Co-authored-by: Zachary Lentz <ZLLentz@users.noreply.github.com>
Co-authored-by: Zachary Lentz <ZLLentz@users.noreply.github.com>
@ZLLentz I did run the unit tests, they all pass. |
Yeah, I think this looks good. CA itself has no knowledge of the GIL, so it should be safe to release it here (and restore the state after the CA call). pyca_monitor_handler tries to the GIL before touching python, which is why you are getting a deadlock here. (The danger would be if you release the GIL here and then the code proceeds to call python functions under the assumption that you already have the GIL. This is not the case though.) |
Excellent, let's merge it then! I'll also put out a new tag with this PR included. |
This PR fixes two totally separate issues (sorry).
capv.unsubscribe_channel()
andcapv.clear_channel()
could cause deadlocks if called while a monitor callback is in use, due to interactions with the GIL from callback threads (see https://epics.anl.gov/base/R7-0/4-docs/CAref.html#ca_clear_channel). I fixed this by releasing the GIL before callingca_clear_channel()
orca_clear_subscription()
. I believe this is reasonable, and it works in my testing, but I have to admit that I'm not 100% sure this is always a safe thing to do.Pv.get(timeout=None)
, you were almost guaranteed to crash because the value usually won't be available by the time the method returns. I added a check for that, and the method now returnsNone
if the value isn't ready yet. This makesPv.get(timeout=None)
much more useful if you want to get very large numbers of PVs at once: You can call it in a loop, expect to get back none, then collect all the values in a second pass.This is my totally unofficial contribution to the EPICS 2022 Codeathon :)