libflux: add flux_send_new() #5499
Conversation
Problem: some older rpc and response handling code does not conform to modern project norms. Make the following changes: break long compound conditionals to one per line; break long parameter lists to one per line.
I corrected a problem where I was updating a message reference count outside of the lock when it should have been inside. Dropping the WIP since I think any work in the broker is going to need to be its own PR.
Looks like caliper is always enabled in CI. When caliper is enabled, send_new() would always fall back to send(). To address this I just moved the call to profiling_msg_snapshot() to before the send.
Problem: it's a common pattern to create a message, send it with flux_send(), then destroy it; however, there is no interface to allow flux_send() to just take over the message and thereby avoid a costly message copy if/when possible. Add flux_send_new(), which accepts a non-const flux_msg_t **. The reference count must be 1, and on success *msg is set to NULL, so there should be no possibility of the message being reused. This just calls flux_send() internally if the connector does not implement op->send_new(), or if other optional things in the environment make calling op->send_new() unworkable (such as RPC tracking, which wants to temporarily hold a copy of each request until a response is received). Move the call to profiling_msg_snapshot() to before the send rather than after so send_new() doesn't have to fall back to send() when flux is built with caliper profiling.
Problem: several well-traveled code paths in libflux call flux_send() then immediately destroy the message, thereby not communicating that the message need not be copied before modification. Call flux_send_new() in rpc, response, and logging code.
Oh, I was wondering why the function always fell back to flux_send().
The call to profiling_msg_snapshot() now happens before the send, so send_new() no longer has to fall back to send() when caliper profiling is enabled.
Codecov Report

```diff
@@            Coverage Diff             @@
##           master    #5499      +/-   ##
==========================================
- Coverage   83.69%   83.66%   -0.03%
==========================================
  Files         484      484
  Lines       81561    81537      -24
==========================================
- Hits        68261    68218      -43
- Misses      13300    13319      +19
```
Ah, so the connector implementation of send_new() …
No, the …
Ok, thanks! That makes sense now.
LGTM!
Problem: the interthread connector plugin does not implement the send_new() operation, which can reduce message copying. Add a send_new() operation.
Problem: there is no unit test coverage for flux_send_new(). Add some tests to the handle unit test.
Problem: the broker module internally defines a message credential that is no longer used, now that the interthread connector does that. Drop it.
Problem: there is no man page for flux_send_new(). Add it to flux_send.rst and configure a stub to be produced for flux_send_new(). The previous synopsis function prototypes were not within a literal block and had to be escaped. Fix that while adding the new function.
Thanks! I'll set MWP.
Got a couple of test failures in the el8,coverage builder - likely not related to this PR? … and nothing else really to go on in the test logs. I'll retry.
Problem: it's a common pattern to create a message, send it with flux_send(), then destroy it; however, there is no interface to allow the message ownership to be transferred rather than copied. Add flux_send_new(), which accepts a non-const flux_msg_t **, and implement it in the interthread connector. Now messages can pass from one end to the other without costly duplication. Use this function in the common flux_rpc() and flux_respond() family of functions.

This is marked WIP for now while I consider whether it's possible to use this in the broker along the message path. It would be neat if rpcs between broker modules could transit the broker without copying!
Even just this change does seem to have a positive impact on job throughput.