p2p sync is essentially a tree of processing tasks called stages, operating concurrently and connected via SPSC channels of some capacity. Adding metrics and tracing to stages will greatly simplify debugging and identifying bottlenecks.
What will (presumably) occur is that some stages will be the slow point, causing their input channels to fill up and block the stages upstream. Knowing which stages are slow will show where to add parallelisation and/or increase the channel capacity.
My take on this
Disclaimer - this is just my opinion without having attempted this, you might come to a different conclusion.
We are interested in at least three pieces of information for each stage:
1. Which stage is this?
2. Processing time
3. Channel fullness
(1) - Use a `&'static str` to identify each stage. We can add this to the `Stage` trait, but this fails to uniquely ID a stage if the same stage type is used more than once. One alternative is to add it as an additional input parameter to the `pipe` function. This works, though it would also be nice to include some tree-like ID that would allow a system diagram UI to be drawn - but this is completely unnecessary, just nice for explaining visually what's going on. One could manually assign these IDs within stage names, but it should also be possible to do this at compile time by adding functionality to the channel type (to pass on this type info somehow). But this is overkill.
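A minimal sketch of the two options in (1), combined. Everything here is hypothetical: the `Stage` trait below is a stand-in (the real trait's shape may differ), and `Double` and `stage_label` are names invented for illustration. It only shows how a trait-level name plus a per-instance ID passed at the `pipe` call site could disambiguate duplicate stages.

```rust
/// Hypothetical stand-in for the sync framework's `Stage` trait.
trait Stage {
    /// Name shared by every instance of this stage type.
    const NAME: &'static str;
    type Input;
    type Output;
    fn map(&mut self, input: Self::Input) -> Self::Output;
}

/// Example stage, invented for this sketch.
struct Double;

impl Stage for Double {
    const NAME: &'static str = "double";
    type Input = u64;
    type Output = u64;

    fn map(&mut self, input: u64) -> u64 {
        input * 2
    }
}

/// Builds a label from the trait-level name and an instance ID that the
/// caller of a `pipe`-like function would supply. A path-like ID such as
/// "headers/0" doubles as a manually assigned tree position.
fn stage_label<S: Stage>(instance_id: &'static str) -> String {
    format!("{}[{}]", S::NAME, instance_id)
}
```

With this shape, two `Double` stages in different branches get distinct labels, e.g. `stage_label::<Double>("headers/0")` and `stage_label::<Double>("events/0")`.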
(2) - This can just be a simple timer inside the `pipe` function which measures the execution time of the `Stage` in each iteration. The only issue is that some stages occur after `try_buffer` calls, which means they execute over a vector of items, making their processing times incomparable with those of per-item stages. It would be possible to account for this by creating a `BufferedReceiver` type, but now we're adding more "duplicate" types just so we can log a bit better. I would hesitate to do this until the sync framework has proven mature. We might need to add many such types. Or none at all. Or maybe it's trivial to do this with a wrapper type and deref.
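One way the timer in (2) could look, as a rough sketch using only `std::time`. `Sample`, `timed`, and `per_item` are hypothetical names, not part of the sync framework. Recording the item count alongside the elapsed time is one possible way to make buffered stages comparable with per-item ones without introducing a new receiver type.

```rust
use std::time::{Duration, Instant};

/// One timing sample for a single stage iteration (hypothetical type).
#[derive(Debug, Clone, Copy)]
struct Sample {
    stage: &'static str,
    items: usize,
    elapsed: Duration,
}

impl Sample {
    /// Average time per item, or `None` for an empty batch (or a batch
    /// too large to fit the `u32` divisor that `Duration` supports).
    fn per_item(&self) -> Option<Duration> {
        u32::try_from(self.items)
            .ok()
            .filter(|n| *n > 0)
            .map(|n| self.elapsed / n)
    }
}

/// Runs `f` once over `batch` and records how long the call took.
/// A per-item stage would pass a batch of length one.
fn timed<I, O>(
    stage: &'static str,
    batch: Vec<I>,
    f: impl FnOnce(Vec<I>) -> O,
) -> (O, Sample) {
    let items = batch.len();
    let start = Instant::now();
    let out = f(batch);
    let elapsed = start.elapsed();
    (out, Sample { stage, items, elapsed })
}
```

A call such as `timed("sum", vec![1u64, 2, 3], |v| v.iter().sum::<u64>())` returns the stage output together with a `Sample` that could then be emitted as a tracing event or metric.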
(3) - A channel's "fullness" can be determined using the `capacity` and `max_capacity` methods.
I'm unsure about the trace level - probably `debug`? You might also want to select certain stages.
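A small sketch of the fullness calculation in (3). It assumes the channels are tokio bounded mpsc channels, where `Sender::capacity` returns the number of currently free slots and `Sender::max_capacity` returns the total buffer size; if the channel type differs, the inputs would need adapting. The function name `fullness` is hypothetical.

```rust
/// Fraction of the channel buffer currently in use, in the range 0.0..=1.0.
///
/// `available` is the number of free slots (assumed to come from
/// `Sender::capacity`), `max` is the total buffer size (assumed to come
/// from `Sender::max_capacity`).
fn fullness(available: usize, max: usize) -> f64 {
    if max == 0 {
        // Avoid dividing by zero for a degenerate channel.
        return 0.0;
    }
    let used = max.saturating_sub(available);
    used as f64 / max as f64
}
```

Under that assumption, a stage could report `fullness(tx.capacity(), tx.max_capacity())` for its output sender `tx`. Note that slots reserved but not yet filled also count as used in this calculation.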
We should also create a template to display these stats, probably on three line charts (one per piece of information).