The `OnlineAverager` model for streaming server-side online averaging has been accelerated by using all tensor operations, but this comes at the expense of the first few predictions being incorrect. This should be fixed either by:

1. Implementing an `update_idx` scalar state that keeps track of the number of updates that have been performed, and scaling the normalization during averaging accordingly (see the sketch after this list), or
2. Documenting this drawback in all relevant places (including in `EnsembleModel.add_streaming_output`) so users are aware and can rescale outputs themselves.
In general, these predictions will be thrown away anyway, since in the context of streaming input they represent predictions on the past, but it's still worth making users aware of it.
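
Below is a minimal sketch of the first option. It is illustrative only, not the actual `OnlineAverager` code: the class name, the 1D snapshot layout, and the assumption that each incoming prediction spans `update_size * num_updates` samples are all simplifications made here for clarity.

```python
import torch


class WarmupCorrectedAverager(torch.nn.Module):
    """Minimal sketch, not the actual OnlineAverager implementation.

    Assumes each incoming prediction has length
    update_size * num_updates and is aligned so that its first
    update_size samples cover the segment about to be emitted.
    """

    def __init__(self, update_size: int, num_updates: int):
        super().__init__()
        self.update_size = update_size
        self.num_updates = num_updates

        # running (unnormalized) sum of overlapping predictions
        self.register_buffer(
            "snapshot", torch.zeros(update_size * num_updates)
        )
        # scalar state counting how many updates have been performed
        self.register_buffer(
            "update_idx", torch.zeros((), dtype=torch.long)
        )

    def forward(self, prediction: torch.Tensor) -> torch.Tensor:
        # accumulate the new prediction into the running sum
        self.snapshot = self.snapshot + prediction

        # during warm-up, the oldest segment has only been covered by
        # update_idx + 1 predictions rather than num_updates, so
        # normalize by the true count instead of the steady-state one
        self.update_idx = self.update_idx + 1
        weight = self.update_idx.clamp(max=self.num_updates)
        output = self.snapshot[: self.update_size] / weight

        # shift the running sum left by one update and zero the tail
        self.snapshot = torch.roll(self.snapshot, -self.update_size)
        self.snapshot[-self.update_size :] = 0
        return output
```

With the `update_idx` state in place, the first `num_updates - 1` outputs get divided by the number of predictions that actually covered them, rather than being scaled down by overlaps that haven't happened yet.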
The `OnlineAverager` model itself also suffers from some lack of generality that might be worth addressing. In particular, `num_channels` and `batch_size` could both be inferred at inference time with minor adjustments to the code, at the expense of some performance. Since this implementation is meant just to be used for Triton streaming output, performance is more important than generality here. But it could be worth implementing something like this in the `ml4gw` library that fixes these problems.
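
As a rough sketch of what that more general version might look like (again with hypothetical names, and omitting the warm-up correction above for brevity), the batch and channel dimensions can simply be read off the input at call time, with state allocated lazily to match:

```python
import torch


class ShapeAgnosticAverager(torch.nn.Module):
    """Hypothetical sketch: infers batch_size and num_channels from
    the input at inference time instead of fixing them at construction.
    """

    def __init__(self, update_size: int, num_updates: int):
        super().__init__()
        self.update_size = update_size
        self.num_updates = num_updates
        self.snapshot = None  # lazily allocated on first call

    def forward(self, prediction: torch.Tensor) -> torch.Tensor:
        # prediction shape: (batch_size, num_channels, snapshot_size),
        # where snapshot_size == update_size * num_updates
        if self.snapshot is None or self.snapshot.shape != prediction.shape:
            # infer the state shape from the input; this check and the
            # potential reallocation are the performance cost relative
            # to fixing the shapes up front
            self.snapshot = torch.zeros_like(prediction)

        self.snapshot = self.snapshot + prediction
        output = self.snapshot[..., : self.update_size] / self.num_updates

        # shift state left by one update and zero the vacated tail
        self.snapshot = torch.roll(self.snapshot, -self.update_size, dims=-1)
        self.snapshot[..., -self.update_size :] = 0
        return output
```

The trade-off shows up in the `if` branch: a fixed-shape implementation can register its snapshot as a buffer once and skip the per-call shape check entirely, which matters for a model exported to run inside a Triton ensemble.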