parallel-kalman-jax cpu associative scan is very slow #9

murphyk · 2022-03-22T03:05:23Z

You say:

"It is noteworthy that the parallel version will appear to be much slower due to a slow compilation in JAX. This could be improved by using a different implementation of the associative scan or by fixing the number of levels the way it is done in TensorFlow Probability."

What do you mean by 'fixing the number of levels'?

AdrienCorenflos · 2022-04-05T08:49:44Z

TBH there is not much that can be done to improve the CPU speed, since a Blelloch scan requires roughly 3 times the serial work of a simple scan. It's also worth noting that the parallel KF/KS would be slower even without this overhead, because it requires "inverting" matrices the size of the latent space, which is (often) bigger than the observation space.
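To make the comparison concrete, here is a minimal sketch (not the repository's Kalman code) that runs the same scalar affine recursion x_t = a_t * x_{t-1} + b_t once with `jax.lax.scan` and once with `jax.lax.associative_scan`. The names `combine`, `sequential` and `parallel` and the toy data are made up for illustration. Both versions should agree numerically; on CPU the associative version simply does more total work.

```python
import jax
import jax.numpy as jnp


def combine(earlier, later):
    # Compose two affine maps x -> a*x + b, applying `earlier` first.
    a1, b1 = earlier
    a2, b2 = later
    return a2 * a1, a2 * b1 + b2


@jax.jit
def sequential(a, b, x0):
    # Plain serial recursion: one multiply-add per time step.
    def step(x, ab):
        a_t, b_t = ab
        x_new = a_t * x + b_t
        return x_new, x_new

    _, xs = jax.lax.scan(step, x0, (a, b))
    return xs


@jax.jit
def parallel(a, b, x0):
    # Prefix-compose the affine maps, then apply each prefix to x0.
    A, B = jax.lax.associative_scan(combine, (a, b))
    return A * x0 + B


# Toy data (illustrative assumption, not from the issue).
key_a, key_b = jax.random.split(jax.random.PRNGKey(0))
a = jax.random.uniform(key_a, (64,), minval=0.5, maxval=1.0)
b = jax.random.normal(key_b, (64,))
x0 = jnp.array(1.0, dtype=jnp.float32)

xs_seq = sequential(a, b, x0)
xs_par = parallel(a, b, x0)
print(jnp.allclose(xs_seq, xs_par, rtol=1e-4, atol=1e-4))
```

Timing the two functions (after a warm-up call, with `block_until_ready()`) on CPU versus GPU is one way to see where the extra work of the associative version is paid back by parallelism.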

Looking back, I think the comparison to TF was a small mistake on my end: the reason TF has such a utility is to get faster compilation in the case of varying-length arrays, which is impossible in JAX.
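As a rough illustration of the shape point (a sketch under the assumption that the relevant constraint is `jax.jit` specialising on static input shapes), the snippet below counts how often a jitted function is traced. The counter `trace_count` and the function `cumsum_scan` are hypothetical names for this example only.

```python
import jax
import jax.numpy as jnp

trace_count = 0


@jax.jit
def cumsum_scan(x):
    # This Python side effect runs only while JAX traces the function,
    # i.e. once per distinct input shape/dtype, not on every call.
    global trace_count
    trace_count += 1
    return jax.lax.associative_scan(jnp.add, x)


cumsum_scan(jnp.ones(8))   # first call: trace and compile for length 8
cumsum_scan(jnp.ones(8))   # same shape: the cached executable is reused
cumsum_scan(jnp.ones(16))  # new length: traced and compiled again
print(trace_count)
```

Each new array length triggers a fresh trace and compilation, so the slow compilation of the associative scan is paid again for every length encountered.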

A "real solution" would, however, be to lower the associative_scan operation to XLA directly (as is done for other control-flow operations such as scan and while_loop) so as to bypass most of the compilation work. This would take a lot of human effort though.
