Schur decomposition is implemented efficiently on GPU in PyTorch, JAX, and MATLAB. It is an essential ingredient for the efficient computation of the matrix square root (`sqrtm`).
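For reference, the standard Schur-based method (Björck–Hammarling / Higham) reduces `sqrtm` to a triangular problem: compute the Schur form, take the square root of the triangular factor by a stable recurrence, and transform back. Roughly:

```math
A = Q T Q^*, \qquad \sqrt{A} = Q U Q^*, \qquad U_{ii} = \sqrt{T_{ii}}, \quad U_{ij} = \frac{T_{ij} - \sum_{k=i+1}^{j-1} U_{ik} U_{kj}}{U_{ii} + U_{jj}}.
```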
It would be ideal to have this decomposition implemented directly in CUDA.jl. The only alternative I see at the moment is a full diagonalization (eigendecomposition), which is both less efficient and less numerically stable.
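As a stopgap, here is a minimal sketch of the diagonalization workaround for the Hermitian positive semi-definite case, assuming CUDA.jl's CUSOLVER wrapper `CUDA.CUSOLVER.syevd!` returns eigenvalues and eigenvectors as in current releases. It does not apply to non-normal matrices, which is exactly why a Schur-based path is needed:

```julia
using CUDA, LinearAlgebra

# Square root of a Hermitian positive semi-definite CuMatrix via the
# symmetric eigensolver (the diagonalization workaround described above).
function sqrtm_psd(A::CuMatrix{Float32})
    # syevd! overwrites its input with the eigenvectors, so pass a copy.
    # Returns eigenvalues w and eigenvectors V with A = V * Diagonal(w) * V'.
    w, V = CUDA.CUSOLVER.syevd!('V', 'U', copy(A))
    s = sqrt.(max.(w, 0f0))   # clamp round-off negatives before sqrt
    return (V .* s') * V'     # column-scaled form of V * Diagonal(s) * V'
end

n = 64
A_cpu = randn(Float32, n, n)
A_cpu = A_cpu * A_cpu' + n * I     # symmetric positive definite test matrix
A = CuArray(A_cpu)

B = sqrtm_psd(A)
@assert isapprox(Matrix(B * B), A_cpu; rtol=1f-2)
```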