Performance of Wrapped & Structured Arrays #373

avik-pal · 2024-12-13T13:16:41Z

Problem Description

Currently, our approach to dealing with wrapped arrays is as follows (taking mul! as an example):

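function mul!(C::TracedRArray, B::AnyTracedRArray, A::AnyTracedRArray)
    # Generic fallback: turn any wrapper into a plain traced array first
    B = materialize_traced_array(B)
    A = materialize_traced_array(A)
    # Then emit a dense contraction (arguments elided)
    Ops.dot_general(....)
    return C
end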

This ensures that the code works as long as a wrapper type implements materialize_traced_array. However, it is not the most efficient solution, which is easy to see in the simple case of a Diagonal wrapper (see @mofeing's comment on #369 for an implementation of Diagonal using dot_general).
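As a rough illustration of why materializing is wasteful for Diagonal, here is a minimal sketch in plain Julia. It uses only LinearAlgebra and ordinary arrays, not Reactant's tracing API, and the names d, A, dense_result and scaled_result are made up for this example. It shows that multiplying by a Diagonal amounts to scaling rows, so the dense square matrix never needs to be built:

```julia
using LinearAlgebra

d = [1.0, 2.0, 3.0]                 # diagonal entries
A = [1.0 2.0; 3.0 4.0; 5.0 6.0]     # 3×2 matrix

# Fallback-style path: materialize the wrapper into a dense 3×3 matrix, then multiply.
dense_result = Matrix(Diagonal(d)) * A

# Structured path: Diagonal(d) * A scales row i of A by d[i], so a broadcast suffices.
scaled_result = d .* A

@assert dense_result ≈ scaled_result
```

The materialized path allocates and multiplies an N×N matrix that is mostly zeros, while the broadcast touches each entry of A once.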

Current list of slow fallbacks

mul!
diag
diagm


mofeing · 2024-12-13T14:08:59Z

This is more of an $N \times M$ problem: it's the combination of array types with methods. For example, a PermutedDimsArray in a matrix multiplication, or in a more general einsum, can also have a more efficient implementation that does not materialize it. But there might be some array types or methods where using the default is fine.
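The PermutedDimsArray case can be sketched in plain Julia as well. This is only an illustration with ordinary arrays, not Reactant code, and the names A, B, P, materialized and contracted are hypothetical. The idea is that the permutation can be absorbed into the choice of contracted dimension instead of copying the data into a new layout first:

```julia
A = [1.0 2.0 3.0; 4.0 5.0 6.0]      # 2×3 parent array
B = [1.0 2.0; 3.0 4.0]              # 2×2
P = PermutedDimsArray(A, (2, 1))    # lazy 3×2 view of A with dimensions swapped

# Fallback-style path: copy the permuted view into a dense array, then multiply.
materialized = Array(P) * B

# Structured path: contract over the first dimension of the parent A directly,
# so the permuted copy is never created.
contracted = [sum(A[k, i] * B[k, j] for k in axes(A, 1))
              for i in axes(A, 2), j in axes(B, 2)]

@assert materialized ≈ contracted
```

In a dot_general-style lowering, this would presumably correspond to picking different contracting dimensions for the parent array rather than emitting a separate transpose.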

Also note that, thanks to the high-level optimization passes we added in Enzyme-JAX, some of the default implementations could end up with the same performance as hand-written ones. But I prefer to emit optimal code directly.
