Take advantage of contiguous arrays #6

itamarst · 2024-08-13T13:20:14Z

NumPy views can point at non-contiguous chunks of memory. General-purpose code that accepts NumPy arrays therefore has to handle both layouts, which in practice means assuming non-contiguous memory. That loses out on potential optimizations, in particular automatic use of SIMD: if the compiler knows the array is contiguous, it can skip a bunch of stride computations and optimize more aggressively.
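To make the distinction concrete, here is a small sketch using only standard NumPy attributes (`flags` and `strides`). The array shape and variable names are just for illustration.

```python
import numpy as np

# A freshly allocated 2-D array is C-contiguous: rows sit back to back in memory.
a = np.arange(12, dtype=np.float64).reshape(3, 4)

# Slicing out one column gives a view, not a copy. Its elements are a full
# row (4 * 8 bytes) apart in memory, so the view is non-contiguous.
col = a[:, 1]

print(a.flags.c_contiguous, a.strides)      # contiguous array and its strides
print(col.flags.c_contiguous, col.strides)  # strided view of the same buffer
print(np.shares_memory(a, col))             # the view reuses a's memory
```

Code that receives `col` has to step through memory using the stride, whereas code that knows it got something like `a` can treat the data as one flat run of values.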

Contiguous inputs are going to be very common; how common depends on the domain and function. So it would be good to get maximum speed for those.

That means compiling two versions of expensive functions, one for contiguous arrays and one for non-contiguous arrays, and choosing the appropriate one based on inputs. And as a library author I would like to do this with minimum code duplication!

Numba does this automatically, but for most languages this requires changes to the code.

itamarst · 2024-08-13T13:22:29Z

Cython support

Cython supports this by using fused types, with minimal code duplication. See the example here: https://pythonspeed.com/articles/faster-cython-simd/

It might be useful to document this pattern in the Cython documentation, at minimum. And it could in theory be added as a language feature so as to minimize boilerplate.

itamarst · 2024-08-13T13:27:49Z

Rust support

The most commonly used crate for arrays is ndarray. It's unclear to me whether it can even generate code that's specifically for contiguous arrays.

rgommers · 2024-08-13T14:01:15Z

In many cases it will also be fine to only support contiguous arrays, and make a copy first when getting non-contiguous arrays (possibly in Python code, before passing it to a function in an extension module). This is a common pattern when using Pythran. The end result is usually better performance on the common case, while still supporting the non-common case.
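A minimal sketch of that copy-first pattern, assuming a hypothetical compiled function (here faked as `_contiguous_only_kernel`) that only accepts C-contiguous input. `np.ascontiguousarray` is the real NumPy call; everything else is illustrative.

```python
import numpy as np

def _contiguous_only_kernel(arr):
    # Hypothetical stand-in for an extension-module function that requires
    # C-contiguous input.
    assert arr.flags.c_contiguous
    return float(arr.sum())

def total(arr):
    # Python-level wrapper: non-contiguous input is copied into a contiguous
    # buffer; already-contiguous input is passed through without a copy.
    contiguous = np.ascontiguousarray(arr)
    return _contiguous_only_kernel(contiguous)

x = np.arange(10.0)
print(total(x))       # no copy needed
print(total(x[::2]))  # wrapper copies the strided view first
```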

itamarst · 2024-09-05T13:14:31Z

I'm a little wary of copying as a solution. High memory usage can have a significant impact on computation costs (RAM isn't cheap), and there's the risk of hitting the swapping performance cliff. And it's already super-easy to end up with way-too-high memory usage with explicit APIs.
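For a sense of scale, here is a quick sketch of how much extra memory a hidden copy would need. `extra_bytes_if_copied` is a made-up helper for illustration, and the array size is arbitrary.

```python
import numpy as np

def extra_bytes_if_copied(arr):
    # A copy-first wrapper allocates a whole new buffer for non-contiguous
    # input, and nothing extra when the input is already C-contiguous.
    return 0 if arr.flags.c_contiguous else arr.nbytes

big = np.zeros((1000, 1000))  # 8,000,000 bytes of float64
view = big[:, ::2]            # every other column: a view, no new memory yet

print(extra_bytes_if_copied(big))   # contiguous, so no copy
print(extra_bytes_if_copied(view))  # a copy would need view.nbytes more bytes
```

The extra allocation grows linearly with the input, so for arrays that already take up a big chunk of RAM the copy can be what pushes the process over the edge.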

So adding intermittent, hidden copying of large arrays seems like a bad idea in generic library APIs, at least. In the context of applications rather than libraries, where the author has better understanding of inputs and run time environment, it might be a good solution though.

itamarst changed the title ~~Takie advantage of contiguous arrays by compiling custom code for this common situation~~ Take advantage of contiguous arrays Aug 13, 2024
