Matrix slicing + stability update.
Patch:
- 2D matrices can now be sliced without copying (zero-copy) on both CPU and GPU. CRTP now separates matrix operations from matrix data, which makes it easier to switch between matrix data implementations without virtual-dispatch overhead.
- Added more tests for reductions
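
For reviewers unfamiliar with the pattern, here is a minimal CPU-only sketch of how CRTP can keep operations apart from storage while a slice stays zero-copy. The names `MatrixOps`, `DenseMatrix`, `MatrixView`, `slice` and the row-major stride layout are illustrative assumptions, not the actual classes in this patch.

```cpp
#include <cstddef>
#include <vector>

// Operations live in the CRTP base; storage lives in the derived type.
// Calls resolve at compile time, so no virtual table is involved.
template <typename Derived>
struct MatrixOps {
    const Derived& self() const { return static_cast<const Derived&>(*this); }

    // A reduction written once and reused by every storage type.
    double sum() const {
        double total = 0.0;
        for (std::size_t r = 0; r < self().rows(); ++r)
            for (std::size_t c = 0; c < self().cols(); ++c)
                total += self().at(r, c);
        return total;
    }
};

// Hypothetical non-owning view: pointer, shape and row stride only.
class MatrixView : public MatrixOps<MatrixView> {
public:
    MatrixView(const double* data, std::size_t rows, std::size_t cols,
               std::size_t stride)
        : data_(data), rows_(rows), cols_(cols), stride_(stride) {}

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }
    const double& at(std::size_t r, std::size_t c) const {
        return data_[r * stride_ + c];
    }

private:
    const double* data_;
    std::size_t rows_, cols_, stride_;
};

// Hypothetical owning, row-major storage.
class DenseMatrix : public MatrixOps<DenseMatrix> {
public:
    DenseMatrix(std::size_t rows, std::size_t cols)
        : buf_(rows * cols, 0.0), rows_(rows), cols_(cols) {}

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }
    double& at(std::size_t r, std::size_t c) { return buf_[r * cols_ + c]; }
    const double& at(std::size_t r, std::size_t c) const {
        return buf_[r * cols_ + c];
    }

    // Zero-copy slice: the view points into buf_ and reuses the parent's
    // stride. The view must not outlive the matrix it was taken from.
    MatrixView slice(std::size_t r0, std::size_t c0,
                     std::size_t nr, std::size_t nc) const {
        return MatrixView(buf_.data() + r0 * cols_ + c0, nr, nc, cols_);
    }

private:
    std::vector<double> buf_;
    std::size_t rows_, cols_;
};
```

In this sketch `m.slice(1, 1, 2, 2).sum()` reduces over the parent buffer directly. A GPU-backed storage type could in principle plug into the same base by exposing the same `rows`/`cols`/`at` interface.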
Fixes:
- Merging two null measurements caused a division by zero. This matters especially for masked operations.
- Fixed wrong indexing in statistics_p2l CUDA implementation
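
To illustrate the division-by-zero case, here is a rough sketch that assumes a measurement stores a sample count, a running mean and a sum of squared deviations, merged with the standard pairwise update. The `Measurement` struct and `merge` function are hypothetical and may not match the real types in this repository.

```cpp
#include <cstddef>

// Hypothetical running-statistics record. A "null" measurement has count == 0,
// which happens for example when every element of a tile is masked out.
struct Measurement {
    std::size_t count = 0;
    double mean = 0.0;
    double m2 = 0.0;  // sum of squared deviations from the mean
};

// Pairwise merge of two partial measurements.
inline Measurement merge(const Measurement& a, const Measurement& b) {
    const std::size_t n = a.count + b.count;

    // Guard: without this, merging two null measurements divides by n == 0
    // below and poisons the result with NaN.
    if (n == 0) {
        return Measurement{};
    }

    const double delta = b.mean - a.mean;
    const double b_frac = static_cast<double>(b.count) / static_cast<double>(n);

    Measurement out;
    out.count = n;
    out.mean = a.mean + delta * b_frac;
    out.m2 = a.m2 + b.m2 + delta * delta * static_cast<double>(a.count) * b_frac;
    return out;
}
```

With the guard in place, merging two empty measurements yields another empty measurement, so a fully masked region can flow through a reduction tree without special handling by the caller.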