-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Initial itkHalideDiscreteGaussianImageFilter #10
Conversation
Implemented with separable convolution. Handles separate sigma and kernel size for each dimension to support anisotropic images. Strip out the CastImageFilter and MinimalStandardRandomVariateGenerator placeholders.
About 200ms on my machine. Bandwidth-limited on reasonable kernel/image sizes.
for parity with itk::DiscreteGaussianImageFilter.
It wasn't used anyway, but this should make the filter better-behaved. Not sure how to disable/remove the MultiThreader object; setting it null is not valid.
Change the test to blur CTChest. The reference output was generated with itk::DiscreteGaussianImageFilter directly, so the Halide output is verified to be numerically compatible.
With f968c14 the schedule is not so naive. It blurs CTChest (512x512x139) in about 230ms for all reasonable kernel sizes. Memory bandwidth is the bottleneck, there's simply not enough compute to really stress the GPU until more unreasonable kernel sizes (order of 100x100x100). Because of that memory bandwidth limitation, CPU itkDiscreteGaussianImageFilter beats this up till kernel size around 10x10x10 or so. After that point the CPU compute is slower than the GPU bandwidth. So with that I'll say it closes #7 although there certainly is room for further improvement. I'll refine as I generate benchmark figures for a wider range of image and kernel sizes. |
The linter is happy, so I'll go ahead and merge despite failing checks due to #9. |
Closes #3, closes #4, closes #7.
Progress toward #5.
Running the local build in Python takes a bit of work - your local ITK build has a file
build/Wrapping/Generators/Python/WrapITK.pth
; you must manually copy or link this file in your Python environment'ssite-packages
.I tried setting up a Jupyter notebook to test the thing locally and generate benchmarks for #5; there is some difficulty here since I can't easily install
itkwidgets
(this pip-installsitk
and overridesWrapITK.pth
) but that's not strictly a blocker, it just makes it difficult to validate results. We can generate plots despite this. In the meantime I created a thin ScriptedCLI module for Slicer to validate results.The test needs fixed, but this is not trivial since test input and expected output hash need to be updated.