This repository contains easy-to-read Python/CUDA implementations of fundamental GPU computing primitives: map, reduce, prefix sum (scan), split, radix sort, and histogram. I build on these primitives to implement the following image processing operations, also in Python/CUDA: Gaussian blurring, bilateral filtering, histogram equalization, red-eye removal, and seamless image cloning.
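To give a feel for how the primitives compose, here is a minimal sequential sketch in plain Python. It is not the CUDA code from the notebooks: the function names `exclusive_scan`, `split`, and `radix_sort` are hypothetical, and everything runs on the CPU, so it only illustrates the logic, assuming non-negative integer keys. A split can be expressed with two exclusive scans, and a least-significant-bit-first radix sort can be expressed as one split per bit.

```python
def exclusive_scan(values):
    """Exclusive prefix sum: out[i] is the sum of values[0..i-1]."""
    out = []
    running = 0
    for v in values:
        out.append(running)
        running += v
    return out


def split(keys, bit):
    """Stable partition: keys with the given bit clear come first,
    followed by keys with the bit set. Relative order is preserved."""
    flags = [(k >> bit) & 1 for k in keys]   # 1 where the bit is set
    zeros = [1 - f for f in flags]           # 1 where the bit is clear
    zero_pos = exclusive_scan(zeros)         # destination among the "clear" keys
    one_pos = exclusive_scan(flags)          # destination among the "set" keys
    num_zeros = sum(zeros)                   # a reduce: where the "set" keys start
    out = [None] * len(keys)
    for i, k in enumerate(keys):
        dest = zero_pos[i] if flags[i] == 0 else num_zeros + one_pos[i]
        out[dest] = k
    return out


def radix_sort(keys, num_bits=32):
    """LSD radix sort of non-negative integers: one stable split per bit."""
    for bit in range(num_bits):
        keys = split(keys, bit)
    return keys
```

For example, `radix_sort([170, 45, 75, 90, 2, 802, 24, 66], num_bits=10)` returns the keys in ascending order. On a GPU the scans and the final scatter would each run in parallel; the loops here only stand in for that.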
The code in this repository can be browsed online with the IPython Notebook Viewer using the links below.