Katy SIMD code generator
andyross/katy
Katy[1] is a parallel computing environment aimed at wide CPU SIMD implementations like AVX. (Support for AVX1 exists right now, but the integer types and gather instructions from AVX2 are still unimplemented.) It provides a "macro-like" C++ API for expressing code that can then be executed directly in the host process via function pointer invocation on array data.

The internal code generator is an optimizing compiler. It implements an SSA framework to do common subexpression and dead code elimination, copy propagation, constant folding, loop hoisting, and a range of targeted "factorization" optimizations.

There is also a library implementation of transcendental math functions, a triangle rasterization subroutine, and an almost completely OpenGL-compatible texture lookup (mip mapping, trilinear filtering and anisotropy are all supported!) and framebuffer implementation. These are wired only to a test routine that draws the author's face into a PNM file, but the test works, and benchmarks on an Ivy Bridge laptop show that an output fragment textured via trilinear filtering (i.e. 8 interpolated reads per pixel) can be done in just 60 cycles, amortized.

The C++ API is, as mentioned, "macro-like". Commands to emit instructions are implemented as overloaded operators on a "vr" class, which represents a single virtual SIMD register. The resulting source code looks very much like the intended algorithm, with the surrounding C++ code providing a macro language for metaprogramming. For an example, see the texture code generation: there is a single code generation path for 1-, 2-, and 3-dimensional textures, with a compile-time C++ loop over dimensions.

Katy doesn't do anything "real" yet, though most of the hard work for a 3D renderer is complete, and plumbing out (for example) a Mesa/Gallium backend should be possible.

What do I do?!
==============

+ "make" to build, "make test" to run (almost all of) the existing tests. These can run either in a single-threaded bytecode interpreter or as native code. On an AVX-capable machine the tests will run both ways, but the interpreter is always available (even on other architectures).

+ Run one of the test binaries (they all use the same framework) with "--help" to see what can be done. There is a set of named tests that can be run independently.

+ Try a single test with --dump-asm to see the generated assembly interleaved with the bytecode intermediate language. Compare with the C++ source for the test.

+ Try --log --no-avx to watch the interpreter run step by step and log its execution.

+ Under gdb, run a test with --break to issue a debugger breakpoint instruction (int 3) at the beginning of the generated code. Then run it and step through it.

[1] The Katy Freeway along Interstate 10 in Houston is the widest automobile highway in the world, with as many as 26 lanes of traffic along some stretches. The author has never seen it.
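
To make the "macro-like" idea described earlier more concrete, here is a minimal sketch of the general technique: overloaded operators on a register handle record instructions instead of computing values, an ordinary C++ loop unrolls at code-generation time, and a scalar interpreter then executes the recorded bytecode. This is NOT Katy's actual API. Every name in it (VReg, Program, Insn, squared_length, run) is hypothetical and invented for illustration, and the real "vr" class, bytecode format, SIMD width and optimizer are certainly different.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch only: none of these names come from Katy itself.
enum class Op { Input, Const, Add, Mul };

// One bytecode instruction. Instruction i defines virtual register i,
// so every register is assigned exactly once (an SSA-like form).
struct Insn { Op op; int a; int b; float imm; };

struct Program {
    std::vector<Insn> code;
    // Append an instruction and return the register it defines.
    int emit(Op op, int a = -1, int b = -1, float imm = 0.0f) {
        code.push_back({op, a, b, imm});
        return static_cast<int>(code.size()) - 1;
    }
};

// Handle to one virtual register (a stand-in for the role "vr" plays).
struct VReg { Program* prog; int id; };

VReg input(Program& p, int slot)   { return {&p, p.emit(Op::Input, slot)}; }
VReg constant(Program& p, float v) { return {&p, p.emit(Op::Const, -1, -1, v)}; }

// The operators emit instructions; they do no arithmetic themselves.
VReg operator+(VReg x, VReg y) { return {x.prog, x.prog->emit(Op::Add, x.id, y.id)}; }
VReg operator*(VReg x, VReg y) { return {x.prog, x.prog->emit(Op::Mul, x.id, y.id)}; }

// The "macro language": this loop runs while generating code, so one
// source path yields straight-line code for 1, 2 or 3 dimensions.
VReg squared_length(Program& p, int dims) {
    VReg sum = constant(p, 0.0f);
    for (int d = 0; d < dims; d++) {
        VReg c = input(p, d);
        sum = sum + c * c;
    }
    return sum;
}

// Scalar interpreter: evaluates each instruction in order and returns
// the value of the last register defined.
float run(const Program& p, const std::vector<float>& inputs) {
    assert(!p.code.empty());
    std::vector<float> regs(p.code.size());
    for (std::size_t i = 0; i < p.code.size(); i++) {
        const Insn& in = p.code[i];
        switch (in.op) {
        case Op::Input: regs[i] = inputs[in.a]; break;
        case Op::Const: regs[i] = in.imm; break;
        case Op::Add:   regs[i] = regs[in.a] + regs[in.b]; break;
        case Op::Mul:   regs[i] = regs[in.a] * regs[in.b]; break;
        }
    }
    return regs.back();
}
```

Calling squared_length(p, 3) on an empty Program records ten instructions (one constant, then an input, a multiply and an add per dimension), and run(p, {1, 2, 3}) evaluates them to 14. In a design like this, an optimizer pass would sit between recording and execution. For instance, it could fold away the initial "0 + x" add before native code is emitted.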