Releases: JuliaORNL/JACC.jl
JACC v0.1.0
What's Changed
- Reorder GPU grid indices by @williamfgc in #104
- Reorder AMDGPU gridsize by @williamfgc in #105
- Revert reordering by @williamfgc in #106
- swapped thread dimension by @ygtangg in #107
- Changed thread dimension to: 1-32-32 by @ygtangg in #108
- Reverted GPU thread dimension: 32-32-1 by @ygtangg in #109
- Promoted JACC.shared for CUDA backend. Added test case in tests_cuda by @pedrovalerolara in #111
- Promoted JACC.shared for AMDGPU backend. Test added by @pedrovalerolara in #112
- Promoted JACC.shared OneAPI implementation. Added testing. by @pedrovalerolara in #113
- Blas level1 by @hetmankad in #110
- Added JACC.multi for CUDA. Testing doesn't work, I got a segmentation… by @pedrovalerolara in #116
- Added JACC.multi implementation for the AMDGPU backend, non-included … by @pedrovalerolara in #124
- Update JACCMULTI.jl by @pedrovalerolara in #125
- Update to latest checkout action by @williamfgc in #126
- Added stencil-aware functions for JACC.multi on CUDA back ends. There … by @pedrovalerolara in #127
- Integrate other PRs for fixing extensions by @PhilipFackler in #123
- Bump Julia to 1.10 on AMD GPU CI on cousteau by @williamfgc in #128
- Added ghost cells support for JACC.multi on AMDGPU backend. There is … by @pedrovalerolara in #132
- Added range checks to parallel_for implementations by @PhilipFackler in #131
- Moved and refactored most tests as common portable versions by @PhilipFackler in #135
- Custom operators for parallel_reduce by @PhilipFackler in #120
- Capitalized modules Multi and Experimental and applied formatting by @PhilipFackler in #146
- Release v0.1.0 by @PhilipFackler in #147
- Fixed [compat] entry for julia by @PhilipFackler in #148
New Contributors
- @ygtangg made their first contribution in #107
- @hetmankad made their first contribution in #110
Full Changelog: v0.0.5...v0.1.0
JACC v0.0.5
What's Changed
- Add 3D parallel_for by @williamfgc in #102
- Added JACC.BLAS module, and fixed some bugs in parallel_reduce implem… by @pedrovalerolara in #90
- Added experimental sub-module and array by @pedrovalerolara in #92
Full Changelog: v0.0.4...v0.0.5
JACC v0.0.4
What's Changed
Remove precompilation due to the nature of package extension for back ends
Add support for Atomix.@atomic
Fixes for AMDGPU
- Fix AMDGPU by @williamfgc in #69
- Add precompile(false) by @williamfgc in #71
- Add support for Atomix's @atomic across AMDGPU and CUDA by @williamfgc in #74
- Support portable zeros and ones convenience functions by @williamfgc in #75
- Improve block and thread calculations and invoke only if in range by @PhilipFackler in #76
New Contributors
- @PhilipFackler made their first contribution in #76
Full Changelog: v0.0.3...v0.0.4
JACC v0.0.3
What's Changed
- Refactor tests by @williamfgc in #64
- Fixed some bugs on AMDGPU backend regarding synchronization and use of shared memory. by @pedrovalerolara in #65
- Update Project.toml by @pedrovalerolara in #66
Full Changelog: v0.0.2...v0.0.3
JACC v0.0.2
Major Changes
- Added CI for NVIDIA RTX-A4000 and AMD MI100 GPUs on ORNL systems, e.g. ExCL
- Added single-threaded and threaded CI for CPUs on Linux and macOS runners with the maybe_threaded macro
- Initial work on oneAPI back end
What's Changed
- Added oneAPI implementation and test by @pedrovalerolara in #14
- Fixed bugs, optimized implementation, added test codes in test/test-p… by @pedrovalerolara in #20
- New optimization for parallel reduce on CUDA, AMDGPU and oneAPI using… by @pedrovalerolara in #22
- Parallel reduce MN optimized for CUDA, AMDGPU, and oneAPI using multi… by @pedrovalerolara in #24
- Add AMD GPU CI on ExCL cousteau by @williamfgc in #25
- Add commands to verify CI environment. by @Geekdude in #28
- CI: Switched back to guibranco/github-status-action-v2 since the proxy fix was merged upstream. by @Geekdude in #38
- Revert "CI: Switched back to guibranco/github-status-action-v2 since … by @Geekdude in #39
- CI: Update github-status-action to v1.1.10 by @Geekdude in #41
- Bugs fixed in AMDGPU back end, indexes, group size in second kernel i… by @pedrovalerolara in #42
- Added performance tests for AMDGPU by @pedrovalerolara in #43
- Refactor CI and address AMD sync issue by @williamfgc in #47
- Upgrade GA CPU CI by @williamfgc in #48
- CI: Update Instantiate to load rocm, just like Test. by @Geekdude in #54
- Fix precompilation by @michel2323 in #52
- Introduce maybe_threaded by @williamfgc in #60
Full Changelog: v0.0.1...v0.0.2
JACC v0.0.1
Initial release with minimal functionality: simple signatures for parallel_for and parallel_reduce.