Thrust 1.10.0 (NVIDIA HPC SDK 20.9, CUDA Toolkit 11.2)
Thrust 1.10.0 is the major release accompanying the NVIDIA HPC SDK 20.9 release and the CUDA Toolkit 11.2 release. It drops support for C++03, GCC < 5, Clang < 6, and MSVC < 2017. It also overhauls CMake support. Finally, we now have a Code of Conduct for contributors: https://github.com/thrust/thrust/blob/main/CODE_OF_CONDUCT.md
Breaking Changes
- C++03 is no longer supported.
- GCC < 5, Clang < 6, and MSVC < 2017 are no longer supported.
- C++11 is deprecated. Using this dialect will generate a compile-time warning. These warnings can be suppressed by defining `THRUST_IGNORE_DEPRECATED_CPP_DIALECT` or `THRUST_IGNORE_DEPRECATED_CPP_11`. Suppression is only a short-term solution. We will be dropping support for C++11 in the near future.
- Asynchronous algorithms now require C++14.
- CMake < 3.15 is no longer supported.
- The default branch on GitHub is now called `main`.
- Allocator and vector classes have been replaced with alias templates.
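One practical consequence of replacing a class template with an alias template is that the name no longer denotes a type of its own, so user code that specialized or forward-declared the old class template stops compiling. The stand-alone sketch below illustrates this general C++ behavior. The names `demo::vector`, `demo::describe`, and `demo::describe_alias` are hypothetical and chosen for illustration; they do not reproduce Thrust's actual definitions.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>
#include <vector>

namespace demo {

// Hypothetical library type: an alias template, not a distinct class template.
template <typename T, typename Alloc = std::allocator<T>>
using vector = std::vector<T, Alloc>;

// An alias template names the aliased type exactly; no new type is created.
static_assert(std::is_same<vector<int>, std::vector<int>>::value,
              "alias and aliased type are identical");

// Overloads cannot tell the alias apart from the underlying type,
// so this single overload serves both spellings.
template <typename T, typename Alloc>
std::string describe(const std::vector<T, Alloc>&) {
  return "std::vector";
}

// The following would be ill-formed, because alias templates
// cannot be explicitly or partially specialized:
//   template <typename T> class vector<T*> { /* ... */ };

inline std::string describe_alias() {
  vector<int> v{1, 2, 3};   // same type as std::vector<int>
  assert(v.size() == 3);
  return describe(v);       // resolves to the std::vector overload
}

} // namespace demo
```

Calling `demo::describe_alias()` yields the string produced by the `std::vector` overload, which shows that the alias introduces no separate type to overload on or specialize.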
New Features
- #1159: CMake multi-config support, which allows multiple combinations of host and device systems to be built and tested at once. More details can be found here: https://github.com/thrust/thrust/blob/main/CONTRIBUTING.md#multi-config-cmake-options
- CMake refactoring:
  - Added install targets to CMake builds.
  - Added support for CUB tests and examples.
  - Thrust can be added to another CMake project by calling `add_subdirectory` with the Thrust source root (see #976). An example can be found here: https://github.com/thrust/thrust/blob/main/examples/cmake/add_subdir/CMakeLists.txt
  - CMake < 3.15 is no longer supported.
  - Dialects are now configured through target properties. A new `THRUST_CPP_DIALECT` option has been added for single config mode. Logic that modified `CMAKE_CXX_STANDARD` and `CMAKE_CUDA_STANDARD` has been eliminated.
  - Testing related CMake code has been moved to `testing/CMakeLists.txt`.
  - Example related CMake code has been moved to `examples/CMakeLists.txt`.
  - Header testing related CMake code has been moved to `cmake/ThrustHeaderTesting.cmake`.
  - CUDA configuration CMake code has been moved to `cmake/ThrustCUDAConfig.cmake`.
  - Now we explicitly `include(cmake/*.cmake)` files rather than searching `CMAKE_MODULE_PATH`; we only want to use the ones in the repo.
- `thrust::transform_input_output_iterator`, a variant of transform iterator adapter that works as both an input iterator and an output iterator. The given input function is applied after reading from the wrapped iterator while the output function is applied before writing to the wrapped iterator. Thanks to Trevor Smith for this contribution.
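The read and write behavior described for `thrust::transform_input_output_iterator` can be modeled in a few lines of plain C++. The sketch below is a simplified stand-in, not Thrust's implementation: `sketch::io_proxy`, `sketch::deref`, `sketch::div_100`, `sketch::times_100`, and `sketch::demo` are hypothetical names, and the proxy covers a single element rather than a full iterator. It assumes integer storage that holds values scaled by 100.

```cpp
#include <cassert>
#include <utility>
#include <vector>

namespace sketch {

// Hypothetical proxy for one element of the wrapped range.
template <typename Iter, typename InFn, typename OutFn>
class io_proxy {
public:
  // Type produced by applying the input function to a wrapped element.
  using input_type =
      decltype(std::declval<const InFn&>()(*std::declval<const Iter&>()));

  io_proxy(Iter it, InFn in, OutFn out) : it_(it), in_(in), out_(out) {}

  // Read: load from the wrapped iterator, then apply the input function.
  operator input_type() const { return in_(*it_); }

  // Write: apply the output function, then store through the wrapped iterator.
  template <typename T>
  io_proxy& operator=(const T& value) {
    *it_ = out_(value);
    return *this;
  }

private:
  Iter it_;
  InFn in_;
  OutFn out_;
};

template <typename Iter, typename InFn, typename OutFn>
io_proxy<Iter, InFn, OutFn> deref(Iter it, InFn in, OutFn out) {
  return io_proxy<Iter, InFn, OutFn>(it, in, out);
}

inline int div_100(int stored) { return stored / 100; }   // input function
inline int times_100(int value) { return value * 100; }   // output function

// Returns {value read through the proxy, raw value stored after the write}.
inline std::vector<int> demo() {
  std::vector<int> storage{100, 200, 300};
  auto p = deref(storage.begin(), div_100, times_100);
  int first = p;   // reads 100 from storage, input function yields 1
  p = first + 4;   // output function maps 5 to 500 before storing
  return {first, storage[0]};
}

} // namespace sketch
```

In this model `sketch::demo()` returns `{1, 500}`: the caller only ever sees unscaled values, while the underlying storage keeps the scaled representation.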
Other Enhancements
- Contributor documentation: https://github.com/thrust/thrust/blob/main/CONTRIBUTING.md
- Code of Conduct: https://github.com/thrust/thrust/blob/main/CODE_OF_CONDUCT.md. Thanks to Conor Hoekstra for this contribution.
- Support for all combinations of host and device systems.
- C++17 support.
- #1221: Allocator and vector classes have been replaced with alias templates. Thanks to Michael Francis for this contribution.
- #1186: Use placeholder expressions to simplify the definitions of a number of algorithms. Thanks to Michael Francis for this contribution.
- #1170: More conforming semantics for scan algorithms:
  - Follow P0571's guidance regarding intermediate types.
    - https://wg21.link/P0571
    - The accumulator's type is now:
      - The type of the user-supplied initial value (if provided), or
      - The input iterator's value type if no initial value.
  - Follow C++ standard guidance for default binary operator type.
    - https://eel.is/c++draft/exclusive.scan#1
    - Thrust binary/unary functors now specialize a default void template parameter. Types are deduced and forwarded transparently.
    - Updated the scan's default binary operator to the new `thrust::plus<>` specialization.
  - The `thrust::intermediate_type_from_function_and_iterators` helper is no longer needed and has been removed.
- #1255: Always use `cudaStreamSynchronize` instead of `cudaDeviceSynchronize` if the execution policy has a stream attached to it. Thanks to Rong Ou for this contribution.
- #1201: Tests for correct handling of legacy and per-thread default streams. Thanks to Rong Ou for this contribution.
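The scan semantics listed under #1170 have a visible effect on results: when an initial value is supplied, its type decides the accumulator type, regardless of the input's value type. The following sequential sketch models that rule in standard C++14, using `std::plus<>` as the transparent default operator. It is an illustration under those stated rules, not Thrust's code; `sketch::exclusive_scan_sketch` and the two helper functions are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <vector>

namespace sketch {

// Sequential exclusive scan whose accumulator has the type of `init`.
// The default operator is the transparent std::plus<>, which deduces
// its operand types at each call.
template <typename InIt, typename OutIt, typename T,
          typename BinaryOp = std::plus<>>
OutIt exclusive_scan_sketch(InIt first, InIt last, OutIt out, T init,
                            BinaryOp op = BinaryOp{}) {
  T acc = init;  // accumulator type == type of the initial value
  for (; first != last; ++first, ++out) {
    // Compute the next partial result before writing, so that the
    // scan also works when the output range aliases the input range.
    T next = static_cast<T>(op(acc, *first));
    *out = acc;
    acc = next;
  }
  return out;
}

// Integer initial value: every partial sum is truncated to int.
inline std::vector<double> scan_with_int_init() {
  std::vector<double> in{0.5, 0.5, 0.5}, out(3);
  exclusive_scan_sketch(in.begin(), in.end(), out.begin(), 0);
  return out;
}

// Floating-point initial value: partial sums keep their fractions.
inline std::vector<double> scan_with_double_init() {
  std::vector<double> in{0.5, 0.5, 0.5}, out(3);
  exclusive_scan_sketch(in.begin(), in.end(), out.begin(), 0.0);
  return out;
}

} // namespace sketch
```

With the integer initial value `0`, the sketch produces `{0, 0, 0}`, because each partial sum is converted back to `int`. With `0.0` it produces `{0, 0.5, 1}`. Callers who scan floating-point data should therefore pass an initial value of a floating-point type.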
Bug Fixes
- #1260: Fix `thrust::transform_inclusive_scan` with heterogeneous types. Thanks to Rong Ou for this contribution.
- #1258, NVC++ FS #28463: Ensure the CUDA radix sort backend synchronizes before returning; otherwise, copies from temporary storage will race with destruction of said temporary storage.
- #1264: Evaluate `CUDA_CUB_RET_IF_FAIL` macro argument only once. Thanks to Jason Lowe for this contribution.
- #1262: Add missing `<stdexcept>` header.
- #1250: Restore some `THRUST_DECLTYPE_RETURNS` macros in async test implementations.
- #1249: Use `std::iota` in `CUDATestDriver::target_devices`. Thanks to Michael Francis for this contribution.
- #1244: Check for macro collisions with system headers during header testing.
- #1224: Remove unnecessary SFINAE contexts from asynchronous algorithms.
- #1190: Make `out_of_memory_recovery` test trigger faster.
- #1187: Eliminate superfluous iterators specific to the CUDA backend.
- #1181: Various fixes for GoUDA. Thanks to Andrei Tchouprakov for this contribution.
- #1178, #1229: Use transparent functionals in placeholder expressions, fixing issues with `thrust::device_reference` and placeholder expressions and `thrust::find` with asymmetric equality operators.
- thrust/thrust#1153: Switch to placement new instead of assignment to construct items in uninitialized memory. Thanks to Hugh Winkler for this contribution.
- #1050: Fix compilation of asynchronous algorithms when RDC is enabled.
- #1042: Correct return type of `thrust::detail::predicate_to_integral` from `bool` to `IntegralType`. Thanks to Andreas Hehn for this contribution.
- #1009: Avoid returning uninitialized allocators. Thanks to Zhihao Yuan for this contribution.
- #990: Add missing `<thrust/system/cuda/memory.h>` include to `<thrust/system/cuda/detail/malloc_and_free.h>`. Thanks to Robert Maynard for this contribution.
- #966: Fix spurious MSVC conversion with loss of data warning in sort algorithms. Thanks to Zhihao Yuan for this contribution.
- Add more metadata to mock specializations for testing iterator in `testing/copy.cu`.
- Add missing include to shuffle unit test.
- Specialize `thrust::wrapped_function` for `void` return types because MSVC is not a fan of the pattern `return static_cast<void>(expr);`.
- Replace deprecated `tbb/tbb_thread.h` with `<thread>`.
- Fix overcounting of initial value in TBB scans.
- Use `thrust::advance` instead of `+=` for generic iterators.
- Wrap the OMP flags in `-Xcompiler` for NVCC.
- Extend `ASSERT_STATIC_ASSERT` skip for the OMP backend.
- Add missing header caught by `tbb.cuda` configs.
- Fix "unsafe API" warnings in examples on MSVC: `s/fopen/fstream/`
- Various C++17 fixes.
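The placement new fix listed above (#1153) addresses a general C++ rule: assigning to an object requires that the object already exists, so assignment into raw, uninitialized memory is undefined behavior for non-trivial types. The sketch below shows the pattern in standard C++ with `std::string` elements. It is an illustration of the technique only, not the code changed in Thrust, and `sketch::copy_into_raw_storage` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>
#include <string>

namespace sketch {

// Copies two strings into raw storage and returns them joined by a comma.
inline std::string copy_into_raw_storage() {
  using S = std::string;
  const S src[2] = {"alpha", "beta"};

  // Raw bytes only: no std::string objects exist in this buffer yet.
  void* raw = std::malloc(sizeof(S) * 2);
  if (raw == nullptr) throw std::bad_alloc();
  S* dst = static_cast<S*>(raw);

  // Wrong: `dst[i] = src[i];` would call operator= on a nonexistent object.
  // Right: construct each element in place with placement new.
  for (int i = 0; i < 2; ++i) {
    ::new (static_cast<void*>(dst + i)) S(src[i]);
  }

  S joined = dst[0] + "," + dst[1];

  // Objects created with placement new must be destroyed explicitly
  // before the underlying storage is released.
  for (int i = 0; i < 2; ++i) {
    dst[i].~S();
  }
  std::free(raw);
  return joined;
}

} // namespace sketch
```

Here `sketch::copy_into_raw_storage()` returns `"alpha,beta"`. The standard library's `std::uninitialized_copy` packages the same construct-in-place step for whole ranges.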