Add environment to encapsulate information needed for cudax::vector
#2775
base: main
🟩 CI finished in 1h 28m: Pass: 100%/54 | Total: 4h 35m | Avg: 5m 05s | Max: 20m 32s | Hits: 80%/250
| | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| | CUB |
| | Thrust |
| +/- | CUDA Experimental |
| | python |
| | CCCL C Parallel Library |
| | Catch2Helper |

Modifications in project or dependencies?

| | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| | CUB |
| | Thrust |
| +/- | CUDA Experimental |
| | python |
| | CCCL C Parallel Library |
| | Catch2Helper |

🏃 Runner counts (total jobs: 54)

| # | Runner |
|---|---|
| 43 | linux-amd64-cpu16 |
| 5 | linux-amd64-gpu-v100-latest-1 |
| 4 | linux-arm64-cpu16 |
| 2 | windows-amd64-cpu16 |
get_memory_resource CPO to query a type for a memory resource
cudax::vector
@@ -51,7 +51,8 @@ _CCCL_INLINE_VAR constexpr bool has_property<
   _CUDA_VSTD::void_t<decltype(get_property(_CUDA_VSTD::declval<const _Resource&>(), _CUDA_VSTD::declval<_Property>()))>> =
   true;

-#  if defined(_CCCL_COMPILER_NVHPC) // NVHPC has issues accepting this at compile time if it is in a variable template
+// NVHPC and NVCC have issues accepting this at compile time if it is in a variable template
+#  if defined(_CCCL_COMPILER_NVHPC) || defined(_CCCL_CUDA_COMPILER_NVCC)
@ericniebler I was facing internal compiler errors with nvcc too
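For readers unfamiliar with the pattern in this hunk: has_property is a void_t-based detection trait written as a variable template. The sketch below is not the libcu++ code. It uses std:: names instead of _CUDA_VSTD, and the types device_accessible, my_resource and plain_resource are hypothetical. It places the variable-template form next to an equivalent class-template form, which is one common fallback when a compiler struggles with the former; whether the guarded code in this PR does exactly that is an assumption, since the hunk does not show it.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical property tag and resources, for illustration only.
struct device_accessible {};

struct my_resource {
  // Hidden friend, found only through argument-dependent lookup.
  friend void get_property(const my_resource&, device_accessible) noexcept {}
};

struct plain_resource {}; // advertises no properties

// Form 1: detection as a variable template with a partial specialization.
template <class Resource, class Property, class = void>
inline constexpr bool has_property_v = false;

template <class Resource, class Property>
inline constexpr bool has_property_v<
  Resource,
  Property,
  std::void_t<decltype(get_property(std::declval<const Resource&>(), std::declval<Property>()))>> = true;

// Form 2: the same detection as a class template.
template <class Resource, class Property, class = void>
struct has_property_impl : std::false_type {};

template <class Resource, class Property>
struct has_property_impl<
  Resource,
  Property,
  std::void_t<decltype(get_property(std::declval<const Resource&>(), std::declval<Property>()))>>
    : std::true_type {};

// Both forms agree on both resources.
static_assert(has_property_v<my_resource, device_accessible>);
static_assert(!has_property_v<plain_resource, device_accessible>);
static_assert(has_property_impl<my_resource, device_accessible>::value);
static_assert(!has_property_impl<plain_resource, device_accessible>::value);
```

The two forms are interchangeable for callers, so a preprocessor guard can pick one per compiler without changing the trait's meaning.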
//! @param __mr The any_resource passed in
//! @param __stream The stream_ref passed in
//! @param __policy The execution_policy passed in
_CCCL_HIDE_FROM_ABI env_t(__resource __mr,
Are we going to do anything to allow constructing an environment with the ctor arguments in any order for convenience?
They can just build their own env and pass that around?
seems not hard to allow what jake is suggesting. also, if the user passes an instance of any_resource<Properties...> we should be able to deduce the class template parameters.
/home/miscco/cccl/cudax/test/execution/env.cu(112): error: cannot deduce class template arguments
cudax::execution::env_t env{cudax::mr::any_async_resource<cuda::mr::device_accessible>{test_resource{}}, stream};
yes. it would require changes to support CTAD.
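To make the two suggestions in this thread concrete, here is a minimal sketch of an environment that accepts its arguments in either order and supports CTAD through deduction guides. Every type here (any_resource, stream_ref, execution_policy, env_t) is a hypothetical stand-in, not the cudax type, and the sketch does not attempt to explain why deduction fails in the actual test above.

```cpp
#include <type_traits>

// Hypothetical stand-ins for the real cudax types.
struct device_accessible {};

template <class... Properties>
struct any_resource { int id = 0; };

struct stream_ref { int handle = 0; };

enum class execution_policy { host, device };

template <class... Properties>
class env_t {
public:
  // Canonical order: resource, stream, policy.
  env_t(any_resource<Properties...> mr, stream_ref stream,
        execution_policy policy = execution_policy::device)
      : mr_(mr), stream_(stream), policy_(policy) {}

  // Convenience overload: stream first, delegating to the canonical one.
  env_t(stream_ref stream, any_resource<Properties...> mr,
        execution_policy policy = execution_policy::device)
      : env_t(mr, stream, policy) {}

  const any_resource<Properties...>& resource() const { return mr_; }
  stream_ref stream() const { return stream_; }
  execution_policy policy() const { return policy_; }

private:
  any_resource<Properties...> mr_;
  stream_ref stream_;
  execution_policy policy_;
};

// Explicit deduction guides: Properties... come from the resource argument,
// whichever position it is in. Three-argument calls use the implicit guides
// generated from the constructors.
template <class... Properties>
env_t(any_resource<Properties...>, stream_ref) -> env_t<Properties...>;

template <class... Properties>
env_t(stream_ref, any_resource<Properties...>) -> env_t<Properties...>;

// Builds the same environment in both argument orders and compares them.
inline bool ctad_example() {
  any_resource<device_accessible> mr{42};
  stream_ref stream{7};

  env_t a{mr, stream}; // deduced as env_t<device_accessible>
  env_t b{stream, mr}; // same type, arguments swapped

  static_assert(std::is_same_v<decltype(a), env_t<device_accessible>>);
  static_assert(std::is_same_v<decltype(b), env_t<device_accessible>>);
  return a.resource().id == b.resource().id
      && a.stream().handle == b.stream().handle
      && a.policy() == b.policy();
}
```

One overload per ordering scales poorly beyond two or three arguments, which is a reasonable argument for the "build your own env" position taken earlier in the thread.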
//! @param __mr The any_resource passed in
//! @param __stream The stream_ref passed in
//! @param __policy The execution_policy passed in
_CCCL_HIDE_FROM_ABI env_t(__resource __mr,
will we ever want to build an environment in device code?
That is _CCCL_HIDE_FROM_ABI, which only does the hidden thing. _LIBCUDACXX_HIDE_FROM_ABI is the one with host device in it.
i know. that's why i'm asking. so? will we ever want to build an environment in device code?
Right now I would say no, but we can revise that later
return (this->__static_vtable->__equal_fn == __rhs.__static_vtable->__equal_fn)
    && this->__static_vtable->__equal_fn(this->_Get_object(), __rhs._Get_object());
}
why is this extra overload needed?
I believe that was a fallout from the rework of how we determine whether two any_resources are compatible. In the tests I could not compare two identical any_resources. I thought that should have been covered here but 🤷
i'd rather we got to the bottom of the problem.
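For reference, the comparison under discussion follows a common type-erasure pattern: two erased objects are equal only if they share the same comparison function (so the same concrete type) and that function reports the underlying objects as equal. The sketch below is a simplified illustration with hypothetical names (vtable, any_ref, pool_a, pool_b); it is not the any_resource implementation and does not diagnose the problem raised above.

```cpp
// Hypothetical, minimal type-erased reference with an equality vtable.
struct vtable {
  bool (*equal_fn)(const void*, const void*);
};

// One comparison function per concrete type T.
template <class T>
bool equal_impl(const void* lhs, const void* rhs) {
  return *static_cast<const T*>(lhs) == *static_cast<const T*>(rhs);
}

// One static vtable per concrete type T.
template <class T>
inline constexpr vtable vtable_for{&equal_impl<T>};

class any_ref {
public:
  template <class T>
  explicit any_ref(const T& obj) : object_(&obj), vtable_(&vtable_for<T>) {}

  friend bool operator==(const any_ref& lhs, const any_ref& rhs) {
    // Same comparison function means same erased type; only then is it
    // safe to hand both pointers to that function.
    return lhs.vtable_->equal_fn == rhs.vtable_->equal_fn
        && lhs.vtable_->equal_fn(lhs.object_, rhs.object_);
  }

  friend bool operator!=(const any_ref& lhs, const any_ref& rhs) {
    return !(lhs == rhs);
  }

private:
  const void* object_;
  const vtable* vtable_;
};

// Two unrelated hypothetical resource types.
struct pool_a {
  int id;
  friend bool operator==(const pool_a& l, const pool_a& r) { return l.id == r.id; }
};

struct pool_b {
  double capacity;
  friend bool operator==(const pool_b& l, const pool_b& r) { return l.capacity == r.capacity; }
};
```

The short-circuit on the function pointers is what keeps the second call well-defined: without it, a pool_b could be reinterpreted as a pool_a.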
We need something that encapsulates the information we need for containers and algorithms, so start with a proper environment
We need to pass around a ton of information to efficiently use a container in a heterogeneous setting.
This adds more sender-like queries, get_memory_resource_t and get_execution_policy_t, which we can then use to define a proper environment.