cuda.host_empty function #67
base: devel
This PR aims to make the C code compilable using nvcc. The cuda language was added as well as a CudaCodePrinter.

Changes to stdlib: Wrapped expressions using complex types in an `ifndef __NVCC__` to avoid processing them with the nvcc compiler

---------

Co-authored-by: Mouad Elalj, EmilyBourne
This pull request fixes #48 by implementing a small wrapper for CUDA and by wrapping only the non-CUDA functionality in `extern "C"`.

**Commit Summary**
- Implemented a new header printer for CUDA.
- Added CUDA wrapper assignment.
- Instead of wrapping all local headers, wrap only C functions with `extern "C"`.

---------

Co-authored-by: EmilyBourne <[email protected]>
Co-authored-by: bauom <[email protected]>
This pull request addresses issue #28 by implementing a new feature in Pyccel that allows users to define custom GPU kernels. The syntax for creating these kernels is inspired by Numba. Issue #45 also needs to be fixed for testing purposes.

**Commit Summary**
- Introduced the KernelCall class.
- Added the cuda printer methods `_print_KernelCall` and `_print_FunctionDef` to generate the corresponding CUDA representation for both kernel calls and definitions.
- Added IndexedFunctionCall, which represents an indexed function call.
- Added the CUDA module and `cuda.synchronize()`.
- Fixed a bug found in the header: it did not import the necessary header for the used function.

---------

Co-authored-by: EmilyBourne <[email protected]>
Co-authored-by: bauom <[email protected]>
Co-authored-by: Emily Bourne <[email protected]>
…nctions, and refining CUDA type handling
This pull request addresses issue #59 by adding more CUDA-specific keywords to enhance the checking of variable/function names and prevent name clashes.

---------

Co-authored-by: EmilyBourne <[email protected]>
Co-authored-by: bauom <[email protected]>
This pull request addresses issue #41 by implementing a new feature in Pyccel that allows users to define a custom device.

**Commit Summary**
- Adding handler for custom device and its code generation.
- Adding test.

---------

Co-authored-by: EmilyBourne <[email protected]>
/bot run docs
Good job! Your PR is using all the code it added/changed.
@jalalium, @smazouz42 has been working hard and thinks that they have now replied to or fixed all your comments. Could you take another look at the PR and see if you can approve now?
Very good job!
Hey @yguclu, @EmilyBourne, this PR is looking pretty good. @smazouz42 and @jalalium think it is ready to merge. Could you add your expertise to confirm that this follows all the coding conventions and fits in Pyccel's future plans? Thanks 😄

### cuda+host_empty

The cuda+host_empty function allocates an empty array on the host.
Is `cuda+host_empty` meant to be `cuda.host_empty`?
'IntegerClass',
'FloatClass',
These classes now appear twice in this list
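A duplication like this can be caught with a small check over the list. The sketch below is illustrative only: `exported_names` is a hypothetical stand-in for the list under review, and `find_duplicates` is not part of Pyccel.

```python
from collections import Counter


def find_duplicates(names):
    """Return the names that occur more than once, in first-seen order."""
    counts = Counter(names)
    return [name for name in counts if counts[name] > 1]


# Hypothetical stand-in for the list under review.
exported_names = ['IntegerClass', 'FloatClass', 'NumpyArrayClass',
                  'IntegerClass', 'FloatClass']

print(find_duplicates(exported_names))
```

A check of this kind could sit in a unit test or a lint step, so that repeated entries are flagged before review.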
elif isinstance(class_type, (NumpyNumericType, NumpyNDArrayType)):
    return NumpyArrayClass
# elif isinstance(class_type, StackArrayType):
# elif isinstance(class_type, StackArrayType):
def __init__(self, *args ,class_type, init_dtype, memory_location):
    self._class_type = class_type
    self._init_dtype = init_dtype
    self._memory_location = memory_location
Is the memory location not inside the class type already?
-def __init__(self, *args ,class_type, init_dtype, memory_location):
-    self._class_type = class_type
-    self._init_dtype = init_dtype
-    self._memory_location = memory_location
+def __init__(self, *args, class_type, init_dtype, memory_location):
+    self._class_type = class_type
+    self._init_dtype = init_dtype
+    self._memory_location = memory_location
'full' : PyccelFunctionDef('full' , CudaFull),
'host_empty' : PyccelFunctionDef('host_empty' , CudaHostEmpty),
-'full' : PyccelFunctionDef('full' , CudaFull),
-'host_empty' : PyccelFunctionDef('host_empty' , CudaHostEmpty),
+'full' : PyccelFunctionDef('full' , CudaFull),
+'host_empty' : PyccelFunctionDef('host_empty' , CudaHostEmpty),
if isinstance(rhs.class_type, CudaArrayType):
    if(isinstance(rhs, (CudaFull))):
        # TODO add support for CudaFull
        return " \n"
Is it not safer to not include this code so that the neat error is raised instead of just printing nothing?
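To illustrate the concern: a printer branch that returns a blank string hides the missing feature, whereas raising an error makes it visible at translation time. The sketch below uses hypothetical stand-ins (`FakeCudaFull`, `FakeCudaHostEmpty`, `print_allocation`) and a plain `NotImplementedError`; it does not reproduce Pyccel's own error-reporting mechanism.

```python
class FakeCudaFull:
    """Hypothetical stand-in for an AST node the printer cannot handle yet."""


class FakeCudaHostEmpty:
    """Hypothetical stand-in for an AST node the printer supports."""


def print_allocation(rhs):
    """Return code for a supported node; fail loudly for anything else."""
    if isinstance(rhs, FakeCudaHostEmpty):
        return "/* host allocation code would be generated here */\n"
    # No silent fallback: an unsupported node is reported to the user.
    raise NotImplementedError(
        f"Printing {type(rhs).__name__} is not supported yet")
```

With this shape, calling `print_allocation(FakeCudaFull())` raises immediately instead of emitting an empty statement into the generated file.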
accelerators=('python',))),
"numpy_f90" : ("numpy", CompileObj("numpy_f90.f90",folder="numpy")),
"numpy_c" : ("numpy", CompileObj("numpy_c.c",folder="numpy")),
"Set_extensions" : ("STC_Extensions", CompileObj("Set_Extensions.h",
Why this change? Bad merge?
array
    The empty array on the host.
"""
import numpy as np
Usually in Python best practice is to place all imports at the top of the file. Is there a reason you don't do that here?
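For reference, the conventional layout would look roughly like the sketch below, with the import at module level. The `shape` and `dtype` parameters are assumptions for illustration; the actual signature in the PR is not shown in this excerpt.

```python
import numpy as np  # imported once, at the top of the module


def host_empty(shape, dtype=float):
    """Pure-Python stand-in: return an uninitialised NumPy array on the host."""
    return np.empty(shape, dtype=dtype)


a = host_empty((2, 3))
print(a.shape)
```

A function-level import is usually reserved for optional or expensive dependencies, or for breaking a circular import, so it is worth stating the reason when one is used.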
    return (1);
}

__host__ __device__
int32_t cuda_free(t_ndarray arr)
{
    if (arr.shape == NULL)
        return (0);
    cudaFree(arr.raw_data);
    arr.raw_data = NULL;
    cudaFree(arr.shape);
    arr.shape = NULL;
    cudaFree(arr.strides);
    arr.strides = NULL;
    return (0);
What does the 1/0 returned for host/device freeing represent?
@@ -58,7 +58,8 @@ include = [
 "pyccel/stdlib/**/*.c",
 "pyccel/stdlib/**/*.f90",
 "pyccel/extensions/STC/include",
-"pyccel/extensions/gFTL/include/v2"
+"pyccel/extensions/gFTL/include/v2",
+"pyccel/stdlib/cuda_ndarrays/cuda_ndarrays.cu"
Please group this with the other stdlib files. I think we can safely include all cuda files found inside stdlib.
81b9970 to 5f7e3e2
8bce3e1 to 26890c6
This pull request addresses issue #56 by adding a new `cuda` function, `host_empty`, which allocates memory on the host (CPU).