Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cuda.host_empty function #67

Open
wants to merge 135 commits into
base: devel
Choose a base branch
from
Open

cuda.host_empty function #67

wants to merge 135 commits into from

Conversation

smazouz42
Copy link

This pull request addresses issue #56 by adding a new feature to 'cuda' host_empty that allows you to allocate memory on the CPU

EmilyBourne and others added 30 commits June 27, 2024 08:10
This PR aims to make the C code compilable using nvcc. The cuda language was added as well as a CudaCodePrinter.

Changes to stdlib:

Wrapped expressions using complex types in an `ifndef __NVCC__` to avoid processing them with the nvcc compiler

---------

Co-authored-by: Mouad Elalj, EmilyBourne
This pull request fixes #48, by implementing a tiny wrapper for CUDA and a wrapper for non-CUDA functionalities only with external 'C'.

**Commit Summary**

-    Implemented new header printer for CUDA.
-    Added CUDA wrapper assignment
-    Instead of wrapping all local headers, wrap only C functions with extern 'C'

---------

Co-authored-by: EmilyBourne <[email protected]>
Co-authored-by: bauom <[email protected]>
This pull request addresses issue #28 by implementing a new feature in
Pyccel that allows users to define custom GPU kernels. The syntax for
creating these kernels is inspired by Numba. and I also need to fix
issue #45 for testing purposes

**Commit Summary**

- Introduced KernelCall class
- Added cuda printer methods _print_KernelCall and _print_FunctionDef to
generate the corresponding CUDA representation for both kernel calls and
definitions
- Added IndexedFunctionCall  represents an indexed function call
- Added CUDA module and cuda.synchronize()
- Fixing a bug that I found in the header: it does not import the
necessary header for the used function

---------

Co-authored-by: EmilyBourne <[email protected]>
Co-authored-by: bauom <[email protected]>
Co-authored-by: Emily Bourne <[email protected]>
This PR aims to make the C code compilable using nvcc. The cuda language was added as well as a CudaCodePrinter.

Changes to stdlib:

Wrapped expressions using complex types in an `ifndef __NVCC__` to avoid processing them with the nvcc compiler

---------

Co-authored-by: Mouad Elalj, EmilyBourne
This pull request fixes #48, by implementing a tiny wrapper for CUDA and a wrapper for non-CUDA functionalities only with external 'C'.

**Commit Summary**

-    Implemented new header printer for CUDA.
-    Added CUDA wrapper assignment
-    Instead of wrapping all local headers, wrap only C functions with extern 'C'

---------

Co-authored-by: EmilyBourne <[email protected]>
Co-authored-by: bauom <[email protected]>
This pull request addresses issue #28 by implementing a new feature in
Pyccel that allows users to define custom GPU kernels. The syntax for
creating these kernels is inspired by Numba. and I also need to fix
issue #45 for testing purposes

**Commit Summary**

- Introduced KernelCall class
- Added cuda printer methods _print_KernelCall and _print_FunctionDef to
generate the corresponding CUDA representation for both kernel calls and
definitions
- Added IndexedFunctionCall  represents an indexed function call
- Added CUDA module and cuda.synchronize()
- Fixing a bug that I found in the header: it does not import the
necessary header for the used function

---------

Co-authored-by: EmilyBourne <[email protected]>
Co-authored-by: bauom <[email protected]>
Co-authored-by: Emily Bourne <[email protected]>
This pull request addresses issue #59 by adding more CUDA-specific
keywords to enhance the checking of variable/function names and prevent
name clashes

---------

Co-authored-by: EmilyBourne <[email protected]>
Co-authored-by: bauom <[email protected]>
This pull request addresses issue
#41 by implementing a new
feature in Pyccel that allows users to define a custom device

**Commit Summary**

- Adding handler for custom device and its code generation.
- Adding test

---------

Co-authored-by: EmilyBourne <[email protected]>
@smazouz42
Copy link
Author

/bot run docs

@smazouz42 smazouz42 marked this pull request as ready for review July 25, 2024 09:59
Copy link

@pyccel-bot pyccel-bot bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job ! Your PR is using all the code it added/changed.

@pyccel-bot
Copy link

pyccel-bot bot commented Jul 25, 2024

@jalalium, @smazouz42 has been working hard and thinks that they have now replied to or fixed all your comments. Could you take another look at the PR and see if you can approve now?

EmilyBourne and others added 8 commits July 26, 2024 14:08
This PR aims to make the C code compilable using nvcc. The cuda language was added as well as a CudaCodePrinter.

Changes to stdlib:

Wrapped expressions using complex types in an `ifndef __NVCC__` to avoid processing them with the nvcc compiler

---------

Co-authored-by: Mouad Elalj, EmilyBourne
This pull request fixes #48, by implementing a tiny wrapper for CUDA and a wrapper for non-CUDA functionalities only with external 'C'.

**Commit Summary**

-    Implemented new header printer for CUDA.
-    Added CUDA wrapper assignment
-    Instead of wrapping all local headers, wrap only C functions with extern 'C'

---------

Co-authored-by: EmilyBourne <[email protected]>
Co-authored-by: bauom <[email protected]>
This pull request addresses issue #28 by implementing a new feature in
Pyccel that allows users to define custom GPU kernels. The syntax for
creating these kernels is inspired by Numba. and I also need to fix
issue #45 for testing purposes

**Commit Summary**

- Introduced KernelCall class
- Added cuda printer methods _print_KernelCall and _print_FunctionDef to
generate the corresponding CUDA representation for both kernel calls and
definitions
- Added IndexedFunctionCall  represents an indexed function call
- Added CUDA module and cuda.synchronize()
- Fixing a bug that I found in the header: it does not import the
necessary header for the used function

---------

Co-authored-by: EmilyBourne <[email protected]>
Co-authored-by: bauom <[email protected]>
Co-authored-by: Emily Bourne <[email protected]>
This pull request addresses issue #59 by adding more CUDA-specific
keywords to enhance the checking of variable/function names and prevent
name clashes

---------

Co-authored-by: EmilyBourne <[email protected]>
Co-authored-by: bauom <[email protected]>
This pull request addresses issue
#41 by implementing a new
feature in Pyccel that allows users to define a custom device

**Commit Summary**

- Adding handler for custom device and its code generation.
- Adding test

---------

Co-authored-by: EmilyBourne <[email protected]>
Copy link
Member

@jalalium jalalium left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good job!

@pyccel-bot pyccel-bot bot added Ready_for_review Received at least one approval. Requires review from senior developer and removed needs_initial_review labels Jul 26, 2024
@pyccel-bot
Copy link

pyccel-bot bot commented Jul 26, 2024

Hey @yguclu, @EmilyBourne, this PR is looking pretty good. @smazouz42 and @jalalium think it is ready to merge. Could you add your expertise to confirm that this follows all the coding conventions and fits in Pyccel's future plans? Thanks 😄

@pyccel-bot pyccel-bot bot requested review from EmilyBourne and yguclu July 26, 2024 15:51

### cuda+host_empty

The cuda+host_empty function allocates an empty array on the host.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is cuda+host_empty cuda.host_empty?

Comment on lines +29 to +30
'IntegerClass',
'FloatClass',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These classes now appear twice in this list

elif isinstance(class_type, (NumpyNumericType, NumpyNDArrayType)):
return NumpyArrayClass
# elif isinstance(class_type, StackArrayType):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# elif isinstance(class_type, StackArrayType):

Comment on lines +52 to +55
def __init__(self, *args ,class_type, init_dtype, memory_location):
self._class_type = class_type
self._init_dtype = init_dtype
self._memory_location = memory_location
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the memory location not inside the class type already?

Suggested change
def __init__(self, *args ,class_type, init_dtype, memory_location):
self._class_type = class_type
self._init_dtype = init_dtype
self._memory_location = memory_location
def __init__(self, *args, class_type, init_dtype, memory_location):
self._class_type = class_type
self._init_dtype = init_dtype
self._memory_location = memory_location

Comment on lines +145 to +146
'full' : PyccelFunctionDef('full' , CudaFull),
'host_empty' : PyccelFunctionDef('host_empty' , CudaHostEmpty),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
'full' : PyccelFunctionDef('full' , CudaFull),
'host_empty' : PyccelFunctionDef('host_empty' , CudaHostEmpty),
'full' : PyccelFunctionDef('full' , CudaFull),
'host_empty' : PyccelFunctionDef('host_empty' , CudaHostEmpty),

Comment on lines +174 to +177
if isinstance(rhs.class_type, CudaArrayType):
if(isinstance(rhs, (CudaFull))):
# TODO add support for CudaFull
return " \n"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it not safer to not include this code so that the neat error is raised instead of just printing nothing?

accelerators=('python',))),
"numpy_f90" : ("numpy", CompileObj("numpy_f90.f90",folder="numpy")),
"numpy_c" : ("numpy", CompileObj("numpy_c.c",folder="numpy")),
"Set_extensions" : ("STC_Extensions", CompileObj("Set_Extensions.h",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change? Bad merge?

array
The empty array on the host.
"""
import numpy as np
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually in Python best practice is to place all imports at the top of the file. Is there a reason you don't do that here?

Comment on lines +75 to +89
return (1);
}

__host__ __device__
int32_t cuda_free(t_ndarray arr)
{
if (arr.shape == NULL)
return (0);
cudaFree(arr.raw_data);
arr.raw_data = NULL;
cudaFree(arr.shape);
arr.shape = NULL;
cudaFree(arr.strides);
arr.strides = NULL;
return (0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does the 1/0 returned for host/device freeing represent?

@@ -58,7 +58,8 @@ include = [
"pyccel/stdlib/**/*.c",
"pyccel/stdlib/**/*.f90",
"pyccel/extensions/STC/include",
"pyccel/extensions/gFTL/include/v2"
"pyccel/extensions/gFTL/include/v2",
"pyccel/stdlib/cuda_ndarrays/cuda_ndarrays.cu"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please group this with the other stdlib files. I think we can safely include all cuda files found inside stdlib.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Ready_for_review Received at least one approval. Requires review from senior developer
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add Python code to spoof a host_empty function
4 participants