Discussion: Implement "lower-level" APIs for cuda.parallel that do not accept array inputs? #3812

Open
leofang opened this issue Feb 14, 2025 · 0 comments

Comments

@leofang
Member

leofang commented Feb 14, 2025

We could consider changing the API to not accept __cuda_array_interface__ objects and instead have the user pass in the required information (pointer, size, dtype, etc.). This lets each library or user compute that information in the most efficient way possible, rather than making it our responsibility.

Let's have a separate issue to track this. Thinking about this more, we should try to make the current (low-level) interface look more like a 1:1 binding to the bare C++ one. This is what we do for cuda.cooperative too. A Pythonic interface can come later.
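
To make the idea concrete, here is a minimal sketch of what "the user passes in the required information" could look like from the caller's side. The names `RawBuffer`, `from_cuda_array_interface`, and `FakeDeviceArray` are hypothetical and are not part of cuda.parallel; the only thing taken from an existing specification is the layout of the `__cuda_array_interface__` dictionary (`shape`, `typestr`, `data`, `strides`). The sketch assumes C-contiguous inputs and runs without a GPU.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class RawBuffer:
    """Hypothetical low-level argument: plain values, no array object."""
    ptr: int      # device pointer as an integer
    size: int     # number of elements
    typestr: str  # NumPy-style type string, e.g. "<f4"


def from_cuda_array_interface(obj) -> RawBuffer:
    """Hypothetical caller-side helper: read the protocol once, up front."""
    cai = obj.__cuda_array_interface__
    # Per the protocol, strides is None (or absent) for C-contiguous data.
    if cai.get("strides") is not None:
        raise ValueError("this sketch only handles C-contiguous arrays")
    ptr, _read_only = cai["data"]  # data is a (pointer, read_only) pair
    return RawBuffer(ptr=ptr, size=prod(cai["shape"]), typestr=cai["typestr"])


class FakeDeviceArray:
    """Stand-in exposing the protocol so the sketch runs without a GPU."""

    def __init__(self, ptr, shape, typestr):
        self.__cuda_array_interface__ = {
            "shape": shape,
            "typestr": typestr,
            "data": (ptr, False),
            "strides": None,
            "version": 3,
        }


arr = FakeDeviceArray(4096, (4, 8), "<f4")
buf = from_cuda_array_interface(arr)
print(buf)
```

With this split, a library that already tracks pointer, size, and dtype internally could build the plain values directly and skip the dictionary lookup entirely, while a convenience helper like the one above could live in a later Pythonic layer.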

Originally posted by @leofang in #3718 (comment)
