Support for architecture-accelerated features #2102

Open
maleadt opened this issue Sep 28, 2023 · 0 comments
Labels
cuda kernels Stuff about writing CUDA kernels. enhancement New feature or request

maleadt commented Sep 28, 2023

i.e. sm_90a

In general, PTX code generated for one target architecture can be run on future architectures (i.e., it is forward compatible). However, CUDA 12.0 introduced the concept of "architecture-accelerated features" whose PTX does not have forward compatibility guarantees. Several Hopper PTX instructions fall under this category of architecture-accelerated features, and thus require an sm_90a target architecture (note the "a" appended). For more details on this and other architecture-accelerated instructions, please refer to the CUDA Documentation.
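
To make the compatibility rule above concrete, here is a minimal sketch in Python. It is only an illustration of the rule as described: the helper names `parse_target` and `ptx_can_run` are hypothetical and are not part of CUDA.jl or the CUDA toolkit, and the encoding of compute capability as an integer (90 for 9.0) is an assumption made for brevity.

```python
import re

def parse_target(target):
    """Split a target name such as 'sm_90a' into (capability, is_arch_specific).

    Hypothetical helper: accepts 'sm_NN' / 'compute_NN' with an optional
    trailing 'a' marking an architecture-accelerated target.
    """
    m = re.fullmatch(r"(?:sm|compute)_(\d+)(a?)", target)
    if m is None:
        raise ValueError(f"unrecognized target: {target!r}")
    return int(m.group(1)), m.group(2) == "a"

def ptx_can_run(target, device_cc):
    """Return True if PTX built for `target` may run on a device whose
    compute capability is `device_cc` (assumed encoding: 90 means 9.0)."""
    cc, arch_specific = parse_target(target)
    if arch_specific:
        # 'a' targets carry no forward-compatibility guarantee:
        # the device must match the target exactly.
        return device_cc == cc
    # Regular targets are forward compatible with newer devices.
    return device_cc >= cc
```

Under these assumptions, `ptx_can_run("sm_80", 90)` holds because regular PTX is forward compatible, while `ptx_can_run("sm_90a", 100)` does not, since an `a` target is tied to its exact architecture.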

@maleadt maleadt added enhancement New feature or request cuda kernels Stuff about writing CUDA kernels. labels Sep 28, 2023