Add a concurrent kwarg to profiling macros. #2201

maleadt (Member) commented Dec 12, 2023

This allows switching between concurrent and serial profiling modes. The former has less of an impact on an application's performance characteristics, but requires kernel instrumentation. The latter doesn't, and generally has less overhead.

cc @thomasfaingnaert: concurrent=false is a kwarg that should be accepted by all profiling macros. I haven't seen significant improvements from using it, but according to NVIDIA this may depend on the type of kernel:

serial: For applications which use only a single CUDA stream and therefore cannot have concurrent kernel execution, this mode can be useful as it usually (not always) incurs less profiling overhead compared to the concurrent kernel mode.

concurrent: Due to the code instrumentation, concurrent kernel mode can add significant runtime overhead if used on kernels that execute a large number of blocks and that have short execution durations.
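For illustration, switching between the two modes could look roughly like the sketch below. It assumes a CUDA.jl version that includes this change and a functional GPU; the array `a` and the broadcasted `sin` kernel are placeholders chosen for the example and are not taken from this PR.

```julia
using CUDA

# Placeholder workload: a small array and a simple broadcast kernel.
a = CUDA.rand(Float32, 1024, 1024)

# Profile without the kwarg, i.e. with whatever mode the macro defaults to.
CUDA.@profile sin.(a)

# Explicitly request serial profiling, which avoids kernel instrumentation.
CUDA.@profile concurrent=false sin.(a)
```

Comparing the two reports on a single-stream workload like this one is a simple way to check whether serial mode reduces profiling overhead for a given kernel.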


codecov bot commented Dec 12, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (8a0d39a) 72.93% compared to head (4e34ea6) 72.94%.
Report is 1 commit behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2201      +/-   ##
==========================================
+ Coverage   72.93%   72.94%   +0.01%     
==========================================
  Files         159      159              
  Lines       14651    14643       -8     
==========================================
- Hits        10685    10682       -3     
+ Misses       3966     3961       -5     


@maleadt maleadt merged commit 54ed7f7 into master Dec 12, 2023
@maleadt maleadt deleted the tb/serial_profiling branch December 12, 2023 17:12