Add rocprofiler-sdk support #1050

Open
wants to merge 12 commits into
base: main
Conversation

@mwootton
Contributor

mwootton commented Feb 27, 2025

#1049

Convert to rocprofiler-sdk instead of roctracer for collecting HIP API calls and AMD GPU activity.

Reuses most of the existing roctracer infrastructure with a name-for-name replacement. Simultaneous support for both roctracer and rocprofiler-sdk was deemed impractical. It would require a whole new set of #ifdefs, a major refactor of the roctracer code, and additional build support. Even then, only one could be active at a time (and you wouldn't want both active).
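
To illustrate the compile-time switching mentioned above, here is a minimal sketch of how a single preprocessor flag might pick one backend per build. The macro name KINETO_USE_ROCPROFILER_SDK and the function activeRocBackend are hypothetical and are not taken from this PR; the sketch only shows why such a guard yields exactly one active backend.

```cpp
#include <string>

// Hypothetical build flag (an assumption for this sketch, not a real kineto macro).
// A build system would define it when rocprofiler-sdk is available.
#ifdef KINETO_USE_ROCPROFILER_SDK
constexpr const char* kRocBackend = "rocprofiler-sdk";
#else
constexpr const char* kRocBackend = "roctracer";
#endif

// Reports which backend this translation unit was compiled against.
// Only one branch of the #ifdef survives preprocessing, so only one
// backend can be active in a given build.
std::string activeRocBackend() {
  return kRocBackend;
}
```

Supporting both backends would mean repeating a guard like this around every call site that differs, which is the refactoring cost the description refers to.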

In homage to the abandoned refactor, RocLogger.cpp/h were created to contain the rocprofbase classes and the api filter.

Roctracer has no established end date.
Rocprofiler-sdk is available from rocm_3.1 forward.

This will create a dependency where (newest kineto on old rocm) and (old kineto on newest rocm) could fail to build with AMD gpu support. That window is already over 1 year wide.

@mwootton
Contributor Author

Could someone remind me of the preferred method to run clang-tidy (or equivalent) on kineto code?
Thanks

@sraikund16
Contributor

> Could someone remind me of the preferred method to run clang-tidy (or equivalent) on kineto code? Thanks

Our internal CI runs it automatically; I can push whatever changes are needed to the PR.

@facebook-github-bot
Contributor

@sraikund16 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@mwootton
Contributor Author

mwootton commented Feb 27, 2025

> Our internal CI runs it automatically; I can push whatever changes are needed to the PR.

Full service, lovely.

The roctracer code has evolved quite a bit since my initial commit. I tried to swap this in with as few changes as possible. The specific implementation in RocprofLogger.cpp is very new and very different. There are a few comments in there, but overall rocprofiler-sdk is not super intuitive. Let me know what would help you or what you would like to see improved.

@facebook-github-bot
Contributor

@mwootton has updated the pull request. You must reimport the pull request before landing.

@sraikund16
Contributor

Quick update: we are still working on getting our internal builds to work with this. We plan to have it in within the next couple of weeks.

@sraikund16
Contributor

sraikund16 commented Mar 12, 2025

@mwootton I tried running a profile with this and got a segfault on this line: https://github.com/mwootton/kineto/blob/rocprof/libkineto/src/RocprofLogger.cpp#L731

It looks like s never gets set (it prints as nullptr).

I think this is because we need to call this function first, but we never do:
https://github.com/mwootton/kineto/blob/rocprof/libkineto/src/RocprofLogger.cpp#L303

Do you mind fixing this up?

@mwootton
Contributor Author

> I think it is because we need to call this function first but we never do: https://github.com/mwootton/kineto/blob/rocprof/libkineto/src/RocprofLogger.cpp#L303

Indeed, that function is supposed to be called when HIP loads, so it is 'automatic' from our side, triggered by that symbol name.

I'll investigate a bit. It works for me as is. However, it was unclear to me if I should also link 'librocprofiler-register.so'.

Either way, this is a failure path that will happen on earlier versions of rocm that don't have rocprofiler-sdk support. So I'll guard against it.
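
A guard of the kind mentioned above could look roughly like the following. This is a minimal sketch under stated assumptions, not the code in RocprofLogger.cpp: ToolState, g_state, toolInit, and startLogging are hypothetical names standing in for the state pointer that was observed as nullptr and the init function that is expected to run when HIP loads.

```cpp
#include <cstdio>

// Hypothetical stand-in for the state that the load-time init function populates.
struct ToolState {
  bool tracing = false;
};

// Stays nullptr if the load-time init function never runs
// (for example, on a ROCm version without rocprofiler-sdk support).
static ToolState* g_state = nullptr;

// Hypothetical init hook; in the real flow it would be invoked when HIP loads.
void toolInit() {
  static ToolState state;
  g_state = &state;
}

// Checks the pointer before use and reports failure instead of dereferencing nullptr.
bool startLogging() {
  if (g_state == nullptr) {
    std::fprintf(stderr, "rocprofiler-sdk not initialized; GPU tracing disabled\n");
    return false;
  }
  g_state->tracing = true;
  return true;
}
```

With this shape, calling startLogging() before toolInit() returns false rather than crashing, and the caller can decide whether to continue without GPU tracing.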

Do you happen to know what rocm version you were using when you saw the failure?
Thanks.

@sraikund16
Contributor

sraikund16 commented Mar 21, 2025

@mwootton The ROCm version is 6.2.0, but I forgot to mention that this is failing specifically in our internal tests. In these tests, we link only libamdhip64.so and librocprofiler-sdk.so; the test builds but then hits the segfault. Do we need to add librocprofiler-register.so to the build of our tests? Will that function run automatically if we link it?

@facebook-github-bot
Contributor

@mwootton has updated the pull request. You must reimport the pull request before landing.
