Update RoCM substitutions for tinygrad #367501

Merged
merged 2 commits into from
Dec 23, 2024

Conversation

deftdawg

AMD replaced their hardcoded /opt/rocm/lib paths with ROCM_PATH, '/opt/rocm/' variables, so the old substituteInPlace patterns no longer matched the strings and the substitutions were silently skipped, resulting in the libraries not being found at runtime.

When I changed this I had to disable running the tests (not part of this PR) because they crash the machine I'm using... (yay AMD!); to do that, set doCheck = !rocmSupport;.
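
The failure mode described above (a literal replacement whose pattern quietly stops matching) can be reproduced outside of Nix. The following is a minimal sketch in plain bash, not the code from this PR: `replace_or_fail`, `example.py`, `libexample.so` and `/example/store/rocm/` are hypothetical names chosen for illustration. For the real thing, nixpkgs' `substituteInPlace` has a `--replace-fail` flag that aborts the build when a pattern is missing.

```shell
#!/usr/bin/env bash
# replace_or_fail FILE OLD NEW (hypothetical helper, for illustration only)
# Replace every literal occurrence of OLD with NEW in FILE.
# Returns 1 and leaves FILE untouched when OLD does not occur at all.
replace_or_fail() {
  local file="$1" old="$2" new="$3" content
  if ! grep -qF -- "$old" "$file"; then
    echo "pattern not found in $file: $old" >&2
    return 1
  fi
  content="$(cat "$file")"
  # Quoted pattern and replacement are treated as literal strings.
  content=${content//"$old"/"$new"}
  printf '%s\n' "$content" > "$file"
}

demo="$(mktemp -d)"

# Hypothetical stand-in for upstream code after the change: the path is
# assembled from ROCM_PATH with a default of "/opt/rocm/".
cat > "$demo/example.py" <<'EOF'
ROCM_PATH = os.getenv("ROCM_PATH", "/opt/rocm/")
LIB = ROCM_PATH + "lib/libexample.so"
EOF

# Old-style pattern: "/opt/rocm/lib" no longer appears as one literal string.
old_status=0
replace_or_fail "$demo/example.py" "/opt/rocm/lib" "/example/store/rocm/lib" || old_status=$?

# New-style pattern: replace the default value itself.
new_status=0
replace_or_fail "$demo/example.py" '"/opt/rocm/"' '"/example/store/rocm/"' || new_status=$?

patched="$(cat "$demo/example.py")"
rm -rf "$demo"

echo "old pattern status: $old_status, new pattern status: $new_status"
```

The point of failing loudly is that a stale pattern shows up at build time rather than as a missing library at runtime.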

FYI @GaetanLepage

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 25.05 Release Notes (or backporting 24.11 and 25.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.


@github-actions github-actions bot added 6.topic: python 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 0 This PR does not cause any packages to rebuild on Linux labels Dec 23, 2024

@GaetanLepage GaetanLepage left a comment


LGTM thanks !

@GaetanLepage

cc @SomeoneSerge @wozeparrot

@wozeparrot

There might be a patch needed for the /opt/rocm here: https://github.com/tinygrad/tinygrad/blob/a556adf0280c66e9baceb43cf280e3bfad30056e/tinygrad/runtime/support/compiler_hip.py#L43

At least I have this patched in tinygrad-nix.

But otherwise, LGTM.
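
One way to spot hardcoded references like the one mentioned here is to list every `/opt/rocm` occurrence in the Python sources while skipping lines that are only comments. This is a rough sketch under assumptions: `list_live_refs` and the demo file are hypothetical, and the comment filter only recognises lines whose first non-blank character is `#`.

```shell
#!/usr/bin/env bash
# list_live_refs DIR NEEDLE (hypothetical helper, for illustration only)
# Print file:line:text for each literal NEEDLE match in *.py files under DIR,
# skipping lines whose first non-blank character is '#'.
list_live_refs() {
  local dir="$1" needle="$2"
  grep -rnF --include='*.py' -- "$needle" "$dir" 2>/dev/null |
    grep -vE '^[^:]+:[0-9]+:[[:space:]]*#' || true
}

demo="$(mktemp -d)"
# Demo input: one commented reference and one live reference.
cat > "$demo/a.py" <<'EOF'
# the default used to be /opt/rocm/lib
PATH = "/opt/rocm/lib"
EOF

live="$(list_live_refs "$demo" "/opt/rocm")"
echo "$live"
rm -rf "$demo"
```

Run against an unpacked source tree, any output points at a line that may still need a substitution.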

@wozeparrot's fixes for compiler_hip.py
@deftdawg

> At least I have this patched in tinygrad-nix.

Thanks @wozeparrot, I added patches for compiler_hip.py too. I was able to get through testing (I ran it over SSH, maybe that's why).

I cd'd into the nix/store/tinygrad directory and ran some checks:

# find any remaining /opt/rocm references; there are still a couple, but they're comments
rg "/opt/rocm" *
# check that every substituted /nix/store path actually points at an existing file
for f in $(grep -roE "/nix/store/[^'\" ]*" * 2>/dev/null | cut -d: -f2); do [ ! -e "${f}" ] && echo "${f} is not found"; done
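
The existence check above can also be written as a small reusable function. This is a sketch, not what was run for this PR: `list_missing_paths` is a hypothetical name, and the store prefix is a parameter so the demo works on machines without `/nix/store`. It reads matches line by line instead of word-splitting them, and deduplicates before testing.

```shell
#!/usr/bin/env bash
# list_missing_paths DIR PREFIX (hypothetical helper, for illustration only)
# Print every path under PREFIX that is mentioned in files below DIR
# but does not exist on disk.
list_missing_paths() {
  local dir="$1" prefix="$2" path
  grep -rhoE -- "${prefix}/[^'\" ]*" "$dir" 2>/dev/null | sort -u |
    while IFS= read -r path; do
      [ -e "$path" ] || printf '%s\n' "$path"
    done
}

demo="$(mktemp -d)"
mkdir -p "$demo/store/real-lib" "$demo/src"
# One reference to an existing path, one to a path that was never created.
printf 'LIB = "%s"\n' "$demo/store/real-lib" > "$demo/src/ok.py"
printf 'LIB = "%s"\n' "$demo/store/gone-lib" > "$demo/src/bad.py"

missing="$(list_missing_paths "$demo/src" "$demo/store")"
echo "$missing"
rm -rf "$demo"
```

Inside the installed package directory the call would be `list_missing_paths . /nix/store`. Matches can include trailing punctuation such as `)` or `,`, so a reported path is worth a second look before treating it as a real problem.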

@GaetanLepage

nixpkgs-review result

Generated using nixpkgs-review.

Command: nixpkgs-review pr 367501

@GaetanLepage GaetanLepage merged commit ef27bba into NixOS:master Dec 23, 2024
23 of 24 checks passed
@deftdawg deftdawg deleted the tinygrad-rocm-fix branch December 23, 2024 22:16