[llama decomposed]: Parameter files missing for different decomposed models #472

Open
pdhirajkumarprasad opened this issue Nov 11, 2024 · 4 comments

@pdhirajkumarprasad

The nightly run includes several decomposed-model tests, such as

testBenchmark8B_fp8_Decomposed
testBenchmark70B_fp8_TP8_Decomposed
testBenchmark405B_fp8_TP8_Decomposed

which are failing because their parameter files are missing.

Error:

 File "/home/sai/actions-runner-llama/_work/SHARK-Platform/SHARK-Platform/deps/iree-turbine/iree/turbine/aot/params.py", line 234, in load
E               self._index.load(
E           ValueError: Error opening parameter file: c/runtime/src/iree/base/internal/file_io.c:253: NOT_FOUND; failed to open file '/data/llama-3.1/weights/405b/f8/llama405b_fp8.irpa'
E           
E           
E           Invoked with:
E             cd /home/sai/actions-runner-llama/_work/SHARK-Platform/SHARK-Platform && python3 -m sharktank.examples.export_paged_llm_v1 --irpa-file=/data/llama-3.1/weights/405b/f8/llama405b_fp8.irpa --output-mlir=/home/sai/actions-runner-llama/_work/SHARK-Platform/SHARK-Platform/2024-11-10/llama-405b/fp8_decomposed.mlir --output-config=/home/sai/actions-runner-llama/_work/SHARK-Platform/SHARK-Platform/2024-11-10/llama-405b/fp8_decomposed.json --bs=4 --attention-kernel decomposed
@dan-garvey
Member

Can you send me the name of the machine that needs these via Slack?

@ScottTodd
Member

Those tests should still be changed to not depend on the contents of the runner file system. Users and developers should be able to run these tests on their own systems.

  • Have the test declare what files it needs, and check during setup if those files exist
  • Use environment variables for cache locations (e.g. HF_HOME, or SHARK_HOME), not hardcoded paths (and especially not hardcoded paths that won't work at all on Windows)
  • If the files do not exist, fail the test and print instructions for downloading the files (e.g. print a script/command to run)

If the downloads are small enough, the test could download and cache automatically. For larger models, I'd probably fail the test (or skip with a reason) and print instructions to run some setup.
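
The points above could be sketched roughly as follows. This is a minimal illustration, not the project's actual test code: the helper names (`cache_root`, `find_missing`, `missing_files_message`), the fallback cache directory, and the placeholder setup command are all assumptions. Only `SHARK_HOME` and the `.irpa` file name come from this thread.

```python
from __future__ import annotations

import os
from pathlib import Path


def cache_root(env_var: str = "SHARK_HOME") -> Path:
    """Resolve the cache directory from an environment variable.

    The fallback (~/.cache/shark) is an assumed default, not a project convention.
    """
    override = os.environ.get(env_var)
    return Path(override) if override else Path.home() / ".cache" / "shark"


def find_missing(required: list[str], root: Path) -> list[Path]:
    """Return the declared files (relative to root) that are not present."""
    return [root / rel for rel in required if not (root / rel).is_file()]


def missing_files_message(missing: list[Path], setup_hint: str) -> str:
    """Build a message listing missing files plus a command to fetch them."""
    lines = ["Missing parameter files:"]
    lines += [f"  {path}" for path in missing]
    lines.append(f"To fetch them, run: {setup_hint}")
    return "\n".join(lines)


# Files this (hypothetical) test declares up front, relative to the cache root.
REQUIRED_FILES = ["llama-3.1/weights/405b/f8/llama405b_fp8.irpa"]

if __name__ == "__main__":
    missing = find_missing(REQUIRED_FILES, cache_root())
    if missing:
        # "<setup command>" is a placeholder; the real download step is not
        # specified in this thread.
        print(missing_files_message(missing, "<setup command>"))
```

In a pytest fixture, the same message could be passed to `pytest.skip(...)` for very large models or `pytest.fail(...)` for smaller ones, so the test reports what is missing instead of surfacing a `NOT_FOUND` error from deep inside the export step.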

@dan-garvey
Member

@ScottTodd I agree. However, as someone who is grateful these tests were written at all, I went ahead and updated the file system for now. We can leave this issue open if you'd like to use it to track the desired approach.

@ScottTodd
Member

SGTM. I'll keep speaking up whenever it breaks and needs fixing though :P. We'll eventually need the tests decoupled from the machines our team directly controls if we want users or other developers to be able to run them too. (Though for 405b that's much harder than it is for smaller models.)
