
Update user docs for running llm server #676

Open
stbaione wants to merge 2 commits into `main`
Conversation

stbaione (Contributor)

Description

Did a pass through the user docs for `e2e_llama8b_mi300x.md` and made updates and fixes:

  1. Update install instructions for shark-ai
  2. Update nightly install instructions for shortfin and sharktank
  3. Update paths for model artifacts to ensure they work with llama3.1-8b-fp16-instruct
  4. Remove steps to write an edited config; no longer needed after "Make config.json consistent between shortfin and sharktank" (#487)

Added back `sentencepiece` as a requirement for `sharktank`. Without it, `export_paged_llm_v1` broke when installing the nightly:

```
ModuleNotFoundError: No module named 'sentencepiece'
```

This was masked when building from source, because shortfin includes `sentencepiece` in `requirements-tests.txt`.
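As an aside (a sketch, not part of the PR): the failure mode above can be made to surface early with a fail-fast dependency check. `require` below is a hypothetical helper, not sharktank or shortfin API.

```python
import importlib.util

# Hypothetical helper (not sharktank code): fail fast with an actionable
# message when an optional dependency such as sentencepiece is absent.
def require(module: str) -> None:
    if importlib.util.find_spec(module) is None:
        raise ModuleNotFoundError(
            f"No module named {module!r}; try: pip install {module}"
        )

require("json")  # stdlib module, so this passes silently
```

Calling `require("sentencepiece")` at import time in an export script would turn a deep stack trace into a one-line install hint.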

Add back `sentencepiece` as requirement for `sharktank` to enable `export_paged_llm_v1`
@stbaione stbaione requested a review from ScottTodd December 11, 2024 16:33
Comment on lines +27 to +28
You can install either the `latest stable` version of shortfin by installing
`shark-ai` or the `nightly` version directly:
Member:

You can just install `shortfin` or `shortfin[apps]` too. All the meta `shark-ai` package does is pin to matching versions of all packages.

stbaione (author):

Yeah, I felt like it was better branding to use `shark-ai` for that part. I can switch it to `shortfin[apps]` though. What do you think?

Comment on lines 7 to +8
```diff
 # Needed for newer gguf versions (TODO: remove when gguf package includes this)
-# sentencepiece>=0.1.98,<=0.2.0
+sentencepiece>=0.1.98,<=0.2.0
```
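For illustration only (this is not how pip's resolver works): the pin above accepts any version between the two bounds inclusive, which can be sketched with a plain tuple comparison. `satisfies` is a hypothetical helper that assumes simple dotted-integer versions with no pre-release tags.

```python
# Sketch of checking a candidate version against the pin
# "sentencepiece>=0.1.98,<=0.2.0" via numeric tuple comparison.
def parse(version: str) -> tuple:
    return tuple(int(part) for part in version.split("."))

def satisfies(version: str, lower: str = "0.1.98", upper: str = "0.2.0") -> bool:
    return parse(lower) <= parse(version) <= parse(upper)

print(satisfies("0.2.0"))   # upper bound is inclusive
print(satisfies("0.2.1"))   # rejected by the <=0.2.0 cap
```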
Member:

:/ looks like `gguf==0.10.0` still only declares these deps:

```
Requires-Dist: numpy (>=1.17)
Requires-Dist: pyyaml (>=5.1)
Requires-Dist: tqdm (>=4.27)
```

despite the inclusion in requirements: https://github.com/ggerganov/llama.cpp/blob/1a31d0dc00ba946d448e16ecc915ce5e8355994e/gguf-py/pyproject.toml#L21-L26

I didn't see an upstream issue at a glance, but we should file one and follow up upstream instead of carrying around these downstream patches forever.

stbaione (author):

Started typing out an issue, but realized that the `pyproject.toml` linked above is included in v0.11.0, but not in v0.10.0.

Looks like v0.11.0 was released an hour ago. Thinking we could try an upgrade?

Member:

Yes, let's try upgrading. The metadata is there in the `.whl` file now:

```
Requires-Dist: numpy (>=1.17)
Requires-Dist: pyyaml (>=5.1)
Requires-Dist: sentencepiece (>=0.1.98,<=0.2.0)
Requires-Dist: tqdm (>=4.27)
```
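As a hedged aside (not part of the discussion above): wheel `METADATA` is RFC 822-style text, so the stdlib email parser can list a package's declared dependencies. The sample string below mirrors the gguf 0.11.0 fields quoted above; the `Metadata-Version`/`Name`/`Version` header lines are illustrative.

```python
from email import message_from_string

# Sample mirroring the gguf 0.11.0 wheel metadata quoted above.
METADATA = """\
Metadata-Version: 2.1
Name: gguf
Version: 0.11.0
Requires-Dist: numpy (>=1.17)
Requires-Dist: pyyaml (>=5.1)
Requires-Dist: sentencepiece (>=0.1.98,<=0.2.0)
Requires-Dist: tqdm (>=4.27)
"""

# get_all collects every Requires-Dist header, in order.
deps = message_from_string(METADATA).get_all("Requires-Dist")
print(deps)
```

This is the same information `importlib.metadata.requires("gguf")` would return for an installed copy of the package.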
