Update user docs for running llm server
#676
base: main
Conversation
Add back `sentencepiece` as requirement for `sharktank` to enable `export_paged_llm_v1`
You can install either the `latest stable` version of shortfin by installing `shark-ai` or the `nightly` version directly:
You can just install `shortfin` or `shortfin[apps]` too. All the meta `shark-ai` package does is pin to matching versions of all packages.
Yeah, I felt like it was better branding to use `shark-ai` for that part. I can switch it to `shortfin[apps]` though. What do you think?
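For reference, a minimal sketch of the two install paths being discussed; the package names come from the comments above, and any nightly index flags are intentionally left out (use whatever the repo's install docs specify).

```bash
# Option 1: the shark-ai meta package, which pins matching versions of all packages.
pip install shark-ai

# Option 2: just shortfin with the app extras.
pip install "shortfin[apps]"
```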
# Needed for newer gguf versions (TODO: remove when gguf package includes this)
# sentencepiece>=0.1.98,<=0.2.0
sentencepiece>=0.1.98,<=0.2.0
:/ looks like `gguf==0.10.0` still only includes these deps:

Requires-Dist: numpy (>=1.17)
Requires-Dist: pyyaml (>=5.1)
Requires-Dist: tqdm (>=4.27)

despite the inclusion in requirements: https://github.com/ggerganov/llama.cpp/blob/1a31d0dc00ba946d448e16ecc915ce5e8355994e/gguf-py/pyproject.toml#L21-L26
I didn't see an upstream issue at a glance, but we should file one and follow up upstream instead of carrying around these downstream patches forever.
Yes, let's try upgrading. The metadata is there in the `.whl` file now:
Requires-Dist: numpy (>=1.17)
Requires-Dist: pyyaml (>=5.1)
Requires-Dist: sentencepiece (>=0.1.98,<=0.2.0)
Requires-Dist: tqdm (>=4.27)
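As a side note, a quick way to confirm which dependencies an installed `gguf` wheel actually declares (standard pip/importlib tooling, nothing project-specific):

```bash
# The "Requires:" line lists the declared runtime dependencies.
pip show gguf

# Or print the raw Requires-Dist entries from the wheel metadata.
python -c "from importlib.metadata import requires; print(requires('gguf'))"
```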
Description
Did a pass through and made updates + fixes to the user docs for `e2e_llama8b_mi300x.md`, covering the `shark-ai`, `shortfin`, and `sharktank` packages and the `llama3.1-8b-fp16-instruct` model.

- Removed the "write edited config" step. No longer needed after "Make config.json consistent between shortfin and sharktank" #487.
- Added back `sentencepiece` as a requirement for `sharktank`. Not having it caused `export_paged_llm_v1` to break when installing nightly. This was obfuscated when building from source, because `shortfin` includes `sentencepiece` in `requirements-tests.txt`. A quick sanity check is sketched below.
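A minimal way to verify the nightly install now works; the `sharktank.examples.export_paged_llm_v1` module path is assumed here, not confirmed in this PR.

```bash
# Confirm sentencepiece is installed alongside the nightly sharktank package.
python -c "import sentencepiece; print(sentencepiece.__version__)"

# Confirm the export entry point at least imports and prints its CLI help
# (module path assumed from sharktank's repo layout).
python -m sharktank.examples.export_paged_llm_v1 --help
```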