
Update user docs for running llm server #676

Open
stbaione wants to merge 2 commits into `main`
Conversation

stbaione (Contributor)

Description

Did a pass through the user docs for `e2e_llama8b_mi300x.md` and made updates and fixes:

  1. Update install instructions for shark-ai
  2. Update nightly install instructions for shortfin and sharktank
  3. Update paths for model artifacts to ensure they work with llama3.1-8b-fp16-instruct
  4. Remove steps to write an edited config; no longer needed after "Make config.json consistent between shortfin and sharktank" (#487)

Added back `sentencepiece` as a requirement for `sharktank`. Without it, `export_paged_llm_v1` broke when installing the nightly:

```
ModuleNotFoundError: No module named 'sentencepiece'
```

This was masked when building from source, because shortfin includes `sentencepiece` in `requirements-tests.txt`.
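As an aside (a sketch, not part of the PR): the failure mode above can be made to surface early with a fail-fast dependency check. `require` below is a hypothetical helper, not sharktank or shortfin API.

```python
import importlib.util

# Hypothetical helper (not sharktank code): fail fast with an actionable
# message when an optional dependency such as sentencepiece is absent.
def require(module: str) -> None:
    if importlib.util.find_spec(module) is None:
        raise ModuleNotFoundError(
            f"No module named {module!r}; try: pip install {module}"
        )

require("json")  # stdlib module, so this passes silently
```

Calling `require("sentencepiece")` at import time in an export script would turn a deep stack trace into a one-line install hint.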

Add back `sentencepiece` as requirement for `sharktank` to enable `export_paged_llm_v1`
@stbaione stbaione requested a review from ScottTodd December 11, 2024 16:33
Comment on lines +27 to +28
You can install either the `latest stable` version of shortfin by installing
`shark-ai` or the `nightly` version directly:
Member:

You can just install `shortfin` or `shortfin[apps]` too. All the meta `shark-ai` package does is pin to matching versions of all packages.

stbaione (author):

Yeah, I felt like it was better branding to use `shark-ai` for that part. I can switch it to `shortfin[apps]` though. What do you think?

Comment on lines 7 to +8
```diff
 # Needed for newer gguf versions (TODO: remove when gguf package includes this)
-# sentencepiece>=0.1.98,<=0.2.0
+sentencepiece>=0.1.98,<=0.2.0
```
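For illustration only (this is not how pip's resolver works): the pin above accepts any version between the two bounds inclusive, which can be sketched with a plain tuple comparison. `satisfies` is a hypothetical helper that assumes simple dotted-integer versions with no pre-release tags.

```python
# Sketch of checking a candidate version against the pin
# "sentencepiece>=0.1.98,<=0.2.0" via numeric tuple comparison.
def parse(version: str) -> tuple:
    return tuple(int(part) for part in version.split("."))

def satisfies(version: str, lower: str = "0.1.98", upper: str = "0.2.0") -> bool:
    return parse(lower) <= parse(version) <= parse(upper)

print(satisfies("0.2.0"))   # upper bound is inclusive
print(satisfies("0.2.1"))   # rejected by the <=0.2.0 cap
```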
Member:

:/ looks like `gguf==0.10.0` still only declares these deps:

```
Requires-Dist: numpy (>=1.17)
Requires-Dist: pyyaml (>=5.1)
Requires-Dist: tqdm (>=4.27)
```

despite the inclusion in requirements: https://github.com/ggerganov/llama.cpp/blob/1a31d0dc00ba946d448e16ecc915ce5e8355994e/gguf-py/pyproject.toml#L21-L26

I didn't see an upstream issue at a glance, but we should file one and follow up upstream instead of carrying around these downstream patches forever.

stbaione (author):

Started typing out an issue, but realized that the `pyproject.toml` linked above is included in v0.11.0, but not in v0.10.0.

Looks like v0.11.0 was released an hour ago. Thinking we could try an upgrade?

Member:

Yes, let's try upgrading. The metadata is there in the `.whl` file now:

```
Requires-Dist: numpy (>=1.17)
Requires-Dist: pyyaml (>=5.1)
Requires-Dist: sentencepiece (>=0.1.98,<=0.2.0)
Requires-Dist: tqdm (>=4.27)
```
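As a hedged aside (not part of the discussion above): wheel `METADATA` is RFC 822-style text, so the stdlib email parser can list a package's declared dependencies. The sample string below mirrors the gguf 0.11.0 fields quoted above; the `Metadata-Version`/`Name`/`Version` header lines are illustrative.

```python
from email import message_from_string

# Sample mirroring the gguf 0.11.0 wheel metadata quoted above.
METADATA = """\
Metadata-Version: 2.1
Name: gguf
Version: 0.11.0
Requires-Dist: numpy (>=1.17)
Requires-Dist: pyyaml (>=5.1)
Requires-Dist: sentencepiece (>=0.1.98,<=0.2.0)
Requires-Dist: tqdm (>=4.27)
"""

# get_all collects every Requires-Dist header, in order.
deps = message_from_string(METADATA).get_all("Requires-Dist")
print(deps)
```

This is the same information `importlib.metadata.requires("gguf")` would return for an installed copy of the package.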
