🦙 Llama3 on TGI - Jetstream Pytorch #90

tengomucho · 2024-09-09T11:53:47Z

What does this PR do?

Add support for Llama 3 (Llama 3.1 will come later) and fix the n_reps and cache_shape settings in the engine loader.
Tests have also been rearranged so that the slow tests run with the bigger models.

This will ensure that tokenizer_config.json is loaded if needed.

This change will allow Llama3 models to be loaded.
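As a rough illustration of the tokenizer_config.json note above: fetching every JSON file, rather than a fixed list of names, picks up tokenizer_config.json automatically. The sketch below is not the PR's code; the file listing and the `select_files` helper are hypothetical and only show how a `*.json` glob would filter a repository listing.

```python
from fnmatch import fnmatch

# Hypothetical repository listing; real model repositories differ.
repo_files = [
    "config.json",
    "tokenizer.json",
    "tokenizer_config.json",
    "special_tokens_map.json",
    "model-00001-of-00002.safetensors",
    "README.md",
]


def select_files(files, patterns):
    """Return the files matching at least one glob pattern, in listing order."""
    return [f for f in files if any(fnmatch(f, p) for p in patterns)]


# A single "*.json" pattern covers tokenizer_config.json along with the other configs.
json_files = select_files(repo_files, ["*.json"])
```

With a pattern like this, a new JSON file added to a model repository would be fetched without touching the download code.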

HuggingFaceDocBuilderDev · 2024-09-09T12:01:11Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

This is a workaround to avoid a core dump observed when testing on the TinyLLama-v0 model. It should help prevent similar problems later. It also makes it possible to re-add the basic (not slow) test that runs on PRs and checks Jetstream/Pytorch.

mfuntowicz · 2024-09-10T12:16:04Z

text-generation-inference/server/text_generation_server/jetstream_pt_support/engine_loader.py

@@ -34,10 +32,10 @@ def load_llama_model_info(model_path: str) -> Any:
    return model_info


-def load_model_info(model_path: str) -> Any:
-    config = AutoConfig.from_pretrained(model_path)  # For now only Llama 2 is supported
+def load_model_info(config: PretrainedConfig) -> Any:


Suggested change

-def load_model_info(config: PretrainedConfig) -> Any:
+def load_model_info(config: "PretrainedConfig") -> Any:

mfuntowicz · 2024-09-10T12:16:50Z

text-generation-inference/server/text_generation_server/jetstream_pt_support/engine_loader.py

@@ -11,19 +11,17 @@
    QuantizationConfig,
 )
 from loguru import logger
-from transformers import AutoConfig
+from transformers import AutoConfig, PretrainedConfig


Suggested change

-from transformers import AutoConfig, PretrainedConfig
+from typing import TYPE_CHECKING
+if TYPE_CHECKING:
+    from transformers import PretrainedConfig
+from transformers import AutoConfig
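The two suggestions in this review work together: the import guarded by `TYPE_CHECKING` is seen only by static type checkers, and the quoted annotation keeps Python from evaluating the name at runtime. Here is a minimal sketch of that pattern. The function body and the `fake_config` stand-in are hypothetical and are not the PR's implementation.

```python
from types import SimpleNamespace
from typing import TYPE_CHECKING, Any

if TYPE_CHECKING:
    # Evaluated only by static type checkers; skipped at runtime,
    # so this sketch runs even if transformers is not installed.
    from transformers import PretrainedConfig


def load_model_info(config: "PretrainedConfig") -> Any:
    # Hypothetical body for illustration; the real function builds
    # the Jetstream model info from the config.
    return {"model_type": config.model_type}


# Duck-typed stand-in for a real config object (an assumption for this sketch).
fake_config = SimpleNamespace(model_type="llama")
info = load_model_info(fake_config)
```

Because the annotation is a string, it is stored unevaluated in `load_model_info.__annotations__`, so nothing fails at runtime even though `PretrainedConfig` was never imported.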

mfuntowicz

LGTM 👍🏻

tengomucho added 4 commits September 9, 2024 11:11

fix(engine_loader): correct n_reps and cache_shape settings

26fdbf8

feat(tokenizer): download all json files when fetching model

1d63b07

This will ensure that tokenizer_config.json is loaded if needed.

feat(jetstream pt): relax llama compatibility requirements

d858713

This change will allow Llama3 models to be loaded.

test(jetstream pt): move Llama2-7b test to runslow/nightly

8ce7762

tengomucho added 2 commits September 9, 2024 12:04

test(jetstream pt): add test showing support of Llama3-8B

b323794

tengomucho force-pushed the llama3-on-jetstream-pt-tgi branch from 9a7fdea to b323794 Compare September 9, 2024 12:04

tengomucho marked this pull request as ready for review September 9, 2024 12:56

tengomucho requested a review from mfuntowicz September 9, 2024 13:00

mfuntowicz reviewed Sep 10, 2024

mfuntowicz approved these changes Sep 10, 2024

tengomucho added 3 commits September 10, 2024 12:35

review: fix imports for type checking

6857b5f

fix: correct type hint

c9f11da

fix(pyproject): correct jetstream git revision

525231a

tengomucho merged commit b25e973 into main Sep 10, 2024
4 checks passed

tengomucho deleted the llama3-on-jetstream-pt-tgi branch September 10, 2024 14:24
