fix(dsp): in HFClientVLLM, actually use kwargs in the payload instead of discarding them #719

Merged
merged 4 commits into stanfordnlp:main on Apr 1, 2024

Conversation

remiconnesson
Contributor

@remiconnesson remiconnesson commented Mar 25, 2024

In the dsp.modules.HFClientVLLM class, the kwargs are supposed to be passed to the vLLM OpenAI endpoint (cf. the docs), but they are discarded instead. This is related to issue #718.

In this PR, there are some modifications to the dsp.modules.HFClientVLLM class:

  1. In the __init__, the kwargs were actually unused. In this PR they are attached to the instance as self.kwargs instead of being discarded.
  2. The model in the payload was taken from the kwargs of the _generate method, disregarding the value initialized in the __init__. This is fixed in this PR.
  3. Two optional parameters, temperature and max_tokens, were effectively mandatory in the _generate method, because they were read with direct kwargs["<parameter>"] lookups, which raise a KeyError when the key is absent. In this PR those two dictionary accesses have been deleted.
  4. Finally, still in the _generate method, the kwargs were never actually passed to the vLLM endpoint. In this PR we fix this by unpacking the kwargs into the payload, so that the extra arguments are forwarded (see the sketch below).
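
Putting the four changes together, the fixed class looks roughly like this (a sketch, not the exact diff; the base-class call and the HTTP request details are assumed or elided):

```python
from dsp.modules.hf import HFModel  # module path as it appears in this repo

class HFClientVLLM(HFModel):
    def __init__(self, model, port, url="http://localhost", **kwargs):
        super().__init__(model=model, is_client=True)  # assumed base-class signature
        self.urls = [f"{url}:{port}"]
        self.kwargs = kwargs          # (1) keep the kwargs on the instance

    def _generate(self, prompt, **kwargs):
        kwargs = {**self.kwargs, **kwargs}
        payload = {
            "model": self.model,      # (2) use the value set in __init__
            "prompt": prompt,
            **kwargs,                 # (3)+(4) forward every extra argument;
                                      # no bare kwargs["temperature"] or
                                      # kwargs["max_tokens"] lookups remain
        }
        # Round robin the urls, then POST the payload to the vLLM server.
        url = self.urls.pop(0)
        self.urls.append(url)
        ...
```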

@arnavsinghvi11
Collaborator

Thanks for the additions @remiconnesson !

I believe this may not be needed, as HFClientVLLM uses the kwargs through inheritance from HF, which inherits from LM (sketched below). Feel free to highlight any inconsistencies in this kwargs handling.
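
For reference, the kwargs flow through that chain roughly as follows (a sketch; the dispatch and merge lines come from dsp/modules/lm.py and dsp/modules/hf.py, everything else is elided):

```python
class LM:
    def request(self, prompt, **kwargs):
        return self.basic_request(prompt, **kwargs)

class HFModel(LM):
    def basic_request(self, prompt, **kwargs):
        kwargs = {**self.kwargs, **kwargs}      # merge instance defaults
        return self._generate(prompt, **kwargs)

class HFClientVLLM(HFModel):
    def _generate(self, prompt, **kwargs):
        ...                                     # receives the merged kwargs here
```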

@Josephrp Josephrp left a comment

LGTM!

@remiconnesson
Contributor Author

remiconnesson commented Apr 1, 2024

@arnavsinghvi11

https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html#extra-parameters-for-completions-api

kwargs are necessary for guided completion (see the example below).

But from what you pointed me to, only a subset of the kwargs is used.

Granted, the commit on setting model is not necessary, but this being said, I don't see any reason to discard all of the kwargs and not pass them to the LLM call.
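
For example, forwarding the kwargs is what enables a guided completion like this one (an illustrative request against the vLLM OpenAI-compatible server; the URL and model name are placeholders):

```python
import requests

# guided_choice is one of the documented "extra parameters" of the
# vLLM OpenAI-compatible completions API (see the link above).
payload = {
    "model": "mistralai/Mistral-7B-Instruct-v0.2",
    "prompt": "Is this review positive or negative? Review: great product!",
    "max_tokens": 4,
    "guided_choice": ["positive", "negative"],  # constrain the output
}
response = requests.post("http://localhost:8000/v1/completions", json=payload)
print(response.json()["choices"][0]["text"])
```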

@arnavsinghvi11
Collaborator

Got it! Could you also run `ruff check . --fix-only`? The lint check is currently failing the tests. Will merge after that. Thanks!

@remiconnesson
Contributor Author

@arnavsinghvi11 it's done :)

@arnavsinghvi11 arnavsinghvi11 merged commit 4ec6805 into stanfordnlp:main Apr 1, 2024
4 checks passed
@arnavsinghvi11
Collaborator

Thanks @remiconnesson !

@manjubsavanth

manjubsavanth commented Apr 2, 2024

Hi @arnavsinghvi11, can you confirm whether this commit has been released? Will re-installing the DSPy package solve the kwargs issue with HFClientVLLM?

This does not seem to work with HFClientVLLM as of now: I get multiple outputs for a classification task even after setting n=1 or top_p=1 in the config.

@arnavsinghvi11
Collaborator

Hi @manjubsavanth, this should be resolved after pulling and reinstalling DSPy with this latest commit.

@manjubsavanth

@arnavsinghvi11 thanks, I have installed DSPy by pulling from git and am getting the error below:
AttributeError: 'HFClientVLLM' object has no attribute 'model'

AttributeError                            Traceback (most recent call last)
File <command-1178964926776796>, line 3
      1 import dspy
      2 lm = dspy.HFClientVLLM(model="mistralai/Mistral-7B-Instruct-v0.2", port=8000, url="http://0.0.0.0/")
----> 3 lm("My name is Robert")

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-9dfc2a8f-024c-4b74-a13e-e541ebac4e5c/lib/python3.10/site-packages/dsp/modules/hf.py:183, in HFModel.__call__(self, prompt, only_completed, return_sorted, **kwargs)
    180 if kwargs.get("n", 1) > 1 or kwargs.get("temperature", 0.0) > 0.1:
    181     kwargs["do_sample"] = True
--> 183 response = self.request(prompt, **kwargs)
    184 return [c["text"] for c in response["choices"]]

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-9dfc2a8f-024c-4b74-a13e-e541ebac4e5c/lib/python3.10/site-packages/dsp/modules/lm.py:26, in LM.request(self, prompt, **kwargs)
     25 def request(self, prompt, **kwargs):
---> 26     return self.basic_request(prompt, **kwargs)

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-9dfc2a8f-024c-4b74-a13e-e541ebac4e5c/lib/python3.10/site-packages/dsp/modules/hf.py:140, in HFModel.basic_request(self, prompt, **kwargs)
    138 raw_kwargs = kwargs
    139 kwargs = {**self.kwargs, **kwargs}
--> 140 response = self._generate(prompt, **kwargs)
    142 history = {
    143     "prompt": prompt,
    144     "response": response,
    145     "kwargs": kwargs,
    146     "raw_kwargs": raw_kwargs,
    147 }
    148 self.history.append(history)

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-9dfc2a8f-024c-4b74-a13e-e541ebac4e5c/lib/python3.10/site-packages/dsp/modules/hf_client.py:137, in HFClientVLLM._generate(self, prompt, **kwargs)
    133 def _generate(self, prompt, **kwargs):
    134     kwargs = {**self.kwargs, **kwargs}
    136     payload = {
--> 137         "model": self.model,
    138         "prompt": prompt,
    139         **kwargs,
    140     }
    143     # Round robin the urls.
    144     url = self.urls.pop(0)

AttributeError: 'HFClientVLLM' object has no attribute 'model'

@remiconnesson
Contributor Author

@manjubsavanth indeed, I pushed a fix here: #762
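
The cause, as the traceback shows: the merged _generate reads self.model, but __init__ never sets that attribute. In shape (inferred from the traceback; the real code has more around it):

```python
class HFClientVLLM(HFModel):
    def __init__(self, model, port, url="http://localhost", **kwargs):
        self.kwargs = kwargs          # kwargs are kept now, but there is
                                      # no `self.model = model` assignment

    def _generate(self, prompt, **kwargs):
        kwargs = {**self.kwargs, **kwargs}
        payload = {
            "model": self.model,      # AttributeError: no attribute 'model'
            "prompt": prompt,
            **kwargs,
        }
```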

@manjubsavanth

manjubsavanth commented Apr 3, 2024

Hi @remiconnesson, I continue to get the error. I installed the package using "%pip install git+https://github.com/stanfordnlp/dspy.git". Does the fix need to be merged into main?

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-0c4c340f-f97e-4fdd-959d-7828164b8e57/lib/python3.10/site-packages/dsp/modules/hf_client.py:137, in HFClientVLLM._generate(self, prompt, **kwargs)
    133 def _generate(self, prompt, **kwargs):
    134     kwargs = {**self.kwargs, **kwargs}
    136     payload = {
--> 137         "model": self.model,
    138         "prompt": prompt,
    139         **kwargs,
    140     }
    143     # Round robin the urls.
    144     url = self.urls.pop(0)

AttributeError: 'HFClientVLLM' object has no attribute 'model'

@remiconnesson
Contributor Author

Hi @manjubsavanth, the fix is pushed but indeed not yet merged into main.

@remiconnesson
Contributor Author

@manjubsavanth

If you can't wait for the merge, you can apply it yourself in the meantime. It's dirty, but it should help you get going:

Open the following file:
local_disk0/.ephemeral_nfs/envs/pythonEnv-0c4c340f-f97e-4fdd-959d-7828164b8e57/lib/python3.10/site-packages/dsp/modules/hf_client.py

(or wherever the new env is on your machine now)

and apply the change you see there:
https://github.com/stanfordnlp/dspy/pull/762/files
[screenshot of the diff from #762]

@manjubsavanth

Hi @remiconnesson, now I get the error below. I built a wheel locally with the changes you suggested.

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/dsp/modules/hf_client.py:138, in HFClientVLLM._generate(self, prompt, **kwargs)
    133 def _generate(self, prompt, **kwargs):
    134     kwargs = {**self.kwargs, **kwargs}
    136     payload = {
    137         # "model": self.model,
--> 138         "model": kwargs["model"],
    139         "prompt": prompt,
    140         **kwargs,
    141     }
    144     # Round robin the urls.
    145     url = self.urls.pop(0)

KeyError: 'model'
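
A likely reading of this KeyError (an assumption, since only one line of the #762 diff is quoted above): "model": kwargs["model"] only works if the matching __init__ change keeps the model name inside self.kwargs, so patching the _generate line alone leaves the merged kwargs without a "model" key. Roughly:

```python
class HFClientVLLM(HFModel):
    def __init__(self, model, port, url="http://localhost", **kwargs):
        # Assumed __init__ half of the #762 fix: keep the model name
        # inside self.kwargs so _generate can read it back out.
        self.kwargs = {"model": model, **kwargs}

    def _generate(self, prompt, **kwargs):
        kwargs = {**self.kwargs, **kwargs}
        payload = {
            "model": kwargs["model"],  # KeyError if __init__ was not patched
            "prompt": prompt,
            **kwargs,
        }
```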
