
Error: NotImplementedError for CFG Logits Processor in VLLM Model #1276

PierreLepagnol opened this issue Nov 21, 2024 · 2 comments

Describe the issue as clearly as possible:

I encountered an issue when attempting to use the `generate.cfg` function with a vLLM model. The code throws a `NotImplementedError`, indicating that the CFG logits processor is not available for the `VLLM` class.

Steps/code to reproduce the bug:

```python
from vllm import LLM, SamplingParams

llm = LLM(
    "neuralmagic/Llama-3.2-1B-Instruct-quantized.w8a8",
    enable_prefix_caching=True,
    block_size=64,
    max_num_batched_tokens=15000,
    gpu_memory_utilization=0.96,
    max_model_len=15000,
    use_v2_block_manager=True,
)

arithmetic_grammar = """
    ?start: expression

    ?expression: term (("+" | "-") term)*

    ?term: factor (("*" | "/") factor)*

    ?factor: NUMBER
           | "-" factor
           | "(" expression ")"

    %import common.NUMBER
"""

from outlines import generate, models

model = models.VLLM(llm)
generator = generate.cfg(model, arithmetic_grammar)
sampling_params = SamplingParams(temperature=0.3, top_p=0.2, max_tokens=20)

sequence = generator(
    "Alice had 4 apples and Bob ate 2. Write an expression for Alice's apples:",
    sampling_params=sampling_params,
)
```

Expected result:

I expected the code to generate a sequence based on the defined grammar using the `VLLM` model.

Error message:

```
Exception has occurred: NotImplementedError
The CFG Logits processor is not available for <class 'outlines.models.vllm.VLLM'>.
  File "/home/lepagnol/Documents/These/format-constrained-for-slu/vllm_test.py", line 30, in <module>
    generator = generate.cfg(model, arithmetic_grammar)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
NotImplementedError: The CFG Logits processor is not available for <class 'outlines.models.vllm.VLLM'>.
```

Outlines/Python version information:

Version information

``` aiohappyeyeballs==2.4.3 aiohttp==3.11.6 aiosignal==1.3.1 annotated-types==0.7.0 antlr4-python3-runtime==4.9.3 anyio==4.6.2.post1 asttokens @ file:///home/conda/feedstock_root/build_artifacts/asttokens_1698341106958/work attrs==24.2.0 autocommand==2.2.2 backports.tarfile==1.2.0 certifi==2024.8.30 charset-normalizer==3.4.0 click==8.1.7 cloudpickle==3.1.0 cmake==3.31.0.1 comm @ file:///home/conda/feedstock_root/build_artifacts/comm_1710320294760/work compressed-tensors==0.8.0 datasets==3.1.0 debugpy @ file:///home/conda/feedstock_root/build_artifacts/debugpy_1731044888992/work decorator @ file:///home/conda/feedstock_root/build_artifacts/decorator_1641555617451/work dill==0.3.8 diskcache==5.6.3 distro==1.9.0 einops==0.8.0 exceptiongroup @ file:///home/conda/feedstock_root/build_artifacts/exceptiongroup_1720869315914/work executing @ file:///home/conda/feedstock_root/build_artifacts/executing_1725214404607/work fastapi==0.115.5 filelock==3.16.1 frozenlist==1.5.0 fsspec==2024.9.0 gguf==0.10.0 h11==0.14.0 httpcore==1.0.7 httptools==0.6.4 httpx==0.27.2 huggingface-hub==0.26.2 hydra-core==1.3.2 hydra-submitit-launcher==1.2.0 idna==3.10 importlib_metadata @ file:///home/conda/feedstock_root/build_artifacts/importlib-metadata_1726082825846/work inflect==7.3.1 interegular==0.3.3 ipykernel @ file:///home/conda/feedstock_root/build_artifacts/ipykernel_1719845459717/work ipython @ file:///home/conda/feedstock_root/build_artifacts/ipython_1729866374957/work jaraco.collections==5.1.0 jaraco.context==5.3.0 jaraco.functools==4.0.1 jaraco.text==3.12.1 jedi @ file:///home/conda/feedstock_root/build_artifacts/jedi_1731317204262/work Jinja2==3.1.4 jiter==0.7.1 jiwer==3.0.5 jsonschema==4.23.0 jsonschema-specifications==2024.10.1 jupyter_client @ file:///home/conda/feedstock_root/build_artifacts/jupyter_client_1726610684920/work jupyter_core @ file:///home/conda/feedstock_root/build_artifacts/jupyter_core_1727163409502/work lark==1.2.2 llvmlite==0.43.0 lm-format-enforcer==0.10.9 
MarkupSafe==3.0.2 matplotlib-inline @ file:///home/conda/feedstock_root/build_artifacts/matplotlib-inline_1713250518406/work mistral_common==1.5.0 more-itertools==10.3.0 mpmath==1.3.0 msgspec==0.18.6 multidict==6.1.0 multiprocess==0.70.16 nest_asyncio @ file:///home/conda/feedstock_root/build_artifacts/nest-asyncio_1705850609492/work networkx==3.4.2 ninja==1.11.1.1 numba==0.60.0 numpy==1.26.4 omegaconf==2.3.0 openai==1.54.5 opencv-python-headless==4.10.0.84 outlines==0.0.46 packaging @ file:///home/conda/feedstock_root/build_artifacts/packaging_1731802491770/work pandas==2.2.3 parso @ file:///home/conda/feedstock_root/build_artifacts/parso_1712320355065/work partial-json-parser==0.2.1.1.post4 pexpect @ file:///home/conda/feedstock_root/build_artifacts/pexpect_1706113125309/work pickleshare @ file:///home/conda/feedstock_root/build_artifacts/pickleshare_1602536217715/work pillow==10.4.0 platformdirs @ file:///home/conda/feedstock_root/build_artifacts/platformdirs_1726613481435/work prometheus-fastapi-instrumentator==7.0.0 prometheus_client==0.21.0 prompt_toolkit @ file:///home/conda/feedstock_root/build_artifacts/prompt-toolkit_1727341649933/work propcache==0.2.0 protobuf==5.28.3 psutil @ file:///home/conda/feedstock_root/build_artifacts/psutil_1729847057810/work ptyprocess @ file:///home/conda/feedstock_root/build_artifacts/ptyprocess_1609419310487/work/dist/ptyprocess-0.7.0-py2.py3-none-any.whl pure_eval @ file:///home/conda/feedstock_root/build_artifacts/pure_eval_1721585709575/work py-cpuinfo==9.0.0 pyairports==2.1.1 pyarrow==18.0.0 pycountry==24.6.1 pydantic==2.9.2 pydantic_core==2.23.4 pydot==3.0.2 Pygments @ file:///home/conda/feedstock_root/build_artifacts/pygments_1714846767233/work pyparsing==3.2.0 python-dateutil @ file:///home/conda/feedstock_root/build_artifacts/python-dateutil_1731919281354/work python-dotenv==1.0.1 pytz==2024.2 PyYAML==6.0.2 pyzmq @ file:///home/conda/feedstock_root/build_artifacts/pyzmq_1728642254015/work RapidFuzz==3.10.1 
referencing==0.35.1 regex==2024.11.6 requests==2.32.3 rpds-py==0.21.0 safetensors==0.4.5 sentencepiece==0.2.0 setuptools==75.5.0 setuptools-scm==8.1.0 six @ file:///home/conda/feedstock_root/build_artifacts/six_1620240208055/work sniffio==1.3.1 stack-data @ file:///home/conda/feedstock_root/build_artifacts/stack_data_1669632077133/work starlette==0.41.3 submitit==1.5.2 sympy==1.13.1 tiktoken==0.7.0 tokenizers==0.20.3 tomli==2.0.1 torch==2.5.1+cpu torchvision==0.20.1+cpu tornado @ file:///home/conda/feedstock_root/build_artifacts/tornado_1724956131631/work tqdm==4.67.0 traitlets @ file:///home/conda/feedstock_root/build_artifacts/traitlets_1713535121073/work transformers==4.46.3 typeguard==4.3.0 typing_extensions @ file:///home/conda/feedstock_root/build_artifacts/typing_extensions_1717802530399/work tzdata==2024.2 urllib3==2.2.3 uvicorn==0.32.0 uvloop==0.21.0 vllm==0.6.4.post2.dev67+g63f1fde2.cpu watchfiles==0.24.0 wcwidth @ file:///home/conda/feedstock_root/build_artifacts/wcwidth_1704731205417/work websockets==14.1 wheel==0.45.0 xxhash==3.5.0 yarl==1.17.2 zipp @ file:///home/conda/feedstock_root/build_artifacts/zipp_1731262100163/work ```

Context for the issue:

No response

@PierreLepagnol (Author)

I'm not an expert, but the documentation (https://dottxt-ai.github.io/outlines/latest/reference/models/vllm/) explicitly states:

> This also works with generators built with generate.regex, generate.json, generate.cfg, generate.format and generate.choice.
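Until the CFG path is actually implemented for the vLLM integration, one possible workaround (my suggestion, not confirmed by the maintainers) is to approximate the grammar with a regular expression and use `generate.regex`, which the same docs list as supported. A true regex cannot express arbitrarily nested parentheses, so the sketch below covers only flat arithmetic expressions; the `FLAT_ARITHMETIC` name is illustrative:

```python
import re

# Hypothetical pattern approximating the Lark grammar: an optionally
# negated integer, followed by zero or more operator/integer pairs.
# Nested parentheses are NOT representable in a regular expression.
FLAT_ARITHMETIC = r"-?\d+(?:\s*[-+*/]\s*-?\d+)*"

# Sanity-check the pattern locally before handing it to outlines:
assert re.fullmatch(FLAT_ARITHMETIC, "4 - 2")
assert re.fullmatch(FLAT_ARITHMETIC, "4-2*3")
assert re.fullmatch(FLAT_ARITHMETIC, "(1+2)*3") is None  # parens not covered

# With outlines, the generator would then be built as (untested here):
#   generator = generate.regex(model, FLAT_ARITHMETIC)
```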

@Tonybodo

Getting the same error
