VLLM Sampled Tokens #311

Merged: 4 commits merged into 0.4 from vllm-sample on Jan 31, 2025

Conversation

AdamBelfki3 (Collaborator)

  • Fixed the narrowing of the logits module output so that it accurately reflects the batch_groups.
import nnsight

# vllm_gpt2 is an nnsight VLLM wrapper around GPT-2, constructed beforehand.
with vllm_gpt2.trace(temperature=0.0, top_p=1.0, max_tokens=3) as tracer:
    with tracer.invoke(
        [
            "Madison Square Garden is located in the city of", 
            "The Eiffel Tower is located in the city of",
        ]
    ):

        logits_1 = nnsight.list().save()

        for ii in range(3):
            logits_1.append(vllm_gpt2.logits.output)
            vllm_gpt2.logits.next()

    with tracer.invoke("Rome is the capital city of"):
        logits_2 = nnsight.list().save()

        for ii in range(5):
            logits_2.append(vllm_gpt2.logits.output)
            vllm_gpt2.logits.next()

# Each invoker now only sees the logits for its own prompts.
assert all(logit.shape[0] == 2 for logit in logits_1)
assert all(logit.shape[0] == 1 for logit in logits_2)

Prior to this fix, the logits output inside each invoker contained the logits for all the prompts passed in the entire trace, rather than only the prompts of that invoker.
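
To illustrate the idea behind the fix, here is a minimal, hedged sketch of batch-group narrowing (full_logits and batch_groups are hypothetical names for illustration, not nnsight internals): the combined logits tensor for the whole trace is sliced so each invoker only sees the rows belonging to its own prompts.

import torch

# The trace above runs 3 prompts in total (2 in the first invoker, 1 in the
# second), so each decoding step yields one combined logits tensor that must
# be split per invoker.
full_logits = torch.randn(3, 50257)       # one row per prompt in the trace
batch_groups = [(0, 2), (2, 1)]           # (start, size) for each invoker

per_invoker = [full_logits.narrow(0, start, size) for start, size in batch_groups]
assert per_invoker[0].shape[0] == 2       # first invoker: two prompts
assert per_invoker[1].shape[0] == 1       # second invoker: one prompt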

  • Added traceability for sampled tokens. vLLM provides functionality to configure how each sequence samples its next token. Here's an example of how you can trace that operation with the nnsight VLLM wrapper.
with vllm_gpt2.trace("Madison Square Garden is located in the city of", temperature=0.8, top_p=0.95, max_tokens=3) as tracer:
    samples = nnsight.list().save()
    for ii in range(3):
        samples.append(vllm_gpt2.samples.output)
        vllm_gpt2.samples.next()

print(samples)
>>> [tensor([16940]), tensor([319]), tensor([262])]
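
As a hedged follow-up (assuming the standard Hugging Face gpt2 tokenizer, which this snippet does not otherwise load, and that samples behaves as the plain list of single-element tensors printed above), the sampled ids can be decoded back into text:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
# samples holds one single-element tensor per generation step.
print(tokenizer.decode([t.item() for t in samples]))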

@JadenFiotto-Kaufman merged commit 5697d40 into 0.4 on Jan 31, 2025
1 check passed
@JadenFiotto-Kaufman deleted the vllm-sample branch on January 31, 2025 at 17:08