Pull requests: vllm-project/vllm
- #12571 [Bugfix] Gracefully handle huggingface hub http error (opened Jan 30, 2025 by ywang96)
- #12570 Fix for attention layers to remain unquantized during moe_wn16 quant (opened Jan 30, 2025 by srikanthsrnvs)
- #12569 [V1][Log] Add max request concurrency log to V1 (opened Jan 30, 2025 by mgoin; labels: ready)
- #12564 [Misc] fix typo: add missing space in lora adapter error message (opened Jan 29, 2025 by Beim; labels: frontend, ready)
- #12563 [Feature] Fix guided decoding blocking bitmask memcpy (opened Jan 29, 2025 by xpbowler; labels: performance, ready, structured-output)
- #12555 [CPU][PPC] Updated torch, torchvision, torchaudio dependencies (opened Jan 29, 2025 by npanpaliya; labels: ci/build, ready)
- #12553 [VLM] Merged multi-modal processor for InternVL-based models (opened Jan 29, 2025 by DarkLight1337; labels: documentation)
- #12551 [Misc] O3 compilation and Spec Decoding are not compatible (opened Jan 29, 2025 by NickLucche)
- #12547 Move requirements into their own directory (opened Jan 29, 2025 by hmellor; labels: ci/build, documentation)
- #12546 [Bugfix] Fix 'ModuleNotFoundError: No module named 'intel_extension_for_pytorch'' for --tensor-parallel-size more than 1 (opened Jan 29, 2025 by Akashcodes732)
- #12537 [Bugfix][Spec Decode] fix: update logits processor for MQA scoring (opened Jan 29, 2025 by llsj14)
- #12536 [Kernel] Use self.kv_cache and forward_context.attn_metadata in Attention.forward (opened Jan 29, 2025 by heheda12345)
- [WIP][AMD][Kernel][Quantization] Add fp8 and int8 support for Triton FAv2 kernel (labels: documentation)
- #12518 [RFC][vllm-API] Support tokenizer registry for customized tokenizer in vLLM (opened Jan 28, 2025 by youngkent; labels: frontend)