Issues: HabanaAI/vllm-fork

Issues list

[Feature]: Multi-Node Serving
#654 opened Dec 19, 2024 by ppatel-eng
1 task done
[Bug]: Device Type HPU is not supported for torch.generator() API
Labels: bug
#627 opened Dec 12, 2024 by nageshdn
1 task done
[Bug]: Cannot Run Qwen2 Embedding Model on Gaudi
Labels: bug
#583 opened Dec 4, 2024 by rvoleti89
1 task done
[Doc]: Checkout static tags in all release note instructions
Labels: documentation
#568 opened Nov 30, 2024 by rofinn
1 task done
[Feature]: Models Trained on Gaudi Do Not Work
#511 opened Nov 16, 2024 by gouki510
1 task done
[Bug]: I am not able to start the vllm container with llama 3.1 70b
Labels: bug
#492 opened Nov 13, 2024 by pranjalst
1 task done
[Bug]: unable to inference large context length
Labels: bug
#483 opened Nov 11, 2024 by pranjalst
1 task done
[Bug]: the generated text on BFloat16 is not as good as that on Float32.
Labels: bug
#443 opened Oct 29, 2024 by ccrhx4
1 task done
[Bug]: MQLLMEngine dies after a period of inactivity
Labels: bug
#416 opened Oct 23, 2024 by Xaenalt
1 task done
[RFC]: change VLLM_DECODE_BLOCK_BUCKET_* design to fit small AND large batch size at one warmup
Labels: intel
#328 opened Sep 24, 2024 by ccrhx4
1 task done
[Usage]: The TP improvement is not as expectation
Labels: intel
#274 opened Sep 12, 2024 by JunxiChhen
[Misc]: issue with loading weights from safetensors files
Labels: external
#211 opened Aug 28, 2024 by huijjj
[Feature]: support pipeline parallelism inference in vllm
Labels: intel, stale
#205 opened Aug 27, 2024 by Zjq9409
[Feature]: Compile warmup take too long
Labels: intel, stale
#201 opened Aug 26, 2024 by Zjq9409
[Bug]: benchmark_latency.py cannot exit when using tp
Labels: bug, intel
#197 opened Aug 21, 2024 by JunxiChhen
[Usage]: vllm can't run qwen 32B inference
Labels: external
#193 opened Aug 17, 2024 by kunger97