Issues: ggerganov/llama.cpp
ggml : add ANE backend
Labels: help wanted, research 🔬
#10453 opened Nov 22, 2024 by ggerganov
Bug: [CANN] ggml-cann/aclnn_ops.cpp:3007: GGML_ASSERT(n_dims == src0->ne[0]) failed
Labels: bug-unconfirmed, critical severity
#10451 opened Nov 22, 2024 by zyp2
Bug: Heavy throttling during token generation on Apple Silicon
Labels: bug-unconfirmed, medium severity
#10444 opened Nov 21, 2024 by Azirine
Bug: Flash Attention performs worse under ROCm
Labels: bug-unconfirmed, medium severity
#10439 opened Nov 20, 2024 by Mushoz
Bug: Severe Performance Degradation on Q4_0 CPU-only with macOS / Apple Silicon M2, after PR #9921 / Version 4081
Labels: bug
#10435 opened Nov 20, 2024 by AndreasKunar
Why is the server slot's cache_prompt false by default?
Labels: bug-unconfirmed, medium severity
#10427 opened Nov 20, 2024 by Nekotekina
Bug: SYCL builds >= b4069 fail to allocate SYCL0 buffer
Labels: bug-unconfirmed, critical severity
#10421 opened Nov 20, 2024 by 0xDEADFED5
Bug: Vulkan vk::DeviceLostError in a multithreaded environment
Labels: bug-unconfirmed, low severity
#10420 opened Nov 20, 2024 by ddwkim
Bug: Build failure in master on Arch Linux with CUDA enabled
Labels: bug-unconfirmed, high severity
#10414 opened Nov 19, 2024 by momokrono
Bug: llama.cpp fails to run with a Vulkan-supported, quantized model in Android Termux
Labels: bug-unconfirmed, medium severity
#10406 opened Nov 19, 2024 by linxhome
Feature Request: Code Explanation Tutorial
Labels: enhancement
#10399 opened Nov 19, 2024 by Tangzhongyi834
Bug: Server hangs when number of threads used for decoding > number of CPUs it runs on
Labels: bug-unconfirmed, medium severity
#10397 opened Nov 19, 2024 by KevinRSX
Feature Request: [CANN] Use the RoPE operator provided by aclnn
Labels: enhancement
#10396 opened Nov 19, 2024 by noemotiovon
Qwen 32B: server breaks stream abruptly when above 9K context
Labels: bug-unconfirmed, low severity
#10393 opened Nov 18, 2024 by JeroenAdam
Refactor: Allow adding both tokens and embeddings to llama_batch
#10381 opened Nov 18, 2024 by ngxson
Bug: flash-attn can't be used
Labels: bug-unconfirmed, low severity
#10378 opened Nov 18, 2024 by Tangzhongyi834
Feature Request: Apply LoRA adapters per-request
Labels: enhancement
#10377 opened Nov 18, 2024 by ngxson
Bug: No docs explain the value for cache-type-k/v
Labels: bug-unconfirmed, low severity
#10373 opened Nov 18, 2024 by phazei
Bug: Objective-C++ Compilation Issues when running swift build or swift test with Swift Package Manager
Labels: bug-unconfirmed, high severity
#10371 opened Nov 18, 2024 by will-lumley
Bug: convert_hf_to_gguf bluescreens Windows with very large models
Labels: bug-unconfirmed, low severity
#10365 opened Nov 17, 2024 by candre23
ggml : reintegrate the AMX backend into the CPU backend
Labels: refactoring
#10359 opened Nov 17, 2024 by ggerganov
Bug: rope-scale and rope-scaling parameters not being parsed in llama.cpp server
Labels: bug-unconfirmed, medium severity
#10355 opened Nov 17, 2024 by henryclw