Installation requirements #89
Okay, we will update these packages in our next release. Regarding quantization, we now support multiple quantization methods (Qx_K and IQ4_XS). What else would you like?
I have a system with 192GB DRAM and 48GB VRAM (2x 3090). Would it be able to handle 128k context with those specs? Would it be able to handle Q5_K_M or Q6_K_M? Also, I can only set max_new_tokens in local_chat, not the ktransformers server, and I can't set the total context size anywhere.
Having this issue as well; not being able to set --max_new_tokens in the container breaks downstream projects that require longer output lengths.
Yes, we support Q5_K_M and Q6_K_M. As for setting max_new_tokens in the container, sorry for the inconvenience; this is not supported yet. If you're building from source, you can modify the
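As a rough sanity check for the memory question above, a GGUF file's size scales with parameter count times bits per weight. The sketch below uses assumed figures (DeepSeek-Coder-V2 at ~236B parameters, and approximate effective bits/weight for common llama.cpp quant types); actual file sizes vary slightly with tensor layout and metadata.

```python
def gguf_size_gib(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate quantized weight size in GiB: params * bits / 8, in GiB."""
    return n_params_billion * 1e9 * bits_per_weight / 8 / 2**30

# Assumed effective bits/weight for common llama.cpp quant types.
quants = {"IQ4_XS": 4.25, "Q4_K_M": 4.85, "Q5_K_M": 5.69, "Q6_K": 6.59}

for name, bpw in quants.items():
    print(f"{name}: ~{gguf_size_gib(236, bpw):.0f} GiB")
```

Under these assumptions Q5_K_M lands around 155 GiB of weights, which fits in 192GB DRAM + 48GB VRAM but leaves limited headroom for the KV cache, so a 128k context may be tight.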
Hi,
I tried to install ktransformers on a clean install of Linux Mint 22 (based on Ubuntu 24.04), and there are a few things that I had to add:
Please update the pip dependencies.
Are there any plans to increase the number of quants supported for Deepseek-Coder-V2-Instruct-0724?