
.dll .so generation #39

Open
vaiju1981 opened this issue Feb 27, 2025 · 1 comment

Comments

@vaiju1981

Hi, I am currently trying to integrate llama.cpp with Java ( https://github.com/vaiju1981/java-llama.cpp/tree/b4689 ); this allows the llama.cpp server to run inside Java.

I was wondering whether there is a way I can use llama-box instead of llama.cpp (mostly due to the fact that it has vision support). How would I go about that, if it is possible?

@thxCode
Collaborator

thxCode commented Feb 27, 2025

llama-box is an application that uses llama.cpp, so the binder and the application stand at the same level: one provides a language-specific interface, the other an HTTP interface.

You can treat llama-box like Jetty or Tomcat in the Java domain.

I believe the binder can also implement the same VL logic; all you need is to build the right batch during llama_decode:

llama-box/llama-box/server.cpp

Lines 3868 to 3869 in 384ca12

qwen2vl_text_token_batch_wrapper batch_txt = qwen2vl_text_token_batch_wrapper((tokens.data() + j), n_eval, batch_txt_mrope_pos.data(), slot.id);
if (llama_decode(llm_ctx, batch_txt.batch)) {
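To illustrate the idea of "getting the right batch", here is a minimal, self-contained sketch of the splitting step a binder would need before it can mimic the snippet above. It does not use the llama-box or llama.cpp APIs; the `IMG_MARKER` constant, the `Chunk` struct, and `split_batches` are all hypothetical names invented for this example. The assumption is that a VL prompt arrives as one mixed sequence of text tokens and image-embedding placeholders, and that each maximal homogeneous run must be decoded as its own batch (a text batch like the `qwen2vl_text_token_batch_wrapper` above, or an embedding batch for image data).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical placeholder value marking an image-embedding slot in the
// token stream (not a real llama.cpp token id).
constexpr int32_t IMG_MARKER = -1;

// One homogeneous run of the prompt: either all text tokens or all
// image-embedding slots. Each run would become one llama_decode batch.
struct Chunk {
    bool is_image;                // true if this run holds image embeddings
    std::vector<int32_t> tokens;  // the contiguous tokens (or markers) of the run
};

// Walk the mixed sequence once and cut it into maximal homogeneous runs.
std::vector<Chunk> split_batches(const std::vector<int32_t>& tokens) {
    std::vector<Chunk> chunks;
    for (int32_t t : tokens) {
        const bool is_image = (t == IMG_MARKER);
        // Start a new chunk whenever the token kind changes.
        if (chunks.empty() || chunks.back().is_image != is_image) {
            chunks.push_back(Chunk{is_image, {}});
        }
        chunks.back().tokens.push_back(t);
    }
    return chunks;
}
```

In a real binder, the loop that consumes these chunks would call `llama_decode` once per chunk, wrapping text runs in a text batch and image runs in an embedding batch, analogous to the server.cpp code quoted above.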
