-
Notifications
You must be signed in to change notification settings - Fork 156
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding a llama.cpp LLM Component #1052
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Ed Lee <[email protected]>
@edlee123 Please wait for several days. And integrate it into new interface. Sorry for this. |
Signed-off-by: Ed Lee <[email protected]>
Signed-off-by: Ed Lee <[email protected]>
…lled image has specific tag. Signed-off-by: Ed Lee <[email protected]>
Hi @xiguiw
No problem, which refactor branches should I wait for? I can wait for the refactoring to merge, and then I see how I can use the same approach. |
…oint.sh Signed-off-by: Ed Lee <[email protected]>
for more information, see https://pre-commit.ci
The LLM refactor code is merged. https://github.com/opea-project/GenAIComps/tree/main/comps/llms
Foy your reference:
|
Thank you @xiguiw - would good next steps for this PR be:
Thank you for your guidance |
Description
I added a llama.cpp LLM OPEA component. Llama.cpp is a popular LLM inference library/server "with minimal setup and state-of-the-art performance on a wide range of hardware - locally and in the cloud" written in pure C/C++.
The component code is written in llm.py, and is most similar to the existing code in
llms/text-generation/ray_serve
. I also referred to ollama, and tgi to try keep with conventions.Please see the README.md provides instructions how to use it.
Issues
List the issue or RFC link this PR is working on. If there is no such link, please mark it as
n/a
.Type of change
List the type of change like below. Please delete options that are not relevant.
Dependencies
The dependencies are similar to other llm components.
Tests
This was tested on CPU (laptop) with Phi3.5 mini 4k instruct. The Llama Cpp can use GPU as needed but didn't test it.