Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding a llama.cpp LLM Component #1052

Open
wants to merge 17 commits into
base: main
Choose a base branch
from

Conversation

edlee123
Copy link

@edlee123 edlee123 commented Dec 20, 2024

Description

I added a llama.cpp LLM OPEA component. Llama.cpp is a popular LLM inference library/server "with minimal setup and state-of-the-art performance on a wide range of hardware - locally and in the cloud" written in pure C/C++.

The component code is written in llm.py, and is most similar to the existing code in llms/text-generation/ray_serve. I also referred to ollama, and tgi to try keep with conventions.

Please see the README.md provides instructions how to use it.

Issues

List the issue or RFC link this PR is working on. If there is no such link, please mark it as n/a.

Type of change

List the type of change like below. Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds new functionality)
  • Breaking change (fix or feature that would break existing design and interface)
  • Others (enhancement, documentation, validation, etc.)

Dependencies

The dependencies are similar to other llm components.

Tests

This was tested on CPU (laptop) with Phi3.5 mini 4k instruct. The Llama Cpp can use GPU as needed but didn't test it.

@xiguiw
Copy link
Collaborator

xiguiw commented Jan 3, 2025

@edlee123
GenAIComps is under refactor - There are too much duplicated code.

Please wait for several days. And integrate it into new interface. Sorry for this.
But code refactor is helpful the new interface is simple. You just focus on LLM, no micro-service code needed.

@xiguiw xiguiw requested a review from letonghan January 6, 2025 09:17
@edlee123
Copy link
Author

edlee123 commented Jan 6, 2025

Hi @xiguiw

GenAIComps is under refactor - There are too much duplicated code.

No problem, which refactor branches should I wait for? I can wait for the refactoring to merge, and then I see how I can use the same approach.

@xiguiw
Copy link
Collaborator

xiguiw commented Jan 13, 2025

Hi @xiguiw

GenAIComps is under refactor - There are too much duplicated code.

No problem, which refactor branches should I wait for? I can wait for the refactoring to merge, and then I see how I can use the same approach.

@edlee123

The LLM refactor code is merged.
Here is the code structure for your reference:

https://github.com/opea-project/GenAIComps/tree/main/comps/llms

.
├── deployment
│   ├── docker_compose
│   └── kubernetes
├── src
│   └── text-generation
           └── integrations

Foy your reference:

  1. The docker-compose yaml file is put in deployment folder.
  2. There is only one micro-service code for llm, opea_llm_microservice.py. For integration of each LLM services engine, it is located in integrations. Please refer to opea_llm_microservice.py and opea.py.
  3. For multiple services/engines integrations, please can refer to https://github.com/opea-project/GenAIComps/tree/main/comps/embeddings/src/integrations

@edlee123
Copy link
Author

edlee123 commented Jan 13, 2025

Thank you @xiguiw - would good next steps for this PR be:

  1. Move the comps/llms/text-generation/llamacpp/docker_compose_llm.yaml file to comps/llms/deployment/docker_compose/text-generation_llamacpp.yaml (renaming it).
  2. Test the new refactored opea llm microservice works with the above docker compose file.
  3. Update the README.md in comps/llms/text-generation/llamacpp to use the above workflow?
  4. Delete files from comps/llms/text-generation/llamacpp that are no longer needed.
  5. Fix the last two github checks Check Online Document Building and Compose file and dockerfile path checking

Thank you for your guidance

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants