Adding a llama.cpp LLM Component #1052

edlee123 · 2024-12-20T03:45:43Z

Description

I added a llama.cpp LLM OPEA component. Llama.cpp is a popular LLM inference library/server "with minimal setup and state-of-the-art performance on a wide range of hardware - locally and in the cloud" written in pure C/C++.

The component code is written in llm.py, and is most similar to the existing code in llms/text-generation/ray_serve. I also referred to ollama, and tgi to try keep with conventions.

Please see the README.md provides instructions how to use it.

Issues

List the issue or RFC link this PR is working on. If there is no such link, please mark it as n/a.

Type of change

List the type of change like below. Please delete options that are not relevant.

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds new functionality)
Breaking change (fix or feature that would break existing design and interface)
Others (enhancement, documentation, validation, etc.)

Dependencies

The dependencies are similar to other llm components.

Tests

This was tested on CPU (laptop) with Phi3.5 mini 4k instruct. The Llama Cpp can use GPU as needed but didn't test it.

Signed-off-by: Ed Lee <[email protected]>

xiguiw · 2025-01-03T07:10:14Z

@edlee123
GenAIComps is under refactor - There are too much duplicated code.

Please wait for several days. And integrate it into new interface. Sorry for this.
But code refactor is helpful the new interface is simple. You just focus on LLＭ, no micro-service code needed.

comps/llms/text-generation/llamacpp/docker_compose_llm.yaml

comps/llms/text-generation/llamacpp/README.md

Signed-off-by: Ed Lee <[email protected]>

…lled image has specific tag. Signed-off-by: Ed Lee <[email protected]>

edlee123 · 2025-01-06T22:14:13Z

Hi @xiguiw

GenAIComps is under refactor - There are too much duplicated code.

No problem, which refactor branches should I wait for? I can wait for the refactoring to merge, and then I see how I can use the same approach.

Signed-off-by: Ed Lee <[email protected]>

comps/llms/text-generation/llamacpp/Dockerfile

comps/llms/text-generation/llamacpp/entrypoint.sh

…oint.sh Signed-off-by: Ed Lee <[email protected]>

for more information, see https://pre-commit.ci

xiguiw · 2025-01-13T03:39:24Z

Hi @xiguiw

GenAIComps is under refactor - There are too much duplicated code.

No problem, which refactor branches should I wait for? I can wait for the refactoring to merge, and then I see how I can use the same approach.

@edlee123

The LLM refactor code is merged.
Here is the code structure for your reference:

https://github.com/opea-project/GenAIComps/tree/main/comps/llms

.
├── deployment
│   ├── docker_compose
│   └── kubernetes
├── src
│   └── text-generation
           └── integrations

Foy your reference:

The docker-compose yaml file is put in deployment folder.
There is only one micro-service code for llm, opea_llm_microservice.py. For integration of each LLM services engine, it is located in integrations. Please refer to opea_llm_microservice.py and opea.py.
For multiple services/engines integrations, please can refer to https://github.com/opea-project/GenAIComps/tree/main/comps/embeddings/src/integrations

edlee123 · 2025-01-13T21:01:15Z

Thank you @xiguiw - would good next steps for this PR be:

Move the comps/llms/text-generation/llamacpp/docker_compose_llm.yaml file to comps/llms/deployment/docker_compose/text-generation_llamacpp.yaml (renaming it).
Test the new refactored opea llm microservice works with the above docker compose file.
Update the README.md in comps/llms/text-generation/llamacpp to use the above workflow?
Delete files from comps/llms/text-generation/llamacpp that are no longer needed.
Fix the last two github checks Check Online Document Building and Compose file and dockerfile path checking

Thank you for your guidance

First commit of llamacpp Opea component

397f7b8

Signed-off-by: Ed Lee <[email protected]>

edlee123 requested a review from lvliang-intel as a code owner December 20, 2024 03:45

edlee123 added 3 commits December 19, 2024 21:50

Removed unneeded requirements file

cb4f5e5

Signed-off-by: Ed Lee <[email protected]>

Merge branch 'main' into llamacpp

df3d943

Merge branch 'main' into llamacpp

8893f38

xiguiw reviewed Jan 3, 2025

View reviewed changes

comps/llms/text-generation/llamacpp/docker_compose_llm.yaml Outdated Show resolved Hide resolved

comps/llms/text-generation/llamacpp/README.md Show resolved Hide resolved

xiguiw requested a review from letonghan January 6, 2025 09:17

edlee123 added 5 commits January 6, 2025 15:38

Pin the llama.cpp server version, and fix small typo

2a48bae

Signed-off-by: Ed Lee <[email protected]>

Merge branch 'llamacpp' of github.com:edlee123/GenAIComps into llamacpp

644ecce

Update README.md to describe hardware support, and provide reference.

4e82152

Signed-off-by: Ed Lee <[email protected]>

Updated docker_compose_llm.yaml so that the llamacpp-server so the pu…

baf381d

…lled image has specific tag. Signed-off-by: Ed Lee <[email protected]>

Merge branch 'main' into llamacpp

7bab970

edlee123 added 3 commits January 7, 2025 08:48

Merge branch 'main' into llamacpp

e4f4b70

Small adjustments to README.md

9d7539d

Signed-off-by: Ed Lee <[email protected]>

Merge branch 'main' into llamacpp

2cf25e5

eero-t reviewed Jan 8, 2025

View reviewed changes

comps/llms/text-generation/llamacpp/Dockerfile Outdated Show resolved Hide resolved

comps/llms/text-generation/llamacpp/entrypoint.sh Outdated Show resolved Hide resolved

edlee123 and others added 4 commits January 10, 2025 13:13

This removes unneeded dependencies in the Dockerfile, unneeded entryp…

fd15ee7

…oint.sh Signed-off-by: Ed Lee <[email protected]>

Merge branch 'llamacpp' of github.com:edlee123/GenAIComps into llamacpp

666196c

Merge branch 'main' into llamacpp

104527a

[pre-commit.ci] auto fixes from pre-commit.com hooks

c931902

for more information, see https://pre-commit.ci

Merge branch 'main' into llamacpp

6b98403

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Adding a llama.cpp LLM Component #1052

Adding a llama.cpp LLM Component #1052

edlee123 commented Dec 20, 2024 •

edited

Loading

xiguiw commented Jan 3, 2025

edlee123 commented Jan 6, 2025

xiguiw commented Jan 13, 2025

edlee123 commented Jan 13, 2025 •

edited

Loading

Adding a llama.cpp LLM Component #1052

Are you sure you want to change the base?

Adding a llama.cpp LLM Component #1052

Conversation

edlee123 commented Dec 20, 2024 • edited Loading

Description

Issues

Type of change

Dependencies

Tests

xiguiw commented Jan 3, 2025

edlee123 commented Jan 6, 2025

xiguiw commented Jan 13, 2025

edlee123 commented Jan 13, 2025 • edited Loading

edlee123 commented Dec 20, 2024 •

edited

Loading

edlee123 commented Jan 13, 2025 •

edited

Loading