Better API support (#68)
* feat(cli): node llama cpp (#59)

* feat(cli): node-llama-cpp

* docs(troubleshooting): node-llama-cpp

* fix(node-llama-cpp): cli

* feat: beta version of node-llama-cpp

* ci: beta version

* fix: Markdown list

* fix: auto model tag

* fix: model settings

* docs: troubleshooting

* fix: node-llama-cpp beta deps

* feat(node-llama-cpp): update to the latest version & APIs, including function calling and JSON schema

* feat(api): new api that integrates with node-llama-cpp@beta

* ci: require approval

* fix: bump node-llama-cpp@beta to latest

* fix: better errors
ido-pluto authored May 10, 2024
1 parent cc54125 commit 5f079f8
Showing 33 changed files with 983 additions and 868 deletions.
4 changes: 3 additions & 1 deletion .github/workflows/release.yml
@@ -23,6 +23,8 @@ jobs:
contents: write
issues: write
pull-requests: write
environment:
  name: npm

steps:
- uses: actions/checkout@v3
@@ -39,7 +41,7 @@
npm run generate-docs
- name: Release
if: github.ref == 'refs/heads/main'
if: github.ref == 'refs/heads/main' || github.ref == 'refs/heads/beta'
id: release-package
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
43 changes: 34 additions & 9 deletions README.md
@@ -27,7 +27,7 @@ Make sure you have [Node.js](https://nodejs.org/en/) (**download current**) inst
```bash
npm install -g catai

catai install vicuna-7b-16k-q4_k_s
catai install llama3-8b-openhermes-dpo-q3_k_s
catai up
```

@@ -57,6 +57,7 @@ Commands:
active Show active model
remove|rm [options] [models...] Remove a model
uninstall Uninstall server and delete all models
node-llama-cpp|cpp [options] Node llama.cpp CLI - recompile node-llama-cpp binaries
help [command] display help for command
```
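
For example, a typical session might look like this — a sketch built only from the commands listed above, using the model name from the install example earlier in this README:

```bash
# Install a model and start the server with the web UI
catai install llama3-8b-openhermes-dpo-q3_k_s
catai up

# Later: check which model is active, then remove it
catai active
catai remove llama3-8b-openhermes-dpo-q3_k_s
```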

@@ -92,14 +93,6 @@ This package uses [node-llama-cpp](https://github.com/withcatai/node-llama-cpp)
- linux-ppc64le
- win32-x64-msvc

### Memory usage
Runs on most modern computers; unless your computer is very old, it should work.

According to [a llama.cpp discussion thread](https://github.com/ggerganov/llama.cpp/issues/13), here are the memory requirements:

- 7B => ~4 GB
- 13B => ~8 GB
- 30B => ~16 GB

### Good to know
- All downloaded data is stored in the `~/catai` folder by default.
@@ -125,6 +118,38 @@ const data = await response.text();

For more information, please read the [API guide](https://github.com/withcatai/catai/blob/main/docs/api.md)
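
The collapsed hunk above ends with `const data = await response.text();`, which suggests a plain HTTP call. A hedged sketch of what such a request might look like — the host, port, and `/api/chat/prompt` route are assumptions for illustration, so check the API guide for the real endpoint:

```ts
// Hypothetical endpoint and payload shape — confirm against the API guide
const response = await fetch('http://127.0.0.1:3000/api/chat/prompt', {
    method: 'POST',
    headers: {'Content-Type': 'application/json'},
    body: JSON.stringify({prompt: 'Write me a 100-word story'})
});

const data = await response.text();
console.log(data);
```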

## Development API + Node-llama-cpp@beta integration

You can use the model with [node-llama-cpp@beta](https://github.com/withcatai/node-llama-cpp/pull/105).

CatAI makes it easy to manage models and chat with them.

```ts
import {downloadModel, getModelPath} from 'catai';
import {getLlama, LlamaChatSession} from 'node-llama-cpp';

// Download the model; skip this step if you already have it
await downloadModel(
    "https://huggingface.co/QuantFactory/Meta-Llama-3-8B-Instruct-GGUF/resolve/main/Meta-Llama-3-8B-Instruct.Q2_K.gguf?download=true",
    "llama3"
);

// Resolve the local path of the model managed by CatAI
const modelPath = getModelPath("llama3");

const llama = await getLlama();
const model = await llama.loadModel({
    modelPath
});

const context = await model.createContext();
const session = new LlamaChatSession({
    contextSequence: context.getSequence()
});

const a1 = await session.prompt("Hi there, how are you?");
console.log("AI: " + a1);
```

## Configuration

You can edit the configuration via the web UI.
4 changes: 0 additions & 4 deletions clients/catai/src/lib/Chat/Markdown.svelte
@@ -68,10 +68,6 @@
list-style: auto !important;
}
li {
line-height: .5rem;
}
.copy-clipboard {
position: absolute;
right: 0;
9 changes: 8 additions & 1 deletion docs/api.md
@@ -11,7 +11,7 @@ Enable you to chat with the model locally on your computer.
```ts
import {createChat} from 'catai';

const chat = await createChat();
const chat = await createChat(); // using the default model installed

const response = await chat.prompt('Write me a 100-word story', token => {
    process.stdout.write(token);
});

@@ -20,6 +20,13 @@
console.log(`Total text length: ${response.length}`);
```

You can also specify the model you want to use:

```ts
import {createChat} from 'catai';
const chat = await createChat({model: "llama3"});
```

If you want to install the model on the fly, please read the [install-api guide](./install-api.md)

## Remote API
35 changes: 17 additions & 18 deletions docs/install-api.md
@@ -6,10 +6,9 @@ You can install models on the fly using the `FetchModels` class.
import {FetchModels} from 'catai';

const allModels = await FetchModels.fetchModels();
const firstModel = Object.keys(allModels)[0];

const installModel = new FetchModels({
download: firstModel,
download: "https://huggingface.co/QuantFactory/Meta-Llama-3-8B-Instruct-GGUF/resolve/main/Meta-Llama-3-8B-Instruct.Q2_K.gguf?download=true",
latest: true,
model: {
settings: {
@@ -23,27 +22,27 @@ await installModel.startDownload();

After the download is finished, this model will be the active model.
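
Since the freshly installed model becomes the active one, you can start chatting with it right away — a minimal sketch using `createChat` from the [API guide](./api.md):

```ts
import {createChat} from 'catai';

// No model name needed: the newly installed model is now the active one
const chat = await createChat();

const response = await chat.prompt('Hi there, how are you?', token => {
    process.stdout.write(token);
});
```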

## Configuration
## Using with node-llama-cpp@beta

You can change the active model by changing the `CatAIDB`
You can download the model and use it directly with node-llama-cpp@beta

```ts
import {CatAIDB} from 'catai';

CatAIDB.db.activeModel = Object.keys(CatAIDB.db.models)[0];

await CatAIDB.saveDB();
```

You also can change the model settings by changing the `CatAIDB`

```ts
import {CatAIDB} from 'catai';

const selectedModel = CatAIDB.db.models[CatAIDB.db.activeModel];
selectedModel.settings.context = 4096;

await CatAIDB.saveDB();
```

```ts
import {getLlama, LlamaChatSession} from "node-llama-cpp";
import {getModelPath} from 'catai';

const modelPath = getModelPath("llama3");

const llama = await getLlama();
const model = await llama.loadModel({
    modelPath
});

const context = await model.createContext();
const session = new LlamaChatSession({
    contextSequence: context.getSequence()
});

const a1 = await session.prompt("Hi there, how are you?");
console.log("AI: " + a1);
```

For extra information about the configuration, please read the [configuration guide](./configuration.md)
For more information on how to use the model, please refer to the [node-llama-cpp beta pull request](https://github.com/withcatai/node-llama-cpp/pull/105)
41 changes: 41 additions & 0 deletions docs/troubleshooting.md
@@ -0,0 +1,41 @@
# Troubleshooting

Some common problems and solutions.


## I can't connect to the server

If the server is disconnected without any error, it's probably a problem with the llama.cpp binaries.

The solution is to recompile the binaries:
```bash
catai cpp
```

## How to change the download location?

You can configure the download location by changing the `CATAI_DIR` environment variable.

More environment variable options are documented in the [configuration](https://withcatai.github.io/catai/interfaces/_internal_.Config.html#CATAI_DIR) reference.
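
For example — a sketch assuming a POSIX shell; the target directory here is just an illustration:

```bash
# Store models and other CatAI data in a custom directory
export CATAI_DIR="$HOME/models/catai"
catai up
```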

## CUDA Support

If you have a CUDA-capable GPU that the server doesn't recognize, try installing the CUDA Toolkit and rebuilding the binaries.

Rebuild the binaries with CUDA support:

```bash
catai cpp --cuda
```

If you hit an error, check the CUDA troubleshooting guide [here](https://withcatai.github.io/node-llama-cpp/guide/CUDA#fix-the-failed-to-detect-a-default-cuda-architecture-build-error).

## Unsupported processor / Exit without error

If you have an unsupported processor, you can try rebuilding the binaries:

```bash
catai cpp
```